I'm trying to make sense of regular languages, operations on them, and Kleene operations.

Let's say I define a language with the alphabet {x, y}. Let's further say that I place the restriction that there can be no string in the language that contains the substring 'xx'. Thus, my language could be expressed as L = {y, xy, yx}, since that language conforms to the definition.

Could I then argue that there is no language L* since L* could contain LL? That is, can a particular but arbitrarily chosen finite language that conforms to the definition exist, but since LL can't be in L*, L* cannot exist? Or must any L necessarily omit anything preventing L* from existing?

Raphael
Chris F.

4 Answers

$L^*$ always exists. ${}^*$ is an operation on languages, just like ${}^2$ is an operation on numbers – whenever you have a number $x$, there is a number $x^2$ and, similarly, whenever you have a language $L$, there is a language $L^*$. Specifically, $L^*$ is the language of all strings that can be made by concatenating zero or more strings from $L$. That's well-defined for any language $L$.

What you've observed is that $L^*$ might have different properties from $L$. If we take your example of $L = \{y,yx,xy\}$, we see that no string in $L$ has $xx$ as a substring ($L$ is not the only language with this property but that's beside the point). On the other hand, $L^*$ includes the string $yxxy$, which does have $xx$ as a substring. That doesn't mean that $L^*$ doesn't exist; it just means that it doesn't have some property that $L$ has. Well, that's no surprise because $L$ and $L^*$ are often different languages, in which case they can't have exactly the same properties.
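To make this concrete, here is a minimal Python sketch that enumerates the strings of $L^*$ up to a length bound and checks the claim about $yxxy$. The helper name `kleene_star_up_to` and the bound `max_len` are illustrative assumptions, not standard library functions; $L^*$ itself is infinite, so only a finite slice of it can be listed.

```python
def kleene_star_up_to(language, max_len):
    """Return every string of L* whose length is at most max_len.

    Hypothetical helper for illustration only.
    """
    words = {w for w in language if w}  # the empty word contributes nothing new
    result = {""}                       # zero concatenations give the empty string
    frontier = {""}
    while frontier:
        next_frontier = set()
        for prefix in frontier:
            for w in words:
                candidate = prefix + w
                if len(candidate) <= max_len and candidate not in result:
                    result.add(candidate)
                    next_frontier.add(candidate)
        frontier = next_frontier
    return result


L = {"y", "xy", "yx"}
star = kleene_star_up_to(L, 4)

# No string of L contains "xx" ...
no_xx_in_L = all("xx" not in w for w in L)
# ... but L* does contain such a string: "yx" followed by "xy".
has_yxxy = "yxxy" in star
```

Here `no_xx_in_L` and `has_yxxy` are both true: the property holds for $L$ and fails for $L^*$, yet $L^*$ is a perfectly good language.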

David Richerby

Your confusion seems to stem from the incorrect assumption that if a language $L$ has a certain property, then $L^*$ must also have that property. Consider this simpler example. Over the 1-symbol alphabet $\{a\}$ consider the property

$L$ contains no words of length more than three

Then one language with this property is $L=\{a, aaa\}$. Then by the definition of the $^*$ operator $$\begin{align} L^*&=\{\epsilon\}\cup L\cup LL\cup LLL \cup\dotsm\\ &=\{\epsilon\}\cup\{a, aaa\}\cup\{aa,aaaa, aaaaaa\}\cup\dotsm\\ &=\{\epsilon,a,aa,aaa,aaaa,aaaaa,aaaaaa,\dotsc\} \end{align}$$ In other words, given this $L$, $L^*$ consists of all finite strings of $a$s, which certainly doesn't have the property that $L$ did.
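The derivation above can be checked mechanically with a short Python sketch. The helpers `concat` and `power` are names introduced here for illustration. Since every word of $L$ has length at least one, any string of $L^*$ of length at most 6 is a concatenation of at most 6 words, so taking the union of $L^0, \dots, L^6$ is enough to see all of $L^*$ up to that length.

```python
def concat(A, B):
    """Concatenation of two languages: every u followed by every v."""
    return {u + v for u in A for v in B}


def power(L, k):
    """L^k: all concatenations of exactly k words from L (L^0 is {""})."""
    result = {""}
    for _ in range(k):
        result = concat(result, L)
    return result


L = {"a", "aaa"}

LL = power(L, 2)  # the third term of the union in the derivation

# Union of L^0 .. L^6, restricted to strings of length at most 6.
union = set().union(*(power(L, k) for k in range(7)))
short_strings = {w for w in union if len(w) <= 6}

# Every string of a's up to length 6 appears, including lengths above three.
all_as = {"a" * n for n in range(7)}
matches = short_strings == all_as
```

With these definitions `LL` is `{"aa", "aaaa", "aaaaaa"}` and `matches` is true, in agreement with the displayed equation.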

Now of course there are some properties of a language $L$ which are shared by $L^*$: being regular is one such, consisting of just the empty string is another, and consisting of only even-length strings is yet another.

Rick Decker

Simply saying that $L$ has the alphabet $\{x, y\}$ and that no string in $L$ contains the substring $xx$ doesn't define a single language: there are infinitely many such languages. What you call expressing $L = \{y, xy, yx\}$ is actually defining one such language $L$, with three words, that happens to have the property that none of its strings contains $xx$. $L^*$ doesn't have this property, because it contains the word $yxxy$ (as well as infinitely many other words with a double $x$).

I'm not sure exactly what you're trying to prove in the second part of your post. For any language $M$, the Kleene closure $M^*$ always exists; if $M$ is regular, then $M^*$ is also regular, because the regular languages are closed under the Kleene star. Whether $M^*$ has the property of containing no words with the substring $xx$ depends on the definition of $M$.

DylanSp

$L^*$ exists. You have to see a language as a set of words. Take $L$ to be the infinite set of all words formed with the symbols $x$ and $y$ that don't contain the substring $xx$. Then $L^*$ is the smallest superset of $L$ that contains the empty string $\epsilon$ and is closed under string concatenation.

Note: You don't have to determine whether $L^*$ exists; it is defined for every language $L$. If $L$ is a regular language, then $L^*$ is a regular language too.

Renato Sanhueza