4

I'm looking for intuition about when a language is regular and when it is not. For example, consider:

$$ L = \{ 0^n 1^n \mid n \geq 1 \} = \{ 01, 0011, 000111, \ldots \}$$

which is not a regular language. Intuitively it seems a very simple language; there doesn't seem to be anything complicated going on. What is the difference between $L$ and a regular language like:

$$L' = \{ w \mid w \text{ does not contain } 11 \} = \{0,10\}^*\cdot (1 \mid \varepsilon).$$

I know how to prove that $L$ is not regular, using the Pumping Lemma. Here I am looking for intuition about what makes a language regular.

Ken Li

4 Answers

6

To expand on Artem's comment, note that if we have an $m$-state automaton, at each point we can remember at most $\lg m$ bits of information about the part of the input we have read so far. Based on this finite amount of information about the previous bits of the input, the automaton should be able to end up in the right accepting/rejecting state for all possible ways that the input can continue. If this amount of information is not sufficient to answer correctly, then the language is not regular.

For this language, after reading $0^n$ we have to remember $n$ exactly in order to answer correctly, which takes about $\lg n$ bits of information, where $n$ is an arbitrarily large number (and therefore can be taken to be larger than $m$, which is fixed since the same automaton should work for all inputs).

One way to formally show this is the pumping lemma. A more general way, which is harder to use, is to show directly that the number of equivalence classes of strings w.r.t. the language is not finite (we say strings $x$ and $y$ belong to the same class and write $x\sim y$ if for all strings $w$, $xw \in L$ iff $yw \in L$). This is the same as saying that the amount of information that the machine needs to remember is not finite (that is, not bounded by a constant independent of the input).
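As a rough illustration of the equivalence-class argument, here is a minimal Python sketch. The helper names (`in_L`, `distinguishing_suffix`, `all_pairs_distinguished`) are made up for this example. It checks, for small $i \neq j$, that the suffix $w = 1^i$ separates $0^i$ from $0^j$, so these prefixes all fall into different classes.

```python
def in_L(s: str) -> bool:
    """Membership test for L = { 0^n 1^n : n >= 1 }."""
    n = len(s) // 2
    return n >= 1 and s == "0" * n + "1" * n


def distinguishing_suffix(i: int) -> str:
    """Suffix w such that 0^i w is in L, while 0^j w is not for any j != i."""
    return "1" * i


def all_pairs_distinguished(limit: int) -> bool:
    """Check that 0^i and 0^j are inequivalent for all 1 <= i != j <= limit."""
    for i in range(1, limit + 1):
        w = distinguishing_suffix(i)
        for j in range(1, limit + 1):
            if i == j:
                continue
            # x ~ y would require xw in L iff yw in L; here it fails.
            if not in_L("0" * i + w) or in_L("0" * j + w):
                return False
    return True
```

A finite check like this proves nothing by itself, but the same suffix works for every $i$, which is what gives infinitely many classes.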

Kaveh
1

Simple rule: Regular expressions can't count. That said, sometimes you are given languages that look like they need counting, but it turns out they actually don't. An example is the language over $\{0, 1\}$ of strings that contain equal numbers of the substrings 001 and 100: it turns out you cannot possibly have two more of one than of the other, so no unbounded counting is needed.
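The claim about 001 and 100 can be sanity-checked by brute force. The following is only a sketch with hypothetical helper names; it enumerates all binary strings up to a given length and records the largest gap between the two (overlapping) substring counts.

```python
from itertools import product


def count_overlapping(s: str, pat: str) -> int:
    """Count occurrences of pat in s, allowing overlaps."""
    return sum(1 for i in range(len(s) - len(pat) + 1) if s.startswith(pat, i))


def max_difference(max_len: int) -> int:
    """Largest |#001 - #100| over all binary strings of length <= max_len."""
    worst = 0
    for length in range(max_len + 1):
        for bits in product("01", repeat=length):
            s = "".join(bits)
            diff = abs(count_overlapping(s, "001") - count_overlapping(s, "100"))
            worst = max(worst, diff)
    return worst
```

Since the difference never leaves a small fixed range, a finite state machine only has to track a bounded amount of information, not a real count.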

A regular language is a language that can be defined by a regular expression. When "regular expressions" were defined, they were intentionally defined so that the languages can be parsed by a finite state machine. "Regular expressions" could have been defined differently, to be more powerful, but they were not. I'd focus on languages recognised by a finite state machine.

If you have a string X, then there will be some set of strings Y such that XY is in L. If you parse using a finite state machine, in each state there will be some set of strings Y so that parsing Y from that state will end up in an accepted state. And parsing any string X from the initial state will end up in some state.

So your language can be parsed by a finite state machine if, ranging over all initial strings X, there is only a finite number of different sets of suffix strings Y that complete X to a string in L.

Look at your language: After parsing nothing, you need to parse $0^n1^n$ for some $n \geq 1$. After parsing 0, you need to parse $0^n1^{n+1}$ for some $n \geq 0$. After parsing 00, you need to parse $0^n1^{n+2}$. And so on. These sets of suffixes are all different, so there is an infinite number of sets that would be required, and a finite state machine would need an infinite number of states.
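To make the "sets of suffixes" idea concrete, here is a small Python sketch under stated assumptions: it only looks at prefixes and suffixes up to a bounded length, so it approximates the true suffix sets, and the function names are invented for this example. It compares the question's $L$ with the regular language $L'$ (strings without 11).

```python
from itertools import product


def words(max_len: int):
    """Yield all binary strings of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def in_L(s: str) -> bool:
    """L = { 0^n 1^n : n >= 1 }."""
    n = len(s) // 2
    return n >= 1 and s == "0" * n + "1" * n


def in_L_prime(s: str) -> bool:
    """L' = strings that do not contain 11."""
    return "11" not in s


def count_suffix_sets(member, prefix_len: int, suffix_len: int) -> int:
    """Number of distinct (bounded) suffix sets { Y : XY in language } over prefixes X."""
    suffixes = list(words(suffix_len))
    classes = {
        frozenset(y for y in suffixes if member(x + y))
        for x in words(prefix_len)
    }
    return len(classes)
```

For $L'$ the count stays at 3 as the bounds grow (no trailing 1, trailing 1, already contains 11), while for $L$ it keeps growing with the prefix length.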

gnasher729
0

Another look at parsing using a finite state machine (equivalent to regular language): If you have a finite number of states, then there must be $n \neq m$ (both at least 1) such that parsing $0^n$ and $0^m$ leaves you in the same state. Parsing $0^n1^n$ leaves you in an accepting state. Since the state after parsing $0^m$ was the same, parsing $0^m1^n$ must also end in an accepting state. But $0^m1^n$ is not in the language, so no finite state machine can recognise it.
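The pigeonhole step can be sketched in Python for any concrete machine. The transition table below is a hypothetical three-state DFA chosen only for illustration (it happens to recognise strings without 11); the point is that `find_collision` always terminates on a finite table, and the two inputs then end in the same state.

```python
def run(delta: dict, start: str, s: str) -> str:
    """Return the state reached from start after reading s."""
    state = start
    for ch in s:
        state = delta[(state, ch)]
    return state


def find_collision(delta: dict, start: str) -> tuple:
    """Return (n, m) with 1 <= n < m such that 0^n and 0^m reach the same state."""
    seen = {}  # state -> number of 0s read when it was first reached
    state = start
    k = 0
    while True:
        k += 1
        state = delta[(state, "0")]
        if state in seen:
            return seen[state], k
        seen[state] = k


# Hypothetical example DFA: "a" = no trailing 1, "b" = trailing 1, "d" = dead.
DELTA = {
    ("a", "0"): "a", ("a", "1"): "b",
    ("b", "0"): "a", ("b", "1"): "d",
    ("d", "0"): "d", ("d", "1"): "d",
}

n, m = find_collision(DELTA, "a")
# Same state after 0^n and 0^m, hence same verdict on 0^n 1^n and 0^m 1^n.
same_verdict = run(DELTA, "a", "0" * n + "1" * n) == run(DELTA, "a", "0" * m + "1" * n)
```

Whatever finite table is plugged in, the machine cannot tell $0^n1^n$ from $0^m1^n$, so it either accepts both or rejects both.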

gnasher729
-3

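You cannot pump $0^n1^n$. The pumping lemma does not define regular languages, but it gives a necessary condition: if a language is regular, then every sufficiently long word in it can be written as $xyz$ with $y$ nonempty such that all words $xy^iz$ ($i \geq 0$) are also in the language.

You can't do this with any $0^n1^n$: if $y$ contains only 0s or only 1s, pumping it unbalances the two counts, and if $y$ contains both, pumping it puts a 0 after a 1.

Since it fails the pumping lemma it's not regular.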
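As a hedged sketch of what "cannot pump" means, the Python below tries every split $w = xyz$ with nonempty $y$ for $w = 0^p1^p$ and checks whether $xy^iz$ stays in the language for a few values of $i$. The function names are made up for this example, and testing $i = 0, \ldots, 3$ is an assumption that is enough to rule out every split here.

```python
def in_L(s: str) -> bool:
    """L = { 0^n 1^n : n >= 1 }."""
    n = len(s) // 2
    return n >= 1 and s == "0" * n + "1" * n


def pumpable_split_exists(p: int) -> bool:
    """Is there a split of 0^p 1^p into x y z (y nonempty) that survives pumping?"""
    w = "0" * p + "1" * p
    for i in range(len(w)):                 # i = len(x)
        for j in range(i + 1, len(w) + 1):  # j = len(x) + len(y)
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_L(x + y * k + z) for k in range(4)):
                return True
    return False
```

No split survives for any $p$ tried, which matches the case analysis: a $y$ of only 0s or only 1s breaks the balance, and a mixed $y$ breaks the order.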

Yuval Filmus
Bob