
The language $L = \{0^{2n} \space |\space n \ge 0 \}$ is obviously regular – for example, it matches the regular expression $(00)^*$. But the following pumping lemma argument seems to show it's not regular. What's gone wrong?

I've found a way of splitting an input $s$ as $xyz$ satisfying the requirements of the pumping lemma but it's not true that $xy^iz\in L$ for all $i$. Doesn't that mean the language isn't regular?

In more detail, the pumping lemma for regular languages says that, if a language $L$ is regular, there exists a pumping length $p \ge 1$ such that any string $s\in L$ with $|s| \ge p$ can be written as $s = xyz$ such that:

  1. $\lvert y \rvert \ge 1$
  2. $\lvert xy \rvert \le p$
  3. $xy^iz\in L$ for all $i \ge 0$.

So, let's take $s = 0^{2p}$ and write it as $s=\epsilon\, 0 \, 0^{2p-1}$ (i.e., $x = \epsilon$, $y = 0$, $z = 0^{2p-1}$). This satisfies 1. and 2. But, taking $i=0$, we get $xy^iz = \epsilon\, 0^0\,0^{2p-1} = 0^{2p-1}$, which isn't in $L$ because its length is odd. So it looks like the language isn't regular after all.


This is intended as a reference question illustrating a common mistake in the use of the pumping lemma for regular languages. Thanks to Ariel for spotting the issue in the original version of the question.

David Richerby
flashburn

1 Answer


The problem is in the quantifiers. The pumping lemma says that any string $s\in L$ with $|s|\geq p$ can be written as $xyz$ in some way such that the three properties hold. It doesn't say that every way of writing it as $xyz$ that makes the first two properties hold also makes the third one hold.
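Writing the lemma out with explicit quantifiers makes this easier to see. This is a sketch in ad hoc notation rather than a quotation from any particular textbook:

$$\exists p \ge 1\;\; \forall s \in L \text{ with } |s| \ge p\;\; \exists x, y, z \text{ with } s = xyz:\quad |y| \ge 1 \;\wedge\; |xy| \le p \;\wedge\; \forall i \ge 0\;\; xy^iz \in L.$$

The decomposition $x, y, z$ sits under an $\exists$, so a single decomposition that fails to pump tells you nothing about whether the statement holds.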

For the language $\{0^{2n}\mid n\geq 0\}$, we proceed as follows. First, note that we must have $p\geq 2$, since if $p=1$, we're forced to take $x=\epsilon$, $y=0$, $z=0^{2p-1}$ and you already showed in the question that this doesn't work. So, with $p\geq 2$, we can write $s = 0^{2p}$ as $s=\epsilon\,00\,0^{2(p-1)}$ ($x=\epsilon$, $y=00$, $z=0^{2(p-1)}$). We have $|\epsilon00| = 2 \leq p$, $|00| \geq 1$ and $\epsilon\,(00)^i\,0^{2(p-1)} = 0^{2(p-1+i)}\in L$ for all $i\geq 0$. The same choice of $x$ and $y$ works for every other string in $L$ of length at least $p$. Thus, there exists some way of decomposing the string as $xyz$ that satisfies all the properties, even though the first decomposition you thought of didn't work.
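As a sanity check, the decompositions of a given string can be enumerated by brute force. The sketch below is only illustrative: the helper names `in_L` and `pumpable_splits` are made up for this example, and it tests $xy^iz\in L$ only for $i$ up to a small bound `max_i`, so it can refute a decomposition but cannot prove that one pumps for all $i$.

```python
def in_L(s: str) -> bool:
    """Membership in L = {0^(2n) : n >= 0}."""
    return set(s) <= {"0"} and len(s) % 2 == 0


def pumpable_splits(s: str, p: int, max_i: int = 6):
    """List every (x, y, z) with s = xyz, |y| >= 1 and |xy| <= p
    such that x y^i z is in L for all i in 0..max_i (a finite check only)."""
    good = []
    for xy_len in range(1, min(p, len(s)) + 1):
        for x_len in range(xy_len):  # x_len < xy_len guarantees |y| >= 1
            x, y, z = s[:x_len], s[x_len:xy_len], s[xy_len:]
            if all(in_L(x + y * i + z) for i in range(max_i + 1)):
                good.append((x, y, z))
    return good


p = 2
s = "0" * (2 * p)
print(pumpable_splits(s, p))  # [('', '00', '00')]
```

With $p=2$, the only surviving split of $0000$ is $x=\epsilon$, $y=00$, $z=00$; the split with $y=0$ from the question is rejected at $i=0$. With $p=1$ and $s=00$ the list comes back empty, matching the observation that $p=1$ cannot work.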

To show that a language isn't regular, you need to show that, for every candidate pumping length $p$, there is some string $s\in L$ with $|s|\geq p$ for which every decomposition into $xyz$ that satisfies the first two properties fails to satisfy the third one. It's not enough to just show that one decomposition doesn't work.
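Negating the lemma's quantifiers gives the statement you actually have to prove for a non-regularity argument. Again, this is a sketch in ad hoc notation:

$$\forall p \ge 1\;\; \exists s \in L \text{ with } |s| \ge p\;\; \forall x, y, z \text{ with } s = xyz,\ |y| \ge 1,\ |xy| \le p:\quad \exists i \ge 0\;\; xy^iz \notin L.$$

You get to choose $s$ (depending on $p$) and $i$ (depending on the decomposition), but $p$ and the decomposition are chosen against you.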

To understand why the pumping lemma is the way it is, it helps to think about the proof. If a language is regular, it is accepted by some DFA. That DFA has some number of states: call it $p$. By the pigeonhole principle, whenever that DFA reads a string of length at least $p$, it must visit some state twice while reading the first $p$ characters: say state $q$. Now, $x$ is the part of the input read up to (and including) the first visit to $q$, $y$ is the part read after the first visit and up to and including the second (which must be at least one character) and $z$ is the rest. Because the repeat happens within the first $p$ characters, $|xy|\leq p$. But now you can see that $xz$ must be accepted: $x$ takes you from the start state to $q$ and $z$ takes you from $q$ to an accepting state. Likewise, $xy^iz$ must be accepted for any positive $i$, since each repetition of $y$ takes you from $q$ back to $q$. Note that the decomposition of the input into $x$, $y$ and $z$ is entirely determined by the automaton which is, in turn, determined (but not uniquely) by the language. So you don't get to choose the decomposition: if the language is regular, some decomposition exists; to show that a language is not regular, you must show that every decomposition fails.
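The construction in the proof can be sketched in a few lines of Python. The function name `dfa_split` and the dictionary encoding of the transition function are assumptions made for this illustration; the DFA used is the two-state automaton for $L$ that tracks the parity of the number of $0$s read.

```python
def dfa_split(delta, start, s):
    """Split s as (x, y, z) at the first state the DFA run visits twice.

    delta maps (state, character) -> state; start is the start state.
    """
    state = start
    first_seen = {state: 0}  # state -> number of characters read at first visit
    for pos, ch in enumerate(s, start=1):
        state = delta[(state, ch)]
        if state in first_seen:
            j = first_seen[state]
            # x: input up to the first visit, y: the loop, z: the rest
            return s[:j], s[j:pos], s[pos:]
        first_seen[state] = pos
    return None  # no repeat; only possible if len(s) < number of states


# DFA for L: state 0 = even number of 0s read (accepting), state 1 = odd.
delta = {(0, "0"): 1, (1, "0"): 0}
print(dfa_split(delta, 0, "0000"))  # ('', '00', '00')
```

Here the automaton has $p=2$ states and the run on $0000$ returns to state 0 after two characters, so the automaton itself picks $y=00$. The split $y=0$ from the question never arises, because no state of this DFA is revisited after a single character.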

David Richerby