
I know that the language with x = y is not context-free: the first letter of y cannot be matched against the first letter of x, which sits at the bottom of the stack. But the linked question, "Show that { xy ∣ |x| = |y|, x ≠ y } is context-free", claims that the language with x ≠ y is context-free. How can the letters of x and y be compared on a stack? Say x = abbb and y = bbbb. How can we tell that the first letters don't match?

Without restricting ourselves to deterministic automata: is L = { xy ∣ |x| = |y|, x ≠ y } context-free at all, that is, can any pushdown automaton (deterministic or non-deterministic) accept it?

2 Answers

2

There is a big difference between $\{ xy ∣ |x| = |y|, x = y \}$ and $\{ xy ∣ |x| = |y|, x \ne y \}$. In the first one, we need every symbol in $x$ to be the same as the corresponding symbol in $y$. For inequality, it suffices that at least one symbol in $x$ differs from the corresponding symbol in $y$. The two cases are not symmetrical.

Checking that the first symbols of $x$ and $y$ are the same is not difficult. That can easily be achieved with a context-free grammar; we just need a word consisting of some symbol followed by an odd-length word with that same symbol in the centre:

$$\begin{align}S&\to a A\mid b B\\A&\to a \mid a A a \mid a A b \mid b A a \mid b A b\\B&\to b \mid a B a \mid a B b \mid b B a \mid b B b \\ \end{align}$$

That doesn't help us with checking that all symbols in the first half are the same as the corresponding symbols in the second half. But it does give us a way to check if some symbol in the first half is the same as the corresponding symbol in the second half:

$$\begin{align}S&\to A A\mid B B\\A&\to a \mid a A a \mid a A b \mid b A a \mid b A b\\B&\to b \mid a B a \mid a B b \mid b B a \mid b B b\\ \end{align}$$

And clearly it can easily be modified to check whether the corresponding symbols differ, giving the grammar in @Raphael's answer, linked in your question:

$$\begin{align}S&\to A B\mid B A\\A&\to a \mid a A a \mid a A b \mid b A a \mid b A b\\B&\to b \mid a B a \mid a B b \mid b B a \mid b B b\\ \end{align}$$
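The last grammar can be sanity-checked by brute force. The sketch below is not part of the original answer: it relies on the observation that $A$ derives exactly the odd-length words with `a` in the centre (and $B$ those with `b` in the centre), tries every split point for $S \to AB \mid BA$, and compares the result with the direct definition on all short words over $\{a, b\}$. The function names are made up for illustration.

```python
from itertools import product

def in_A(w: str) -> bool:
    """A derives exactly the odd-length words whose centre symbol is 'a'."""
    return len(w) % 2 == 1 and w[len(w) // 2] == "a"

def in_B(w: str) -> bool:
    """B derives exactly the odd-length words whose centre symbol is 'b'."""
    return len(w) % 2 == 1 and w[len(w) // 2] == "b"

def grammar_accepts(w: str) -> bool:
    """S -> AB | BA: try every split point, as a derivation would 'guess' it."""
    return any(
        (in_A(w[:i]) and in_B(w[i:])) or (in_B(w[:i]) and in_A(w[i:]))
        for i in range(1, len(w))
    )

def unequal_halves(w: str) -> bool:
    """Direct definition: w = xy with |x| = |y| and x != y."""
    n = len(w)
    return n % 2 == 0 and w[: n // 2] != w[n // 2 :]

def agree_up_to(max_len: int) -> bool:
    """Compare both tests on every word over {a, b} of length <= max_len."""
    return all(
        grammar_accepts("".join(t)) == unequal_halves("".join(t))
        for n in range(max_len + 1)
        for t in product("ab", repeat=n)
    )
```

For the example in the question, `grammar_accepts("abbbbbbb")` succeeds with the split `a | bbbbbbb`: the first part has `a` in the centre and the second has `b` in the centre.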

rici

Have a good look at the answer you link to. It specifies the generated language using two numbers $k,\ell$. These numbers guarantee that the two halves differ without ever knowing exactly where the middle of the string is. I will try to explain.

We have to find (or better, guess) a position in $x$ such that the same position in $y$ carries another letter. Then we can write $x = x_1 a x_2$ and $y=y_1 b y_2$, where $|x_1| = |y_1|$ and $|x_2| = |y_2|$, and $a$ and $b$ are different symbols. You see that $a$ and $b$ are at the same position and $|x| = |y|$.
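As a concrete instance, take the example from the question, $x = abbb$ and $y = bbbb$. Guessing the first position gives the decomposition below, where $\varepsilon$ denotes the empty string (notation added here, not used elsewhere in this answer):

$$\begin{align}x &= x_1\, a\, x_2, & x_1 &= \varepsilon, & x_2 &= bbb\\ y &= y_1\, b\, y_2, & y_1 &= \varepsilon, & y_2 &= bbb\end{align}$$

Here $|x_1| = |y_1| = 0$ and $|x_2| = |y_2| = 3$, and the two letters at the guessed position differ.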

Note that $x_1,x_2$ and $y_1,y_2$ can be arbitrary strings; only their lengths matter, so that $a$ and $b$ end up at the same position in both halves. Let us denote an arbitrary string with $k$ symbols by $\langle k \rangle$; then the language of unequal halves can be written as all strings of the form $\langle k \rangle a \langle \ell \rangle \langle k \rangle b \langle \ell \rangle$ for $k,\ell \ge 0$. Written this way, the description does not translate directly into a CFG or a PDA, because the two $\langle k \rangle$ blocks and the two $\langle \ell \rangle$ blocks are interleaved rather than nested. The final trick is to observe that we do not need to find the middle, just check the length of the string between $a$ and $b$.

So we look at the strings in another way, as $\langle k \rangle a \langle k \rangle \langle \ell \rangle b \langle \ell \rangle$, or $\langle k \rangle a \langle k+\ell \rangle b \langle \ell \rangle$. This still specifies the set of all strings with unequal halves, but is seen to be context-free (as in the original answer). It also shows how to accept by PDA: read arbitrary $k$ symbols, pushing one symbol onto the pushdown at each step, remember the next letter, and read the same number $k$ of symbols, popping one at each step. Now the pushdown is empty, and we repeat with $\ell$ symbols, checking that the letter read in between differs from the remembered one.
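The PDA described above can be simulated by enumerating its nondeterministic guesses. The following is a minimal sketch under a few assumptions: the helper names `run` and `pda_accepts` are invented here, the stack holds a single marker symbol `#`, and the length check at the top of `run` is a simulation shortcut (a real PDA cannot compute the length; a run with wrong guesses simply gets stuck or leaves input unread).

```python
from itertools import product

def run(w: str, k: int, l: int) -> bool:
    """One run of the PDA for the guesses k and l; True if this run accepts."""
    if len(w) != 2 * k + 2 * l + 2:
        return False                 # shortcut: such a run cannot succeed
    stack, pos = [], 0
    for _ in range(k):               # read k symbols, push one marker each
        stack.append("#")
        pos += 1
    first = w[pos]                   # remember this letter in the finite control
    pos += 1
    for _ in range(k):               # read k symbols, pop one marker each
        stack.pop()
        pos += 1
    for _ in range(l):               # stack is empty again: push l markers
        stack.append("#")
        pos += 1
    second = w[pos]                  # the letter at the same position in y
    pos += 1
    if second == first:
        return False                 # letters agree, this guess fails
    for _ in range(l):               # read the last l symbols, pop the markers
        stack.pop()
        pos += 1
    return not stack and pos == len(w)

def pda_accepts(w: str) -> bool:
    """A nondeterministic PDA accepts if some combination of guesses accepts."""
    half = len(w) // 2
    return any(run(w, k, l) for k in range(half) for l in range(half))

def unequal_halves(w: str) -> bool:
    """Direct definition: w = xy with |x| = |y| and x != y."""
    n = len(w)
    return n % 2 == 0 and w[: n // 2] != w[n // 2 :]

def agree_up_to(max_len: int) -> bool:
    """Compare the simulation with the definition on all short words."""
    return all(
        pda_accepts("".join(t)) == unequal_halves("".join(t))
        for n in range(max_len + 1)
        for t in product("ab", repeat=n)
    )
```

For $x = abbb$, $y = bbbb$ the accepting run uses $k = 0$ and $\ell = 3$: the remembered letter is the leading `a`, and after three pushed markers the next letter read is the first `b` of $y$.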

Hendrik Jan