
Define $FH(L) = \{x \in \Sigma^* : \exists y \in \Sigma^* \text{ with } |x| = |y| \text{ such that } xy \in L\}$. In other words, $FH(L)$ is the set of first halves of even length strings in $L$. Given this, if $L$ is context-free, must $FH(L)$ be context-free?
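To make the definition concrete, here is a minimal sketch that computes $FH$ for a finite set of strings. The name `first_halves` and the sample set are only illustrative and are not part of the problem statement; an actual CFL is usually infinite, so this only demonstrates the definition on a finite sample.

```python
def first_halves(language):
    """Return FH(language) for a finite set of strings.

    A string x is in FH(L) iff some y with |y| == |x| gives xy in L,
    i.e. x is the first half of an even-length string of L.
    """
    halves = set()
    for w in language:
        if len(w) % 2 == 0:          # odd-length strings have no "half"
            halves.add(w[: len(w) // 2])
    return halves


sample = {"ab", "abc", "aabb", ""}
print(sorted(first_halves(sample)))  # ['', 'a', 'aa']
```

The odd-length string `abc` contributes nothing, and the empty string is its own first half.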

Here's my attempt at a proof:

Since $L$ is a CFL, there exists a non-deterministic PDA recognizing $L$, $M = (Q, \Sigma, \Gamma, \delta, q_0, Z_0, F)$, where $\Sigma$ is the input alphabet, $\Gamma$ is the stack alphabet, and $Z_0$ is the symbol representing the initial stack contents. Construct PDA $M'$ from $M$, with $M' = (Q', \Sigma, \Gamma, \delta', q_0', Z_0, F')$, defined as follows:

$Q' = \{q_0'\} \cup (Q \times \Gamma_{\varepsilon}) \times (Q \times \Gamma_{\varepsilon}) \times (Q \times \Gamma_{\varepsilon})$.

$F' = \{[(q,X),(q,X),(p,Y)] : X,Y \in \Gamma_\varepsilon \text{ and } p \in F\}$

$\delta'(q'_0, \varepsilon, \varepsilon) = \{([(q,X), (q_0,Y), (q,X)], \varepsilon) : q \in Q \text{ and } X,Y \in \Gamma_\varepsilon \} $

$\delta'([(q,X),(p,Y),(r,Z)], a, \varepsilon) = \{([(q,X),\delta(p,a,Y), \delta(r,b,Z)], \varepsilon): X,Y,Z \in \Gamma_\varepsilon\ \text{ and } b \in \Sigma\} $

The first component of a state in $Q'$ records the guessed state $q$ and does not change once it is initially recorded. The second component records the state we are in after having processed some prefix of the input $x$, starting from state $q_0$, and the third component records the state we are in after having processed some prefix of the guessed $y$, starting from $q$.

I am not sure if this proof works, because I am a bit confused as to what to do with the stack for $M'$.

David Smith

1 Answer


The intuition developed in the comments is right. The answer is no: there is a counterexample, a CFL whose set of first halves is not context-free.

$L = \{ a^m b^n c^n a^{3m} \mid m,n\ge 1 \}$, over the alphabet $\{a,b,c\}$, from the answer on our sister site.

Proof by the pumping lemma: let $p$ be the pumping constant and pick $a^p b^p c^p \in FH(L)$; pumping either destroys the "$b^n c^n$" property or the "first half" property.
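As a quick check that the chosen string really is a first half, take $m = n = p$ in the definition of $L$ (a worked length count, nothing beyond the definitions above):

$$|a^p b^p c^p a^{3p}| = p + p + p + 3p = 6p, \qquad \tfrac{1}{2}\cdot 6p = 3p = |a^p b^p c^p|.$$

So $a^p b^p c^p a^{3p} \in L$ splits into two halves of length $3p$, and its first half is exactly $a^p b^p c^p$.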

A slight adaptation of that language is $K = \{ a^m b^n c^n \#\# a^{3m} \mid m,n\ge 1 \}$, over the alphabet $\{a,b,c,\#\}$. We can now "force" the position where the first half ends, which gives another proof technique.

Let $H = FH(K) \cap a^*b^*c^*\#$. This means we only consider first halves whose cut falls exactly between the two $\#$-symbols. Both halves then have equal length, so $m+2n+1=1+3m$, that is, $m=n$. Thus $H=\{a^nb^nc^n \mid n\ge 1\}\#$. Now if $FH(K)$ were context-free, then $H$ would be context-free as well (by closure under intersection with regular languages). But $H$ is close to the standard non-context-free example $\{a^nb^nc^n \mid n\ge 1\}$, which in turn can be obtained from $H$ by right quotient with $\#$, an operation that also preserves context-freeness. Hence $FH(K)$ is not context-free, although $K$ is.
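The identity $H=\{a^nb^nc^n \mid n\ge 1\}\#$ can be sanity-checked by brute force on a bounded fragment of $K$. The sketch below is only an illustration, not a proof; the helper names `words_of_K` and `forced_halves` and the bound on $m,n$ are assumptions made for this example.

```python
import re


def words_of_K(bound):
    """Yield a^m b^n c^n ## a^(3m) for 1 <= m, n <= bound."""
    for m in range(1, bound + 1):
        for n in range(1, bound + 1):
            yield "a" * m + "b" * n + "c" * n + "##" + "a" * (3 * m)


def forced_halves(bound):
    """First halves of the bounded fragment of K that lie in a*b*c*#."""
    pattern = re.compile(r"a*b*c*#")
    result = set()
    for w in words_of_K(bound):
        if len(w) % 2 == 0:               # |w| = 4m + 2n + 2 is always even
            half = w[: len(w) // 2]
            if pattern.fullmatch(half):   # cut falls between the two '#'
                result.add(half)
    return result


print(sorted(forced_halves(3), key=len))  # ['abc#', 'aabbcc#', 'aaabbbccc#']
```

Only the words with $m=n$ survive the intersection, matching the length equation $m+2n+1=1+3m$.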

Hendrik Jan