
I was trying to convert an arbitrary linear grammar into Greibach normal form (GNF), and I got as far as a grammar in which every production has one of the forms $A\to Ba$, $A\to aB$, $A\to a$.

In this question, the accepted answer shows a way of converting some productions of a linear grammar into GNF. The remaining problem is the productions of the form $A\to Ba$. They are the only obstacle for both my procedure and the one in that answer, and I don't see any general way of turning them into GNF productions.

Unless I am missing something obvious, it does not seem easy to give an explicit GNF for linear languages, and the result may depend heavily on the structure of the particular grammar. What should I do next?

1 Answer


As you note, the problem is with the productions of the form $A\to Ba$.

Consider a chain of such productions $A_0\to A_1 a_1$, $A_1\to A_2 a_2$, $\dots$, $A_{n-1}\to A_n a_n$, $A_n\to bB$, where the last production is the first one in the chain that is already in Greibach normal form.
Thus $A_0 \Rightarrow A_1a_1 \Rightarrow^* A_n a_n\dots a_2a_1 \Rightarrow b B a_n\dots a_1 $.

We can summarize this sequence by adding a production $A_0 \to bB[A_0A_n]$, where $[A_0A_n]$ is a new nonterminal that generates the suffix $a_n\dots a_2a_1$, but now with right-linear productions, so basically in the reverse order of the original derivation:

$[A_0A_n] \to a_n [A_0A_{n-1}]$, $\dots$ , $[A_0A_2] \to a_2 [A_0A_1]$, $[A_0A_1] \to a_1$.

Thus $A_0 \Rightarrow bB[A_0A_n] \Rightarrow bBa_n[A_0A_{n-1}] \Rightarrow^* bBa_n\dots a_2[A_0A_1] \Rightarrow bBa_n\dots a_2a_1$

Note that the new nonterminal $[A_0A_n]$ carries two variables: the second component records the current position in the original derivation, whereas the first component stores the initial variable, which is where the new simulation must end.

I think that with some care this can be formulated as a proper construction.
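One way the construction might be made precise is sketched below in Python. This is only a sketch under my own assumptions, not a definitive formulation: terminals are lowercase strings, every other symbol is a nonterminal, the pair `(X, Y)` plays the role of $[XY]$, and the names `to_gnf`, `language` and `is_terminal` are hypothetical helpers introduced for illustration. For every variable $X$ it adds $[XY]\to a$ when $X\to Ya$, $[XY]\to a[XC]$ when $C\to Ya$, and $X\to \beta[XY]$ when $Y\to\beta$ is already in GNF. The `language` function enumerates short words so that the old and new grammars can be compared.

```python
def is_terminal(sym):
    # Assumption of this sketch: terminals are lowercase strings.
    return isinstance(sym, str) and sym.islower()


def to_gnf(productions):
    """Replace productions A -> B a by productions in Greibach form.

    productions: set of (head, body) pairs; each body is a tuple of the
    form (B, a), (a, B) or (a,), with a a terminal and B a nonterminal.
    """
    nonterminals = {head for head, _ in productions}
    left = [(h, b) for h, b in productions if not is_terminal(b[0])]
    gnf = [(h, b) for h, b in productions if is_terminal(b[0])]
    result = set(gnf)
    for start in nonterminals:
        for head, (inner, a) in left:
            # (start, inner) generates w whenever start =>+ inner w
            # using only productions of the form A -> B a.
            if head == start:
                result.add(((start, inner), (a,)))
            result.add(((start, inner), (a, (start, head))))
        for head, body in gnf:
            # Summarize start =>+ head w => body w in a single step.
            result.add((start, body + ((start, head),)))
    return result


def language(productions, start, max_len):
    """All words of length <= max_len derivable from start."""
    rules = {}
    for head, body in productions:
        rules.setdefault(head, []).append(body)
    seen, stack, words = {(start,)}, [(start,)], set()
    while stack:
        form = stack.pop()
        idx = next((i for i, s in enumerate(form) if not is_terminal(s)), None)
        if idx is None:
            words.add("".join(form))
            continue
        for body in rules.get(form[idx], []):
            new = form[:idx] + body + form[idx + 1:]
            # No production shrinks a sentential form, so long forms are dead.
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                stack.append(new)
    return words


# Example: S -> aX, X -> Sb | b generates a^n b^n for n >= 1.
grammar = {("S", ("a", "X")), ("X", ("S", "b")), ("X", ("b",))}
converted = to_gnf(grammar)
print(sorted(language(grammar, "S", 6), key=len))
print(sorted(language(converted, "S", 6), key=len))
```

The sketch creates a pair nonterminal for every combination of variables, so some of them are useless (they derive no terminal string). A proper construction would prune those, but they do not change the generated language.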

NB. Note this can partly undo your earlier preprocessing. I would not split a production of the form $A\to aBb$ into two productions $A\to aX$ and $X\to Bb$; it is better to avoid introducing new productions with a leading nonterminal.
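As a small illustration of that remark (my own reading of it, with $Y_b$ a hypothetical fresh nonterminal), a production $A\to aBb$ can be brought into GNF directly, without ever creating a production of the form $X\to Bb$:

$$A \to aBY_b, \qquad Y_b \to b.$$

Both productions start with a terminal followed only by nonterminals, and $A \Rightarrow aBY_b \Rightarrow^* aBb$ generates the same strings as before.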

Hendrik Jan