In the article "Parsing Expressions by Recursive Descent" by Theodore Norvell (1999), the author starts with the following grammar for arithmetic expressions:
E --> E "+" E | E "-" E | "-" E | E "*" E | E "/" E | E "^" E | "(" E ")" | v
which is problematic because it is ambiguous and left-recursive. So he starts by removing the left recursion, and his result is:
E --> P {B P}
P --> v | "(" E ")" | U P
B --> "+" | "-" | "*" | "/" | "^"
U --> "-"
But I can't figure out how he got to this result. When I try to remove the left recursion myself, I do it as follows:
First, I separate the productions that are not left-recursive from those that are:
E --> E "+" E | E "-" E | E "*" E | E "/" E | E "^" E // L-recursive
E --> v | "(" E ")" | "-" E
Next, I name them and factor them for easier manipulation:
E --> E B E // L-recursive; B stands for "Binary operator"
E --> P // not L-recursive; P stands for "Primary Expression"
P --> v | "(" E ")" | U E // U stands for "Unary operator"
B --> "+" | "-" | "*" | "/" | "^"
U --> "-"
Now I only need to deal with the first two productions, which are easier to handle in this form.
I rewrite those first two productions by starting with the non-left-recursive production (which is simply P, the Primary expression) and following it with the optional Tail T, which I define as the rest of the original production less the first left-recursive nonterminal (that is, just B E) followed by the Tail T, or which could be empty:
E --> P T
T --> B E T |
(note the empty alternative for the tail).
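To make sure I'm applying the textbook rule for immediate left recursion correctly (A --> A a | b becomes A --> b A', A' --> a A' | empty), here is a minimal sketch of that step. The function name `eliminate_immediate_left_recursion` and the list-of-symbols representation are my own assumptions for illustration, not anything taken from the article.

```python
def eliminate_immediate_left_recursion(head, alternatives, tail):
    """Rewrite   head --> head a1 | ... | b1 | ...
    as           head --> b1 tail | ...
                 tail --> a1 tail | ... | (empty)

    Each alternative is a list of symbols; [] is the empty alternative.
    `tail` is the (hypothetical) name chosen for the new nonterminal.
    """
    # Left-recursive alternatives, with the leading `head` stripped off.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    # Alternatives that do not start with `head`.
    others = [alt for alt in alternatives if not alt or alt[0] != head]

    new_head = [alt + [tail] for alt in others]
    new_tail = [alpha + [tail] for alpha in recursive] + [[]]
    return {head: new_head, tail: new_tail}


# E --> E B E | P, with T as the name of the tail.
result = eliminate_immediate_left_recursion("E", [["E", "B", "E"], ["P"]], "T")
```

Here `result` maps E to the single alternative P T, and T to the alternatives B E T and the empty one, which is the same pair of productions I derived by hand.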
These two productions I can now rewrite in EBNF like this:
E --> P {B E}
which is nearly what the author gets, but I have E instead of P inside the zero-or-more repetition (the Tail). The other productions come out almost the same as his:
P --> v | "(" E ")" | U E
B --> "+" | "-" | "*" | "/" | "^"
U --> "-"
but here too I have E instead of P, in the first of these productions (the one for P).
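As a quick sanity check (an experiment, not a proof), here is a small recognizer sketch that can be switched between the author's form and mine. The function name `recognizes` and its `inner` parameter are my own inventions; I assume the input is already split into tokens, with "v" standing for any operand.

```python
BINARY = {"+", "-", "*", "/", "^"}


def recognizes(tokens, inner):
    """Return True if `tokens` match
         E --> P {B X}
         P --> v | "(" E ")" | U X
    where X is "P" (the author's form) or "E" (my form), chosen by `inner`.
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def operand():
        # The only place where the two grammars differ.
        return parse_P() if inner == "P" else parse_E()

    def parse_E():
        nonlocal pos
        if not parse_P():
            return False
        while peek() in BINARY:  # zero or more repetitions of B X
            pos += 1
            if not operand():
                return False
        return True

    def parse_P():
        nonlocal pos
        tok = peek()
        if tok == "v":
            pos += 1
            return True
        if tok == "(":
            pos += 1
            if not parse_E() or peek() != ")":
                return False
            pos += 1
            return True
        if tok == "-":  # U --> "-"
            pos += 1
            return operand()
        return False

    # Accept only if the whole input was consumed.
    return parse_E() and pos == len(tokens)


samples = [
    ["v"],
    ["v", "+", "v"],
    ["-", "v", "*", "(", "v", "-", "v", ")"],
    ["-", "-", "v", "^", "v"],
    ["v", "+"],
    ["(", "v"],
    ["v", "v"],
    [],
]
agree = all(recognizes(s, "P") == recognizes(s, "E") for s in samples)
```

On these samples the two variants accept and reject the same inputs, which makes me suspect that the grammars describe the same language, but that still doesn't show me the transformation that turns one into the other.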
So, my question is: what am I missing? What algebraic transformation of the grammar do I need to apply now to get exactly the same form as the author's? I tried substituting for E, but that only leads me into loops. I suspect that I somehow need to substitute P for E, but I don't know of any legal transformation that justifies it. Maybe you know what the last missing step is?