I ask this question with regard to a grammar in Chomsky Normal Form. The definition states that the rules must be of the following forms:

  1. A $\rightarrow$ BC
  2. A $\rightarrow$ a
  3. S $\rightarrow$ $\epsilon$

Here A, B and C are non-terminal symbols, B and C are different from S, and a is a terminal symbol. The definition also states that the last rule may be present only if the language of the grammar contains the empty string (source). Now if $\epsilon$ were a terminal symbol, then this would be a contradiction, as the second rule could be instantiated as S $\rightarrow$ $\epsilon$, seeing as A need not be different from S. This leads me to conclude that $\epsilon$ is not a terminal symbol. Is this correct?

SBylemans

2 Answers

It's true that, in general, definitions don't include the empty string in the set of "terminals", as there's no need for that. For example, the production rules of a context-free grammar are defined as a relation between $V$ and $(V \cup \Sigma)^*$, written $V \rightarrow (V \cup \Sigma)^*$; the star already covers all productions of the form $A \rightarrow \epsilon$. In all other contexts, $\epsilon$ can be omitted because it has no effect on concatenation.

Note, however, that in the case of a CNF grammar, you couldn't just adopt a convention that "terminals" include the empty string and cut down to just the first two rules, because rules of the form $A \rightarrow \epsilon$ are not allowed for every non-terminal $A$, just for $S$.
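
To make the point concrete, here is a minimal sketch of a checker for the three rule forms listed in the question. The function name `is_cnf` and the representation of a rule as a `(head, body)` pair with `body` a tuple of symbols are assumptions made for illustration, not part of any standard definition. It does not try to verify the side condition about the language containing the empty string.

```python
def is_cnf(rules, nonterminals, terminals, start="S"):
    """Return True if every rule (head, body) has one of the three CNF forms.

    Hypothetical helper: rules are (head, body) pairs, body is a tuple of symbols.
    """
    for head, body in rules:
        if head not in nonterminals:
            return False
        if len(body) == 2:
            # Form 1: A -> BC, with B and C non-terminals other than the start symbol.
            ok = all(sym in nonterminals and sym != start for sym in body)
        elif len(body) == 1:
            # Form 2: A -> a, with a drawn from the alphabet.
            ok = body[0] in terminals
        elif len(body) == 0:
            # Form 3: the empty body is allowed only for the start symbol.
            ok = head == start
        else:
            ok = False
        if not ok:
            return False
    return True


nonterminals = {"S", "A", "B"}
terminals = {"a", "b"}
good = [("S", ()), ("S", ("A", "B")), ("A", ("a",)), ("B", ("b",))]
bad = good + [("A", ())]  # A -> epsilon with A != S
```

With these definitions, `is_cnf(good, nonterminals, terminals)` accepts the grammar, while `is_cnf(bad, nonterminals, terminals)` rejects it. If the empty string were simply treated as one more terminal, the second branch would have to accept `A -> epsilon` for every `A`, which is exactly what the definition rules out.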

potestasity

The empty string is not a terminal symbol. A terminal symbol is an element of the alphabet, but the empty string is not an element of the alphabet.

In fact, this is an issue that we have to address when defining the formal syntax of context-free grammars. Either we ask that $\epsilon \notin \Sigma$, or we encode $\epsilon$-rules as $A \to$ rather than as $A \to \epsilon$. There is a similar choice regarding the symbol $\to$: either we ask that $\to \notin \Sigma$, or we encode rules as pairs $(A,\alpha)$. There are of course other choices, such as allowing the "user" to choose the symbols standing for $\to$ and $\epsilon$.
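
As a rough illustration of the "rules as pairs $(A,\alpha)$" encoding, the sketch below stores each rule as a pair whose second component is a tuple of symbols, so an $\epsilon$-rule is just a pair with an empty tuple and no symbol for $\epsilon$ or $\to$ is ever needed. The names `EPSILON`, `alphabet`, `rules` and `apply_rule` are hypothetical and chosen only for this example.

```python
# The empty string is the empty sequence of symbols; it is not in the alphabet.
EPSILON = ()

alphabet = {"a", "b"}

# Rules as pairs (A, alpha): S -> a S b and S -> epsilon.
rules = [("S", ("a", "S", "b")), ("S", EPSILON)]


def apply_rule(sentential_form, position, rule):
    """Rewrite the symbol at `position` using `rule`, returning a new tuple."""
    head, body = rule
    if sentential_form[position] != head:
        raise ValueError("rule head does not match the symbol at this position")
    # Concatenating with an empty body simply deletes the non-terminal.
    return sentential_form[:position] + body + sentential_form[position + 1:]


step1 = apply_rule(("S",), 0, rules[0])  # S => a S b
step2 = apply_rule(step1, 1, rules[1])   # a S b => a b
```

Here the $\epsilon$-rule contributes nothing to the derived string, which matches the remark that $\epsilon$ has no effect on concatenation, and `EPSILON in alphabet` is false by construction.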

Yuval Filmus