I have this context-free grammar and I want to find out whether its language is finite or infinite.

S -> XY|bb  Step 1
X -> XY|SS  Step 2
Y -> XY|SS  Step 3

So I would do

S -> XY            From step 1
S -> XYY           From step 2
S -> SSYY          From step 2
S -> SSSSY         From step 3
S -> SSSSSS        From step 3
S -> bbSSSSS       From step 1
S -> bbbbSSSS      From step 1
S -> bbbbbbSSS     From step 1
S -> bbbbbbbbSS    From step 1
S -> bbbbbbbbbbS   From step 1
S -> bbbbbbbbbbbb  From step 1

bbbbbbbbbbbb 

So I know how to generate words like this but how to find out whether the language is finite or infinite?

babou
Dana

4 Answers

A language is infinite if it contains infinitely many words. In order to prove that the language generated by a grammar is infinite, you need to come up with some infinite list of words generated by the grammar. Proving that the language is finite is slightly more messy: you need to make a list of all possible derivations, and show that all of them terminate.

In your case, you have the derivation $S\to^*SSSS$, which suggests that the language is infinite. Can you come up with an infinite list of words generated by this grammar?
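Before attempting a proof, it can help to enumerate the short words mechanically and see whether the list keeps growing. The following is only a sketch and not part of the original answer: the names `GRAMMAR` and `words_up_to` are made up for illustration, and encoding the grammar as a dict from each non-terminal to a list of right-hand sides is an assumption. It computes, by fixpoint iteration, every terminal word of length at most `max_len` derivable from the start symbol.

```python
from itertools import product

# The grammar from the question; each right-hand side is a list of symbols.
# Any symbol that is not a key of the dict is treated as a terminal.
GRAMMAR = {
    "S": [["X", "Y"], ["b", "b"]],
    "X": [["X", "Y"], ["S", "S"]],
    "Y": [["X", "Y"], ["S", "S"]],
}


def words_up_to(grammar, start, max_len):
    """Return all terminal words of length <= max_len derivable from start."""
    words = {nt: set() for nt in grammar}
    changed = True
    while changed:  # repeat until no rule produces a new short word
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                # Snapshot the current word sets so they can grow safely
                # while we iterate over the combinations.
                choices = [sorted(words[s]) if s in grammar else [s]
                           for s in body]
                for parts in product(*choices):
                    w = "".join(parts)
                    if len(w) <= max_len and w not in words[nt]:
                        words[nt].add(w)
                        changed = True
    return words[start]


if __name__ == "__main__":
    for bound in (10, 20, 30):
        lengths = sorted(len(w) for w in words_up_to(GRAMMAR, "S", bound))
        print(bound, lengths)
```

A growing list is only evidence, not a proof: the bound is always finite, so an argument that works for every length is still needed.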

Yuval Filmus

The only way a finite system (grammar) can generate an infinite set of words is by repeating stuff: you start at $S$ and keep expanding until you hit a non-terminal that you've already expanded. So now you can just copy the sub-tree corresponding to that symbol and paste it. You can do this forever so it must be that the grammar generates infinitely many words.

You can use a directed graph to represent a context-free grammar: think of the grammar specification as being an adjacency list of some graph (in the first rule, there is an edge from $S$ to $X$ and from $S$ to $Y$, and so on). Now, provided every non-terminal is reachable from $S$ and derives some terminal word, the grammar generates an infinite set if and only if the graph is cyclic (this needs proof). A technicality is that the cycle should generate something other than the empty word. This gives a general procedure for deciding whether a grammar generates an infinite set of words or not.

The graph corresponding to your example looks like:

[Image: graph of the example grammar]

There are lots of cycles; $S \to X \to S$ is one example. (The language that the grammar generates is an infinite subset of $(bb)^*$.)
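The procedure described above can be sketched in code. This is a minimal illustration under stated assumptions, not a definitive implementation: the grammar is assumed to have no empty-word productions and no unit productions (true for the example), the dict encoding and the names `GRAMMAR`, `useful_edges` and `is_infinite` are hypothetical, and useless non-terminals are pruned first so that only cycles that can actually contribute to a word are counted.

```python
# The grammar from the question; symbols that are not keys are terminals.
GRAMMAR = {
    "S": [["X", "Y"], ["b", "b"]],
    "X": [["X", "Y"], ["S", "S"]],
    "Y": [["X", "Y"], ["S", "S"]],
}


def useful_edges(grammar, start):
    """Edges between non-terminals that are productive and reachable."""
    # 1. Productive non-terminals: those that derive some terminal word.
    productive = set()
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            if nt not in productive and any(
                all(s not in grammar or s in productive for s in body)
                for body in bodies
            ):
                productive.add(nt)
                changed = True
    # 2. Walk from the start symbol, using only fully productive rules.
    edges = {}
    stack = [start] if start in productive else []
    while stack:
        nt = stack.pop()
        if nt in edges:
            continue
        edges[nt] = set()
        for body in grammar[nt]:
            if all(s not in grammar or s in productive for s in body):
                targets = [s for s in body if s in grammar]
                edges[nt].update(targets)
                stack.extend(targets)
    return edges


def is_infinite(grammar, start):
    """True if the pruned non-terminal graph contains a cycle."""
    remaining = useful_edges(grammar, start)
    while True:
        # Nodes with no successor among the remaining nodes lie on no cycle.
        sinks = [n for n, succ in remaining.items()
                 if succ.isdisjoint(remaining)]
        if not sinks:
            break
        for n in sinks:
            del remaining[n]
    # Whatever survives has a successor inside the set, hence a cycle.
    return bool(remaining)


if __name__ == "__main__":
    print(is_infinite(GRAMMAR, "S"))
```

Peeling off sink nodes was chosen over a depth-first search only because it is short; any standard cycle-detection method would do.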

mrk

The language is infinite iff its grammar can generate an infinite number of words, or equivalently iff a recognizing automaton can recognize an infinite number of words.

This is something that you have to prove.

For that purpose you can rely on some facts.

  • A language is infinite if and only if it contains words of unbounded length, i.e. longer than any size you may choose.

    (Useful and easy exercise: prove that the total number of words of size less than some integer $n$ built on a finite alphabet $\Sigma$ is always finite. - this proves the above statement)

    This tells you two things

    • that if you can show that the language contains words of unlimited size, it is indeed infinite.
    • that you can always count on that property to prove language infinity
  • To prove that words can have unlimited size, you must use an induction proof. And, when using a grammar definition of the language, it will have to be based on non-terminals as they are the only part of a derived string that can be replaced by something longer.
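The counting exercise in the first bullet can be sketched as follows, writing $k = |\Sigma|$ (the symbol $k$ is introduced here only for convenience). There are exactly $k^i$ words of length $i$, so the number of words of length less than $n$ is

$$\left|\{\, w \in \Sigma^* : |w| < n \,\}\right| \;=\; \sum_{i=0}^{n-1} k^i \;=\; \begin{cases} \dfrac{k^n - 1}{k - 1} & \text{if } k \ge 2, \\[1ex] n & \text{if } k = 1, \end{cases}$$

which is finite for every $n$. Hence a language whose words all have length less than some $n$ is finite, and conversely an infinite language must contain words of unbounded length.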

For example, if the initial symbol derives a string that contains the initial symbol itself plus other symbols, including at least one terminal, and the only non-terminals in that string are ones that derive a terminal word ... then you may think of using that for an induction proof.

Well ... what about trying?

And try to think why I specify these constraints. (remember, the words of the language contain only terminals)

By the way, the recursion must sometimes be on a non-terminal other than the initial one ($S$). But $S$ will do in your case.
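For readers who want to check their attempt, here is one way the induction could go for the grammar in the question. It is a sketch of one possible argument, not necessarily the one intended above. First,

$$S \Rightarrow XY \Rightarrow SSY \Rightarrow SSSS \Rightarrow^* bb\,bb\,bb\,S = b^6 S .$$

Claim: $S \Rightarrow^* b^{6k} S$ for every $k \ge 0$. The case $k = 0$ is trivial. If $S \Rightarrow^* b^{6k} S$, then applying $S \Rightarrow^* b^6 S$ to the trailing $S$ gives $S \Rightarrow^* b^{6(k+1)} S$. Finishing with the rule $S \to bb$ yields

$$S \Rightarrow^* b^{6k+2} \quad \text{for all } k \ge 0,$$

so the language contains words of unbounded length and is therefore infinite.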

babou

I recall from the 1960s a technique, for at least some classes of formal grammars, for finding cycles (loops that make the generated language infinite) in a grammar. A Boolean matrix is created from all the rules (productions) of the grammar. All terminals and non-terminals are listed along both the vertical and the horizontal side of the matrix. There is a zero in the matrix where the symbol on the vertical side does not directly produce the symbol on the horizontal side, and a one where the left part (row title) directly produces the right part (column title).

This captures all the direct transitions. The transitive closure of this matrix is then computed; instead of taking repeated Boolean matrix products, the computation is simplified by the use of Warshall's algorithm. The result is the transitive closure of the "directly produces" relation of the grammar. The grammar is cycle free if the main diagonal of the resulting matrix is all zeros.
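A minimal sketch of this matrix technique, applied to the grammar from the question, might look as follows. The dict encoding and the names `GRAMMAR` and `closure_matrix` are assumptions made for illustration; the closure is computed with Warshall's algorithm on a Boolean matrix indexed by all symbols.

```python
# The grammar from the question; symbols that are not keys are terminals.
GRAMMAR = {
    "S": [["X", "Y"], ["b", "b"]],
    "X": [["X", "Y"], ["S", "S"]],
    "Y": [["X", "Y"], ["S", "S"]],
}


def closure_matrix(grammar):
    """Return (symbols, m) where m[i][j] is True iff symbols[i] produces
    symbols[j] in one or more steps."""
    # Rows and columns: the non-terminals first, then any terminals seen.
    symbols = list(grammar)
    for bodies in grammar.values():
        for body in bodies:
            for s in body:
                if s not in symbols:
                    symbols.append(s)
    index = {s: i for i, s in enumerate(symbols)}
    n = len(symbols)

    # Direct transitions: left part (row) directly produces right part (column).
    m = [[False] * n for _ in range(n)]
    for left, bodies in grammar.items():
        for body in bodies:
            for s in body:
                m[index[left]][index[s]] = True

    # Warshall's algorithm: allow paths through symbol k, for each k in turn.
    for k in range(n):
        for i in range(n):
            if m[i][k]:
                for j in range(n):
                    if m[k][j]:
                        m[i][j] = True
    return symbols, m


symbols, m = closure_matrix(GRAMMAR)
# Symbols with a True entry on the main diagonal lie on a cycle.
on_cycle = [s for i, s in enumerate(symbols) if m[i][i]]
print(on_cycle)
```

A non-zero diagonal only shows that the grammar has a cycle; concluding that the language is infinite additionally requires that the symbols on the cycle are reachable from the start symbol, derive terminal words, and that the cycle adds at least one terminal.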