22

Is there an algorithm/systematic procedure to test whether a language is context-free?

In other words, given a language specified in algebraic form (think of something like $L=\{a^n b^n a^n : n \in \mathbb{N}\}$), test whether the language is context-free or not. Imagine we are writing a web service to help students with all their homework; you specify the language, and the web service outputs "context-free" or "not context-free". Is there any good approach to automating this?

There are of course techniques for manual proof, such as the pumping lemma, Ogden's lemma, Parikh's lemma, the Interchange lemma, and more here. However, they each require manual insight at some point, so it's not clear how to turn any of them into something algorithmic.

I see Kaveh has written elsewhere that the set of non-context-free languages is not recursively enumerable, so it seems there is no hope for any algorithm to work on all possible languages. Therefore, I suppose the web service would need to be able to output "context-free", "not context-free", or "I can't tell". Is there any algorithm that would often be able to provide an answer other than "I can't tell", on many of the languages one is likely to see in textbooks? How would you build such a web service?


To make this question well-posed, we need to decide how the user will specify the language. I'm open to suggestions, but I'm thinking something like this:

$$L = \{E : S\}$$

where $E$ is a word-expression and $S$ is a system of linear inequalities over the length-variables, with the following definitions:

  • Each of $x,y,z,\dots$ is a word-expression. (These represent variables that can hold any word in $\Sigma^*$.)

  • Each of $a,b,c,\dots$ is a word-expression. (Implicitly, $\Sigma=\{a,b,c,\dots\}$, so each of $a,b,c,\dots$ represents a single symbol in the underlying alphabet.)

  • Each of $a^\eta,b^\eta,c^\eta,\dots$ is a word-expression, if $\eta$ is a length-variable.

  • The concatenation of word-expressions is a word-expression.

  • Each of $m,n,p,q,\dots$ is a length-variable. (These represent variables that can hold any natural number.)

  • Each of $|x|,|y|,|z|,\dots$ is a length-variable. (These represent the length of a corresponding word.)

This seems broad enough to handle many of the cases we see in textbook exercises. Of course, you can substitute any other textual method of specifying a language in algebraic form, if you like.
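To make the format concrete, here is a minimal sketch of how a restricted form of such a specification could be represented and enumerated. It assumes every block has the form $a^\eta$ (word variables $x,y,z$ are left out) and that the system $S$ is passed as an ordinary Python predicate; the function name and parameters are hypothetical. Enumerating members does not decide context-freeness, but it gives a tool something to test heuristics against.

```python
from itertools import product

def enumerate_language(blocks, variables, constraint, bound):
    """Yield the words of {E : S} for all variable values in 0..bound.

    blocks     -- list of (symbol, variable_name) pairs, e.g. [("a", "n"), ("b", "n")]
    variables  -- names of the length-variables used in blocks
    constraint -- predicate on a dict of variable values (stands in for S)
    """
    seen = set()
    for values in product(range(bound + 1), repeat=len(variables)):
        env = dict(zip(variables, values))
        if not constraint(env):
            continue
        word = "".join(symbol * env[var] for symbol, var in blocks)
        if word not in seen:
            seen.add(word)
            yield word

# L = {a^n b^n a^n : n in N}, enumerated for n = 0..3
words = sorted(
    enumerate_language([("a", "n"), ("b", "n"), ("a", "n")],
                       ["n"], lambda env: True, bound=3),
    key=len,
)

# L' = {a^m b^n : m < n}, enumerated for m, n = 0..2
strict = set(
    enumerate_language([("a", "m"), ("b", "n")],
                       ["m", "n"], lambda env: env["m"] < env["n"], bound=2)
)
```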

reinierpost
D.W.

6 Answers

1

The way I think we could handle this problem is by devising a language whose context-freeness depends on whether a Turing machine halts on given inputs. If we can do this, we can reduce an undecidable halting question to the problem $L \in CFL$ and deem the latter undecidable.

Let $L$ be a recursively enumerable language and let $M$ be a Turing machine such that $\mathcal{L}(M) = L$. Let $$ L_{M} = \{a^n c^k b^n | n > 0 \land (k = n \iff M(n)\uparrow) \land (k = 0 \iff M(n)\downarrow)\} $$

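We have that $L_{M}$ is context-free if and only if $k = 0$ for all but finitely many $n$: finitely many words of the form $a^n c^n b^n$ do not affect context-freeness, whereas no infinite subset of $\{a^n c^n b^n : n > 0\}$ is context-free. That is, $$ L_{M} \in CFL \iff M(n)\downarrow \text{ for all but finitely many } n $$ which is undecidable. Therefore, there can't be an algorithm which can decide, in general, if $L \in CFL$.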
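Unpacking the definition may make the structure of $L_{M}$ easier to see. Since $n > 0$, the two conditions on $k$ are mutually exclusive, so the language splits into two parts:

$$ L_{M} = \{a^n b^n \mid n > 0,\ M(n)\downarrow\} \;\cup\; \{a^n c^n b^n \mid n > 0,\ M(n)\uparrow\} $$

Intersecting with the regular language $a^* c^+ b^*$ isolates the second part:

$$ L_{M} \cap a^* c^+ b^* = \{a^n c^n b^n \mid n > 0,\ M(n)\uparrow\} $$

Context-free languages are closed under intersection with regular languages, so whether $L_{M}$ is context-free hinges on this second part, which in turn is determined by the inputs on which $M$ diverges.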

ecmm
0

By Rice's theorem, it is undecidable whether the language accepted by a Turing machine has a given non-trivial property (here: being context-free). So you would have to restrict the power of your recognizing machinery (or description) to make it not Turing complete to hope for an answer.

For some language descriptions the answer is trivial: If it is by regular expressions, it is regular, thus context free. If it is by context free grammars, ditto.

vonbrand
-1

There are several methods to solve this. Let me discuss them one by one.

  1. Try to write a grammar for the language, then check whether the left-hand side of every production is exactly one non-terminal symbol. If so, the grammar is context-free, and so is the language.
    Ex: if $aA\rightarrow Bc$ is among the productions, then the grammar is not context-free (although another grammar for the same language might still be).

  2. Take a stack and push the symbols of a word onto it, popping a symbol whenever a later symbol has to be matched against it. For example, in the language
    $L = \{a^n b^n c^n : n \in \mathbb{N}\}$
    push all the $a$'s onto the stack and pop one $a$ for each $b$. The stack is then empty, so nothing is left to match against the remaining $n$ $c$'s, and this language is not context-free. As a rule of thumb, a context-free language allows only one such comparison, but here there are two. It is a context-sensitive language (CSL), which allows two comparisons.

  3. The next approach is not straightforward, and it can only be used to show that a language is not a context-free language (CFL). It is called the Pumping Lemma.
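The check in item 1 is easy to mechanize. The following is only a sketch with a hypothetical representation (productions as pairs of symbol tuples); it inspects the form of one particular grammar and says nothing about whether some other grammar for the same language exists.

```python
def is_context_free_grammar(productions, nonterminals):
    """Return True if every production's left-hand side is a single non-terminal.

    productions  -- list of (lhs, rhs) pairs, each side a tuple of symbols
    nonterminals -- set of symbols that are non-terminals
    """
    return all(len(lhs) == 1 and lhs[0] in nonterminals for lhs, _ in productions)

# S -> a S b | epsilon
cfg = [(("S",), ("a", "S", "b")), (("S",), ())]

# Contains aA -> Bc, whose left-hand side has two symbols
non_cfg = [(("S",), ("a", "A")), (("a", "A"), ("B", "c"))]
```

Here `is_context_free_grammar(cfg, {"S"})` holds, while the second grammar fails the check because of the production $aA\rightarrow Bc$.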

greybeard
-2

Yes, there is the Pumping Lemma, which is a negative test: it can only show that a language is not context-free.
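Part of a pumping argument can be brute-forced. The sketch below, with hypothetical function names, takes a hand-picked word and pumping length $p$ and tries every split $uvxyz$ with $|vxy| \le p$ and $|vy| \ge 1$. A `False` result means no split of that word pumps, which is the core step of the manual proof for that one $p$. A `True` result is inconclusive, since only exponents up to `max_i` are tried. Choosing the word for every $p$ still takes human insight.

```python
def in_anbncn(word):
    """Membership test for {a^n b^n c^n : n >= 0}."""
    n = len(word) // 3
    return word == "a" * n + "b" * n + "c" * n

def in_anbn(word):
    """Membership test for {a^n b^n : n >= 0}."""
    n = len(word) // 2
    return word == "a" * n + "b" * n

def survives_pumping(word, p, member, max_i=3):
    """Return True if some split word = u v x y z with |vxy| <= p and |vy| >= 1
    keeps u v^i x y^i z in the language for every i in 0..max_i."""
    n = len(word)
    for start in range(n):                      # index where v begins
        limit = min(start + p, n)               # vxy must end by here
        for v_end in range(start, limit + 1):
            for x_end in range(v_end, limit + 1):
                for y_end in range(x_end, limit + 1):
                    u, v = word[:start], word[start:v_end]
                    x, y, z = word[v_end:x_end], word[x_end:y_end], word[y_end:]
                    if not v and not y:
                        continue                # |vy| >= 1 is required
                    if all(member(u + v * i + x + y * i + z)
                           for i in range(max_i + 1)):
                        return True
    return False

p = 4
fails_for_anbncn = not survives_pumping("a" * p + "b" * p + "c" * p, p, in_anbncn)
pumps_for_anbn = survives_pumping("a" * p + "b" * p, p, in_anbn)
```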

-3

Try the JFLAP software if you just want to check a CFG. You could maybe even ask the JFLAP developers for the code or the algorithm behind the software. JFLAP is available at http://www.jflap.org/jflaptmp/ and is free, although it requires a JDK or JRE. Alternatively, you could try other similar software and ask its developers.

-4

Any language that is accepted by a pushdown automaton is a CFL. Here is a detailed breakdown of how to determine whether a language is a CFL or not: check if language is CFL or not
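As an illustration of the pushdown-automaton criterion, here is a small sketch that simulates a deterministic stack machine for $\{a^n b^n : n \ge 0\}$. The function name and state labels are made up for this example; exhibiting such a machine shows that this particular language is context-free, but finding one for an arbitrary language is not automatic.

```python
def accepts_anbn(word):
    """Simulate a deterministic pushdown automaton for {a^n b^n : n >= 0}."""
    stack = []
    state = "reading_a"
    for symbol in word:
        if state == "reading_a" and symbol == "a":
            stack.append("A")          # remember one unmatched a
        elif symbol == "b" and stack:
            state = "reading_b"        # from now on only b's are allowed
            stack.pop()                # match this b against one a
        else:
            return False               # symbol out of order or unmatched b
    return not stack                   # accept only if every a was matched
```

For instance, `accepts_anbn("aabb")` is accepted, while `"abab"` and `"aab"` are rejected.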

SiluPanda