Can there be a context-sensitive pumping lemma?

Question

"Pumping" properties (sufficiently long words imply the existence of loops in the language-defining mechanism) are known to exist for regular and context-free languages and a few other classes. They are usually used to disprove a language's membership in a certain class.

Within the discussion around this question, Daisy's answer suggests that there can't be a pumping lemma for context-sensitive languages, because they are so complex.

Is that true - can it be shown that there can't be some type of pumping property - and is there a good reference for that (or against that)?

Georg Zetzsche · Accepted Answer · 2016-07-20T17:31:19.980

Here is some evidence that there is no pumping lemma for the context-sensitive languages.

Of course, an answer hinges on the question of what constitutes a pumping lemma. The weakest reasonable definition I could think of is this: A language class $\mathcal{C}$ has a pumping lemma if there is a decidable ternary predicate $P(\cdot,\cdot,\cdot)$ where $P(g,w,d)$ means:

$g$ is a word encoding a language $L(g)$ from $\mathcal{C}$ (think: grammar),
$w$ is a word in the language encoded by $g$, and
$d$ is a word encoding a pumpable computation/derivation for $w$ (think: NFA computation with repeated state or CFG derivation tree with repeated nonterminal). Here, pumpable means: the existence of such a $d$ implies that there exist infinitely many words in $L(g)$.

Moreover, we want that given a language $L$ in $\mathcal{C}$ encoded by $g$, for every sufficiently long word $w\in L$, there exists a word $d$ such that $P(g,w,d)$.

For example, the pumping lemma for regular languages would give rise to the predicate "$g$ encodes an $\varepsilon$-free NFA and $d$ encodes a run that repeats a state and reads $w$". For suitable encodings, this clearly satisfies the above conditions.
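A minimal sketch of this example predicate, assuming one particular encoding that is not from the answer: the hypothetical `NFA` dataclass stands in for $g$, and $d$ is given as the list of states visited by the run. The function name `is_pumpable_run` is likewise an illustrative choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NFA:
    """Hypothetical encoding of g: an epsilon-free NFA."""
    initial: frozenset
    final: frozenset
    delta: frozenset  # set of (source, symbol, target) triples


def is_pumpable_run(g: NFA, w: str, d: list) -> bool:
    """P(g, w, d): d is an accepting run of g on w that repeats a state."""
    if len(d) != len(w) + 1:
        return False  # a run on w visits exactly len(w) + 1 states
    if d[0] not in g.initial or d[-1] not in g.final:
        return False  # the run must start initial and end final
    for i, symbol in enumerate(w):
        if (d[i], symbol, d[i + 1]) not in g.delta:
            return False  # every step must be a transition of g
    # A repeated state means the run contains a loop. Because g is
    # epsilon-free, the loop reads a nonempty factor that can be pumped.
    return len(set(d)) < len(d)


# Illustrative NFA for the language (ab)*.
ab_star = NFA(
    initial=frozenset({0}),
    final=frozenset({0}),
    delta=frozenset({(0, "a", 1), (1, "b", 0)}),
)
```

With this encoding, `is_pumpable_run(ab_star, "ab", [0, 1, 0])` holds because state `0` repeats, while the accepting run `[0]` on the empty word is rejected because it contains no loop. Every accepting run on a word at least as long as the number of states repeats a state, which is the "sufficiently long" condition.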

Now let us show that such a predicate does not exist for the context-sensitive languages.

Observe that if a language class has a pumping lemma, then the infinity problem (Given a grammar, does it generate an infinite language?) is recursively enumerable: Given an encoding $g$, we can enumerate all pairs of words $w$ and $d$ and check whether $P(g,w,d)$ holds. As soon as we find such $w,d$, we answer 'yes'; otherwise, we continue the enumeration.
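The enumeration can be sketched as follows. This is only an illustration under assumptions: `P` is any decidable predicate passed in as a Python function, the `budget` parameter is added so that the demo terminates (a real semi-decision procedure runs without a bound), and `toy_P` is a made-up stand-in rather than an actual pumping predicate.

```python
from itertools import count, islice, product


def pairs(alphabet):
    """Yield every pair (w, d) of words exactly once, by increasing len(w) + len(d)."""
    for total in count():
        for n in range(total + 1):
            for w in product(alphabet, repeat=n):
                for d in product(alphabet, repeat=total - n):
                    yield "".join(w), "".join(d)


def semi_decide_infinite(g, P, alphabet, budget=None):
    """Answer True once some (w, d) with P(g, w, d) is found.

    With budget=None this loops forever on 'no' instances, as a
    semi-decision procedure may. With a budget it gives up and returns None.
    """
    candidates = pairs(alphabet)
    if budget is not None:
        candidates = islice(candidates, budget)
    for w, d in candidates:
        if P(g, w, d):
            return True
    return None  # budget exhausted, no verdict


def toy_P(g, w, d):
    """Hypothetical stand-in for P: one witness pair exists iff g == "infinite"."""
    return g == "infinite" and w == "ab" and d == "ba"
```

Here `semi_decide_infinite("infinite", toy_P, "ab", budget=1000)` finds the witness pair, while the same call with `"finite"` exhausts its budget without a verdict. Enumerating pairs by total length guarantees that every pair is eventually reached, which is what makes the procedure complete on 'yes' instances.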

However, we show that the infinity problem for the context-sensitive languages is not recursively enumerable. Recall that $\Pi_2^0$ is a level of the arithmetic hierarchy that strictly includes the recursively enumerable languages. Hence, it suffices to prove:

Claim: The infinity problem for the context-sensitive languages is $\Pi_2^0$-complete.

It is well-known that the infinity problem for recursively enumerable languages is $\Pi_2^0$-complete (more often, one finds the formulation that the finiteness problem is $\Sigma_2^0$-complete). Hence, it suffices to reduce the latter problem to the infinity problem for the context-sensitive languages.

Given a TM $M$, we construct an LBA $A$ for the language $$ \{u\#v \mid \text{$v$ is a shortlex-minimal accepting computation of $M$ on input $u$}\}. $$ Then, $L(A)$ is infinite iff $L(M)$ is infinite, which completes our proof.
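
One way to spell out why the last equivalence holds (a sketch; the notation $v_u$ is introduced here and is not from the answer): every $u \in L(M)$ has exactly one shortlex-minimal accepting computation, call it $v_u$, and words outside $L(M)$ have none. So, assuming $\#$ does not occur in the inputs of $M$, $$ L(A) = \{u\#v_u \mid u \in L(M)\}, \qquad u \mapsto u\#v_u \ \text{is a bijection}\ L(M) \to L(A). $$ Hence $L(A)$ and $L(M)$ have the same cardinality, and in particular one is infinite exactly when the other is.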

