A Myhill-Nerode type characterization of the regular languages using fooling sets?

Question

Ultimately, my question is whether it's possible to exactly characterize the regular languages in terms of fooling sets. To help explain my motivation for asking this, here's a quick exposition.

Let $L$ be a language. A set of strings $S$ is called pairwise distinguishable relative to $L$ if for any two distinct strings $x, y \in S$, there is a string $w \in \Sigma^*$ such that exactly one of $xw$ and $yw$ belongs to $L$.

Assuming the axiom of choice, the Myhill-Nerode theorem can be framed in terms of pairwise distinguishable sets:

Theorem: let $L$ be a language. Then $L$ is nonregular if and only if there is an infinite set $S$ that is pairwise distinguishable relative to $L$.

Now, let's consider a related definition. If $L$ is a language, a set $S \subseteq \Sigma^* \times \Sigma^*$ is called a fooling set for $L$ if for any distinct pairs $(x_i, y_i), (x_j, y_j) \in S$, then $x_i y_i \in S, x_j y_j \in S$, and at least one of $x_i y_j$ and $x_j y_i$ isn't in $L$.

I'm curious whether the following statement is true:

Conjecture: let $L$ be a language. Then $L$ is nonregular if and only if there is an infinite set $S$ that is a fooling set for $L$.

I can prove one direction of this using the Myhill-Nerode theorem: if there is an infinite fooling set $S$ for a language $L$, then we can take the first halves of each pair in $S$, gather them into a set $S'$, and we'll have an infinite set $S$ that is pairwise distinguishable relative to $L$, so $L$ is nonregular.

However, I keep getting stuck trying to go the other way. I'm not sure how to show that if $L$ is nonregular, then there must be an infinite fooling set for $L$. The equivalent direction of the Myhill-Nerode theorem follows by the contrapositive and the fact that the binary relation "is indistinguishable relative to $L$" is an equivalence relation. We don't - I believe - have a similar equivalence relation when it comes to fooling sets.

Does the other direction of implication hold here? Or is this a strictly weaker criterion than the Myhill-Nerode theorem?

Thanks!

score 2 · Accepted Answer · answered Nov 03 '15 at 20:25

Your conjecture is refuted by the language $\{ 0^n : \text{$n$ is not a power of $2$}\}$. Let $\{(x_i,y_i) : i \in \mathbb{N}\}$ be an infinite fooling set for this language. We can identify $x_i,y_i$ with integers. These pairs have to satisfy the following conditions:

$x_i + y_i$ is not a power of $2$.
For each $i \neq j$, either $x_i + y_j$ is a power of $2$ or $x_j + y_i$ is a power of $2$.

Color $\mathbb{N}^2$ with two colors according to whether $x_i + y_j$ is a power of $2$ or $x_j + y_i$ is a power of $2$, where $i < j$. Ramsey's theorem shows that without loss of generality we can assume that for all $i < j$, $x_i + y_j$ is a power of $2$.

If $x_i = x_j$ for $i < j$ then $x_j + y_j$ is a power of $2$, contradicting (1) above. Similarly, if $y_i = y_j$ for $i < j$ then $x_i + y_i$ is a power of $2$, contradicting (1) above. Thus all $x_i$ are distinct and all $y_j$ are distinct.

Assume without loss of generality that $x_1 < x_2$. Let $x_1 + y_3 = 2^a$ and $x_2 + y_3 = 2^b$. Choose $j$ so that $y_j > 2^b$. Let $x_1 + y_j = 2^c$. Then $2^c < x_2 + y_j < 2^c + 2^b < 2^{c+1}$, and we obtain a contradiction.

A Myhill-Nerode type characterization of the regular languages using fooling sets?

1 Answers1