4

Let $L$ be the language of infinite words in $\{0,1\}^\omega$ such that every finite prefix of a word in $L$ has at least as many $0$s as $1$s. Is $L$ Büchi recognisable?

I think that $L$ is not $\omega$-regular, but standard tricks such as complementation or intersecting with some other language don't seem to work.

John L.
Jerry Tao

3 Answers

2

As you suspected, $L$ is not Büchi recognisable/$\omega$-regular. Here is a proof.


Towards a contradiction, suppose $L$ is $\omega$-regular. Then $$L= A_1B_1^\omega\cup A_2B_2^\omega\cup\cdots\cup A_nB_n^\omega$$ for some $n$, where $A_i$ and $B_i$ are regular languages for all $i$.

Consider $s=0^11^10^21^20^31^3\cdots\in L$. WLOG, suppose $s\in A_1B_1^\omega$. Suppose $s=at$, where $a\in A_1$, $t\in B_1^\omega$.

Let $p$ be a pumping length for $B_1$ as in the general version of the pumping lemma for regular languages, as stated on Wikipedia.

Consider the block $\ell=1^{m}$ with $m=\max(|a|, 2p)$ that appears in $s$. $\ell$ must appear after $a$, since the block $0^{m}$, of length $m\ge|a|$, appears before $\ell$ in $s$. In other words, $\ell$ is a substring of $s$ with $a$ removed, which is $t$. Since $t\in B_1^\omega$, we can fix a factorisation $t=b_1b_2b_3\cdots$ where each $b_i$ is a nonempty word in $B_1$. There are two cases.

  • $\ell$ contains some $b_i$ entirely.
    Let $g$ be that word, which is nonempty and contains only $1$s.

  • $\ell$ contains no $b_i$ entirely, so $\ell$ lies inside a single $b_i$ or straddles two consecutive words $b_ib_{i+1}$.
    Hence $B_1$ contains a word that contains at least $\frac{2p}2=p$ consecutive $1$s. Let $uwv$ be such a string, where $w=1^p$.

    We can write $w=xyz$, so that $uwv=uxyzv$, for some strings $x,y,z$ with $|y|\ge1$ such that $uxy^izv\in B_1$ for all $i\ge0$, thanks to the general version of the pumping lemma. Since $w$ is all $1$s, so is $y$. Hence for $i$ large enough, $uxy^izv$ has more $1$s than $0$s. Let $g$ be such a word, i.e. $g\in B_1$ has more $1$s than $0$s.

In both cases, we have identified a word $g\in B_1$ that has more $1$s than $0$s. Then $ag^\omega\in A_1B_1^\omega\subseteq L$. However, $ag^\omega\notin L$ since its prefix $ag^{|a|+1}$ has more $1$s than $0$s. This is a contradiction.
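
The last step can be checked mechanically on small instances. The following is only a sketch in Python: the helper name `balance_violation` and the sample words `a` and `g` are hypothetical choices for illustration, not part of the proof. It shows that $|a|+1$ copies of a word $g$ with more $1$s than $0$s are enough to produce a bad prefix, and that fewer copies may not be.

```python
def balance_violation(word):
    """Return the length of the shortest prefix of `word` that has more
    1s than 0s, or None if no such prefix exists."""
    diff = 0  # (#0s) - (#1s) read so far
    for pos, ch in enumerate(word, start=1):
        diff += 1 if ch == "0" else -1
        if diff < 0:
            return pos
    return None


# Hypothetical sample data: `a` is a prefix with no violation,
# `g` has exactly one more 1 than 0s.
a = "000"
g = "011"

# |a| copies of g are not enough here, |a| + 1 copies are.
print(balance_violation(a + g * len(a)))        # None
print(balance_violation(a + g * (len(a) + 1)))  # 15
```

With this choice of `a` and `g` the bound $|a|+1$ is tight: the violation only shows up at the very last letter of $ag^{|a|+1}$.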

John L.
0

This kind of proposition can also be proved in a more abstract, automaton-independent way.

Suppose there exists a Büchi automaton $\mathcal{A}$ that recognizes $L$. In order to accept or reject an infinite word, when given as input a finite prefix $x$, such an automaton would need to reach a state that keeps count of how many $0$s and $1$s there are in $x$, or alternatively of the difference between these two numbers. In other words, the states of $\mathcal{A}$ would have to distinguish every value in $\mathbb{N}$, i.e. there must be at least one distinct state for each natural number. This is absurd since by definition $\mathcal{A}$ has a finite number of states.

This reasoning can be similarly applied to many ($\omega$-)languages that "require counting" in order to accept them and classes of automata that have finite memory, e.g. automata that have a finite number of states and no additional memory like a stack or a tape.


Addendum: let's formalize what is stated above.

Suppose there exists a Büchi automaton $\mathcal{A} = (Q,\Sigma,\delta,q_0,F)$ that recognizes $L$.

Let $\{x_i\}_{i \in \mathbb{N}}$ be any sequence of finite words over $\Sigma$ such that for all $i \in \mathbb{N}$, every prefix of $x_i$ (including $x_i$ itself) has at least as many $0$s as $1$s, and $x_i$ has $n$ occurrences of $0$ and $m$ occurrences of $1$, where $n-m=i$.

For all $i \in \mathbb{N}$, there exists an $\omega$-word which is in $L$ and of which $x_i$ is a prefix (for instance $x_i \cdot 0^\omega$). We can conclude that for all $i \in \mathbb{N}$ the set $\delta^*(q_0, x_i)$ is nonempty.

Let $i < j$. There exists $\beta \in \Sigma^\omega$ such that $x_i \cdot \beta \notin L$ and $x_j \cdot \beta \in L$ *. Let $\sigma$ be an accepting computation of $x_j \cdot \beta$, and let $p \in \delta^*(q_0, x_j)$ be the state that is reached in $\sigma$ after reading $x_j$. If $p$ were in $\delta^*(q_0, x_i)$, it would imply that there exists an accepting computation of $x_i \cdot \beta$, which is a contradiction.

Since $\beta$ can be chosen depending only on $j$ (see the footnote), the same state $p_j$ works for every $i < j$. We can conclude that for all $j \in \mathbb{N}$ $$\exists p_j\in \delta^*(q_0, x_j): \forall i < j: p_j \notin \delta^*(q_0, x_i)$$ and thus, since $p_i \in \delta^*(q_0, x_i)$, for all $i < j$ $$p_i \neq p_j.$$

In other words, there exists a sequence $\{p_i\}_{i \in \mathbb{N}}$ of distinct states, and hence $p: \mathbb{N} \to Q$ is an injective function. It follows that $|\mathbb{N}| \leq |Q|$, which is absurd because, by definition, $Q$ is finite.

* Take for instance $\beta = 1^{j} \cdot 0^\omega$.
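
The choice of $\beta$ in the footnote can be sanity-checked on finite truncations. This is only a sketch in Python under stated assumptions: the words $x_i = 0^i$ are one hypothetical choice of the sequence, and `stays_balanced`, `x` and `beta_prefix` are illustrative helper names. A finite truncation suffices because a violation is always witnessed by a finite prefix, and after the block $1^j$ the word $\beta$ only adds $0$s.

```python
def stays_balanced(word):
    """True if every prefix of `word` has at least as many 0s as 1s."""
    diff = 0  # (#0s) - (#1s) read so far
    for ch in word:
        diff += 1 if ch == "0" else -1
        if diff < 0:
            return False
    return True


def x(i):
    """Hypothetical choice of x_i: the word 0^i, so that n - m = i."""
    return "0" * i


def beta_prefix(j, tail=5):
    """Finite truncation 1^j 0^tail of beta = 1^j 0^omega."""
    return "1" * j + "0" * tail


for j in range(1, 8):
    # x_j . beta never dips below zero ...
    assert stays_balanced(x(j) + beta_prefix(j))
    # ... while x_i . beta does, for every i < j.
    for i in range(j):
        assert not stays_balanced(x(i) + beta_prefix(j))
```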

matteo_c
0

Here is a direct proof:

Assume towards contradiction that $L$ is Büchi-recognizable and let $A = \langle \Sigma, Q, q_0, \delta, \alpha\rangle$ be a nondeterministic Büchi automaton for $L$. Let $n = |Q|$ denote the number of states in $A$, and consider the word $w = (0^n\cdot 1^n)^\omega$. As $w\in L$, there is an accepting run $r = r_0, r_1, r_2, \ldots$ of $A$ on $w$. As $r$ is accepting, and $|Q|$ is finite, it follows that there is a reachable state $q\in Q$ and an integer $k\geq 1$ such that $q \xrightarrow{(0^n \cdot 1^n)^k} q$ is a cycle that visits a state in $\alpha$. Indeed, such a cycle is traversed by the run $r$: split $r$ into consecutive sub-runs $\rho_i$ of the form $\rho_i = q_i \xrightarrow{(0^n\cdot 1^n)^{k_i} }q_{i+1}$, each of which visits a state in $\alpha$. By the pigeonhole principle, $q_i = q_{i'}$ for some $i < i'$, and the concatenation $\rho_i \cdots \rho_{i'-1}$ is the desired cycle.

Now consider the cycle $r_c = q \xrightarrow{(0^n \cdot 1^n)^k} q = q\xrightarrow{0^n} s_0 \xrightarrow{1} s_1 \xrightarrow{1}s_2 \cdots \xrightarrow{1} s_n \xrightarrow{(0^n\cdot 1^n)^{k-1}}q$. Again, as the sub-run $r_s = s_0, s_1, \ldots, s_n$ visits $n+1 > |Q|$ states, it follows that $r_s$ traverses a nontrivial cycle. Formally, there are $0\leq j < l \leq n$ with $s_j = s_l$. Now recall that $q$ is reachable, consider a finite word $x$ with $q_0 \xrightarrow{x} q$, and consider the following cycle, which is obtained from $r_c$ by pumping a cycle in the sub-run $r_s$ $t$ times: $$r_{c_t} = q\xrightarrow{0^n} s_0 \xrightarrow{1} s_1 \xrightarrow{1}s_2 \cdots (s_j \xrightarrow{1^{l-j}} )^t \xrightarrow{1} s_{l+1} \cdots \xrightarrow{1} s_n \xrightarrow{(0^n\cdot 1^n)^{k-1}}q $$

We can now consider the following run of $A$ over the infinite word $w_t = x\cdot (0^n \cdot 1^{n+(t-1)\cdot (l-j)} \cdot (0^n\cdot 1^n)^{k-1})^\omega$: $$ r' = q_0 \xrightarrow{x} r^\omega_{c_t} $$

On the one hand, the run $r'$ is accepting as the sub-cycle $r_{c_t}$ visits a state in $\alpha$, so $w_t \in L$. On the other hand, as $l > j$, we have that $(l-j) > 0$, and so by choosing $t = |x|+ 2$, we get that the word $w_t$ has a prefix with more 1's than 0's: the prefix $x\cdot 0^n\cdot 1^{n+(t-1)\cdot(l-j)}$ has at most $|x|+n$ 0's and at least $n+|x|+1$ 1's. We have reached a contradiction.
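
The counting in the last step can be checked exhaustively for small parameters. This is only a sketch in Python: the helper names, the value `n = 3` and the bound on $|x|$ are hypothetical choices for illustration, and `d` stands for the cycle length $l-j$. It builds the prefix $x\cdot 0^n\cdot 1^{n+(t-1)(l-j)}$ with $t=|x|+2$ and confirms that it always has more 1's than 0's.

```python
from itertools import product


def excess_ones(word):
    """Number of 1s minus number of 0s in `word`."""
    return word.count("1") - word.count("0")


def violating_prefix(x, n, d):
    """Prefix x . 0^n . 1^(n + (t-1)*d) of w_t, with t = |x| + 2 and
    d = l - j the length of the pumped cycle (1 <= d <= n)."""
    t = len(x) + 2
    return x + "0" * n + "1" * (n + (t - 1) * d)


n = 3  # hypothetical number of states
for length in range(5):                      # all words x with |x| <= 4
    for bits in product("01", repeat=length):
        x = "".join(bits)
        for d in range(1, n + 1):
            # The prefix always has at least one more 1 than 0s.
            assert excess_ones(violating_prefix(x, n, d)) >= 1
```

The worst case is $x$ consisting only of 0's with $l-j=1$, where the excess is exactly one; this is why $t=|x|+2$ rather than $t=|x|+1$ is used.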

Bader Abu Radi