
There's a famous theorem that every infinite Turing-recognizable language has an infinite decidable subset. The standard proof of this result works by constructing an enumerator for the Turing-recognizable language, then including the first enumerated string in the decidable language, then the first subsequently enumerated string that is lexicographically larger than it, then the first string enumerated after that one that is lexicographically larger still, and so on. Since this set is infinite and is enumerated by a Turing machine in strictly increasing lexicographic order, it's decidable.
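As a minimal sketch, assuming the enumerator is modeled as a Python generator function yielding strings in arbitrary order (with `toy_enum` a made-up finite stand-in for illustration), the construction looks like this:

```python
def increasing_subset(enumerate_L):
    """Sketch of the standard construction: given a generator function
    `enumerate_L` (a stand-in for an enumerator) yielding the strings of
    an infinite recognizable language L in arbitrary order, yield a
    subsequence of L in strictly increasing string order.  The set of
    yielded strings is decidable: to test a string w, run this loop until
    it yields w (accept) or a string larger than w (reject)."""
    def less(a, b):
        # string order (shortlex): shorter strings first, ties alphabetical
        return (len(a), a) < (len(b), b)

    last = None
    for s in enumerate_L():
        if last is None or less(last, s):
            last = s
            yield s

# Toy finite enumerator for illustration only; a real enumerator never halts.
def toy_enum():
    yield from ["ba", "a", "abc", "b", "zzz"]

print(list(increasing_subset(toy_enum)))  # prints ['ba', 'abc', 'zzz']
```

Note that the decision procedure described in the docstring relies on the output being strictly increasing: once a string larger than $w$ appears, $w$ can never appear later.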

This construction works, but it doesn't seem to give a very "natural" example of an infinite decidable subset. In particular, the only way I can think of to describe the subset is to point at a specific enumerator for the language, define a recurrence relation from it, and then define the language from that recurrence relation.

Is there an alternative construction that produces an infinite decidable set from an infinite Turing-recognizable language that is less dependent on the particulars of how a specific enumerator runs?

Raphael
templatetypedef

2 Answers


Let $\Sigma$ be an alphabet, let $\mathcal{R}$ be the collection of all recognizable subsets (languages) of $\Sigma^{*}$ and let $\mathcal{D}$ be the collection of all decidable subsets of $\Sigma^{*}$.

Say that $n \in \mathbb{N}$ is a code for $L \in \mathcal{R}$ if the Turing machine encoded by the number $n$ (via some Gödel encoding of Turing machines) recognizes $L$. Let us write $L_n$ for the set that is recognized by the $n$-th Turing machine.

Similarly, say that $n$ is a code of $D \in \mathcal{D}$ if the $n$-th Turing machine is a decider for $D$. If $n$ is a code of a decider (not every number is!) then we write $D_n$ for the set it decides.

Consider the statement: "For every infinite $L \in \mathcal{R}$ there is an infinite $D \in \mathcal{D}$ such that $D \subseteq L$."

You are asking whether we can prove the statement by giving an effective procedure which assigns to every $L$ a corresponding $D$ in such a way that $D$ does not depend on the choice of the code of $L$. (Clearly, we can do this ineffectively by an application of the axiom of choice.) More precisely, the question seems to be whether there is a computable map $f$ such that

  1. if $n$ codes an infinite $L \in \mathcal{R}$ then $f(n)$ codes an infinite $D \in \mathcal{D}$ such that $D \subseteq L$, and
  2. if $L_n = L_m$ then $D_{f(n)} = D_{f(m)}$.

The answer is no, there is no such map: you must appeal to a particular recognizer for $L$ and obtain a $D$ which depends on the recognizer.

To prove that this is really so we appeal to Rice's theorem. Suppose we had such an $f$. There exist $n, m, a \in \mathbb{N}$ such that $a \in D_{f(n)}$ and $a \notin D_{f(m)}$: take $n$ and $m$ coding two disjoint infinite recognizable languages (say, the strings of even length and the strings of odd length), pick any $a \in D_{f(n)} \subseteq L_n$, and note that $a \notin D_{f(m)}$ because $D_{f(m)} \subseteq L_m$ is disjoint from $L_n$. The subset $S \subseteq \mathcal{R}$ defined by $$L \in S \iff \exists n \,.\, L = L_n \land a \in D_{f(n)}$$ is a non-trivial decidable subset of $\mathcal{R}$, which violates Rice's theorem.

Andrej Bauer

Let us say we have a Turing-recognizable language $L$ which is not decidable, and let its recognizer be $R$. Say that by some method you obtain a decidable subset $D$ of $L$. As $D$ is decidable, we can construct a Turing machine $T$ which, given an input $w \in D$, prints the difference between the indices of $w$ and of the next lexicographically larger string that belongs to $D$ (where indices are assigned to all strings in $\Sigma^{*}$ listed in lexicographic order).

Now we build an enumerator $E$ for $L$. First, $E$ prints the lexicographically smallest string in $D$; call it $w_1$. It then dovetails $R$ over the strings that do not belong to $D$. Once $T(w_1) - 1$ strings lexicographically larger than $w_1$ have halted when run on $R$, it prints all the strings that have halted while dovetailing so far (some of them might be smaller than $w_1$), followed by $w_2$, the second string of $D$ in lexicographic order. $E$ repeats the process in a similar fashion indefinitely.

It is clear that when we follow your approach using $E$ we get $D$. The only difference is that instead of always taking the immediately next lexicographically larger string printed by $E$, we have a computable function $T$ which tells us how many lexicographically larger strings to skip before taking the next one into $D$.
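As a sketch, the gap machine $T$ might look like this in Python, where `in_D` is a hypothetical stand-in for a decision procedure for $D$, and strings are indexed in shortlex (length-then-alphabetical) order, the standard well-ordering of $\Sigma^{*}$:

```python
from itertools import count, product

def shortlex(alphabet):
    """Yield every string over `alphabet` in shortlex order: '', 'a', 'b', 'aa', ..."""
    for n in count():
        for tup in product(alphabet, repeat=n):
            yield "".join(tup)

def gap_to_next(in_D, w, alphabet="ab"):
    """Sketch of the machine T from the text: given a (hypothetical) decider
    `in_D` for the decidable set D and a string w in D, return the difference
    between the shortlex indices of w and of the next larger member of D.
    Halts only when D contains a string larger than w, which holds whenever
    D is infinite."""
    seen_w = False
    gap = 0
    for s in shortlex(alphabet):
        if seen_w:
            gap += 1
            if in_D(s):
                return gap
        elif s == w:
            seen_w = True

# Example with D = strings of even length over {a, b}:
print(gap_to_next(lambda s: len(s) % 2 == 0, ""))    # prints 3 ('' -> 'aa')
print(gap_to_next(lambda s: len(s) % 2 == 0, "aa"))  # prints 1 ('aa' -> 'ab')
```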

advocateofnone