16

How large is the set of all Turing machines? I am confident it is infinite, but which infinite cardinality does it have?

Kevin
  • 483
  • 1
    As noted in answers and comments to answers, the definition of Turing machine is in terms of some finite sets (of symbols and states notably). But the class of finite sets (and even that of singleton sets) is a proper class, so it cannot be measured by any cardinal. If you want to talk of a set of Turing machines, you need to rein in the trivial freedom of choosing these base sets, for instance by requiring them to be initial segments of the natural numbers. Only then does the question make sense. – Marc van Leeuwen Mar 07 '15 at 13:47
  • @Marc That's a good point. I asked this question because I'm trying to solve a problem (specifically, this one), and in that problem, I am primarily interested in Turing machines that process strings encoding pairs of integers. – Kevin Mar 07 '15 at 20:01

4 Answers

25

By an informal argument, Turing machines roughly correspond to programs written in some programming language. Each program is a finite string in ASCII, Unicode, or binary (or some other finite alphabet).

We might imagine writing a naive program that outputs all possible strings in order of length (and in lexicographic order within each length), runs the compiler on each one, and throws it out if there's an error. Ordering by length first matters: plain lexicographic order over strings of unbounded length would never get past "a", "aa", "aaa", and so on. This program will eventually list every program (although "hello world" would take a very long time to appear). Thus we have a program that enumerates all programs, so the set of all Turing machines is countable.
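
The enumeration described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python's built-in `compile` stands in for "the compiler", the two-symbol alphabet `"1+"` is an arbitrary choice, and the names `all_strings`, `compiles`, and `all_programs` are hypothetical.

```python
from itertools import count, islice, product


def all_strings(alphabet):
    """Yield every finite string over `alphabet`, shortest first."""
    for length in count(0):
        # Within one length, product() yields strings in alphabet order.
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def compiles(source):
    """Stand-in compiler: accept a string iff Python parses it as an expression."""
    try:
        compile(source, "<candidate>", "eval")
        return True
    except (SyntaxError, ValueError):
        return False


def all_programs(alphabet):
    """Pair each accepted string with a natural number: 0, 1, 2, ..."""
    accepted = (s for s in all_strings(alphabet) if compiles(s))
    return enumerate(accepted)


# Both generators are infinite, so only a finite prefix is taken here.
first_programs = list(islice(all_programs("1+"), 5))
print(first_programs)
```

Every accepted string receives exactly one index, which is the pairing with the natural numbers that countability asks for. The filter only has to reject malformed strings; it never has to decide whether a program halts.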

Tac-Tics
  • 2,353
  • 1
    Are we sure that all programs are finite? Turing machines were conceived as having an infinite-length tape, and any program approximating a Turing machine needs to incorporate input as well - which can easily be infinite in the case of stream processing. (1948 essay, "Intelligent Machinery", Turing) – Gustav Bertram Mar 06 '15 at 11:13
  • 9
    @GustavBertram: Yes. Depending on your perspective, Turing machines may either always start with an empty tape or take a tape as input, but in the latter case the initial state of the tape is finite. You could presumably extend to allow infinite tapes as long as the initial contents are computable (i.e. representable by another Turing machine that starts with an empty tape and produces the initial tape on demand) but allowing an arbitrary infinite tape has no meaningful interpretation as a model of computation. – R.. GitHub STOP HELPING ICE Mar 06 '15 at 11:36
  • 1
    @GustavBertram Turing machines are finite in the sense that they have a finite number of states and use a finite alphabet, which means there is a finite number of possible transition functions. Transition functions are, basically, programs and they are what matters. – Bakuriu Mar 06 '15 at 13:56
  • Have you read Turing's original paper? One of the main conclusions is that your hypothetical "naive program" cannot exist: there is no Turing machine that can distinguish between "circle-free" (i.e. valid) machines (programs) and non-valid programs. But your program is unnecessary anyway: the fact that programs are finite strings in a finite alphabet means they are mappable to integers, and therefore countable. – Kyle Strand Mar 06 '15 at 18:35
  • ....Although on second thought, it's not clear that you're necessarily limiting yourself to "circle-free" machines, only to strings that appear to describe Turing machines of some sort. You should probably clarify that this is what you mean by "compiler" (a concept foreign to Turing, obviously). Either way, the "run through all strings and run some test on each one" argument is unnecessary. – Kyle Strand Mar 06 '15 at 18:42
  • 2
    @KyleStrand I think Tac-Tics' argument is that the compiler takes source code as input and outputs the machine that executes the same code. The programs that are thrown away are those with syntax errors, missing declarations, etc. (the kind of errors a compiler would catch anyway). – Davidmh Mar 06 '15 at 21:16
  • @Davidmh Right, hence my second comment. – Kyle Strand Mar 06 '15 at 21:22
  • @KyleStrand: Even if you do want to exclude non-halting programs, you can do it by dovetailing... – Micah Mar 06 '15 at 21:45
  • @Micah Do you mean that as the program runs, it should also dovetail execution of compiled programs, so that as they terminate, they can be added to the list? I guess that makes sense. The curious side-effect would be, of course, that for any given non-halting program, there will never be any definite point in time at which this "valid machine enumerator" would know for sure that it is non-halting, so every new non-halting machine would be added to an endlessly growing list of invalid programs that must be executed forever. – Kyle Strand Mar 06 '15 at 21:53
  • @R..: "No meaningful interpretation as a model of computation". Actually it is possible to come up with a meaningful interpretation - for instance, infinite tapes can be used to model a stream of input that is theoretically unbounded - such as a series of readings from a sensor, or the user typing interactively :) – psmears Mar 07 '15 at 13:30
  • @psmears: In that case I don't think you can say you're modeling a computation. Rather you're modeling physics. In any case the machine can only use a finite number of tape slots before it halts, so if you're only enumerating machines that halt (see above) all of the infinite-tape machines are equivalent to an appropriate finite-tape machine anyway. – R.. GitHub STOP HELPING ICE Mar 07 '15 at 15:10
  • @R.. Of course it is modelling a computation! There's no physics being modelled by the Turing machine :) The trouble with many simple models of computation (eg finite-input Turing machines, functions, ...) is that they are not very good at modelling simple programs like "repeat forever: input a number; add it to total; output running average" - which are entirely valid computations (and ones with properties we may want to verify using models). And the answer above doesn't suggest only enumerating halting machines - obviously you have to examine pre-halt states for this to be a useful model. – psmears Mar 07 '15 at 16:02
  • @R..: It's perfectly fair to only count finite-nonblank-tape Turing machines in the "what size is the set of all Turing machines" - as long as you're careful to define what you mean by "all" - but saying that there is "no meaningful interpretation as a model of computation" is too strong a statement, as there do exist valid and useful interpretations :) – psmears Mar 07 '15 at 16:13
21

Have you seen that you can encode Turing machines by natural numbers? It is done on the way to showing there is a universal Turing machine. Alternatively, each machine is a finite string of symbols from some finite alphabet, and there are only countably many such strings.
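
One way to see that the finite strings over a finite alphabet are countable is bijective base-k numeration, which pairs every string with exactly one natural number. The sketch below is not the encoding used in universal-machine constructions; the function names and the alphabet `"ab"` are assumptions made for illustration.

```python
def string_to_number(s, alphabet):
    """Map a string to a natural number; the empty string maps to 0."""
    k = len(alphabet)
    n = 0
    for ch in s:
        # Digits run from 1 to k, so no two strings share a number.
        n = n * k + alphabet.index(ch) + 1
    return n


def number_to_string(n, alphabet):
    """Inverse of string_to_number for the same alphabet."""
    k = len(alphabet)
    chars = []
    while n > 0:
        n, r = divmod(n - 1, k)
        chars.append(alphabet[r])
    return "".join(reversed(chars))


print([number_to_string(n, "ab") for n in range(7)])
```

Because the two functions are inverses of each other, the strings and the natural numbers are in one-to-one correspondence. The machine descriptions form a subset of the strings, so they are countable as well.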

Ross Millikan
  • 383,099
9

It depends on how you define "distinct" for Turing machines. In general, two (one-tape) Turing machines are not "distinct" if there is a component-wise bijection from one machine's 7-tuple to the other's. (See http://en.wikipedia.org/wiki/Turing_machine#Formal_definition for the 7-tuple I am referring to.) This leads to a countable number of Turing machines. If instead "distinct" is defined so that two machines count as the same only when their 7-tuples are identical, then there are uncountably many Turing machines, in the sense that no bijection with the natural numbers exists. In this case there would not be a "set" of all Turing machines; instead we would have a "class" of all Turing machines.
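
A rough sketch of the first notion of "distinct": if states are renamed to 0, 1, 2, ... in a fixed order, two machines that differ only in their state names end up with the same description. The representation here is a simplifying assumption, not the full 7-tuple: a machine is a dict of transitions plus a start state, the tape alphabet is held fixed, accepting states are omitted, and unreachable states are ignored. The name `canonical_form` is hypothetical.

```python
from collections import deque


def canonical_form(transitions, start, symbols):
    """Rename reachable states to 0, 1, 2, ... in breadth-first order.

    `transitions` maps (state, symbol) to (new_state, written_symbol, move).
    Every symbol used as a key in `transitions` must appear in `symbols`.
    """
    names = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for symbol in symbols:  # fixed symbol order makes the renaming unique
            step = transitions.get((state, symbol))
            if step is not None and step[0] not in names:
                names[step[0]] = len(names)
                queue.append(step[0])
    return frozenset(
        ((names[q], a), (names[q2], b, move))
        for (q, a), (q2, b, move) in transitions.items()
        if q in names
    )


# Two machines that differ only in what their states are called.
m1 = {("even", "1"): ("odd", "1", "R"), ("odd", "1"): ("even", "1", "R")}
m2 = {("p", "1"): ("q", "1", "R"), ("q", "1"): ("p", "1", "R")}
print(canonical_form(m1, "even", "1") == canonical_form(m2, "p", "1"))
```

After renaming, every machine is described using only natural numbers and a fixed alphabet, which is why this notion of "distinct" gives countably many machines.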

SE318
  • 503
  • 8
    The only way you could get an uncountable infinity of Turing machines is if you allow states or symbols to be chosen from an uncountable infinite set. If the set of tape symbols and the set of states are each required to be a subset of a countable infinite set of possible choices, then the set of possible Turing machines will also be a countable infinite set. – kasperd Mar 06 '15 at 08:35
  • @kasperd This is true. But, since any Turing machine has a finite number of states, people don't tend to be too fussy about what the "names" of those states are. So I think this answer makes a helpful point. – David Richerby Mar 06 '15 at 09:32
  • 5
    @David If names can be anything, there would still not be an uncountable set of all Turing machines, but rather a class of all Turing machines (larger than any set). – gmatht Mar 06 '15 at 13:41
  • @gmatht, I do believe it would be a class, not a set, but it would still be uncountable (in the sense that you cannot create a bijection from this class to the natural numbers) – SE318 Mar 06 '15 at 14:32
  • @gmatht's comment is nontrivial; please add it to the answer, SE318! – Kyle Strand Mar 06 '15 at 19:46
  • The answer is right in that it would be uncountable, but then the OP's question itself would be wrong, as it assumes there is a set. – gmatht Mar 07 '15 at 01:47
8

Tac-Tics's answer is almost correct, but the "naive program" argument is unnecessary and possibly incorrect (as per my comment).

The observation that Turing machines can be described by finite strings in a finite alphabet is sufficient: by mapping the letters of the alphabet to digits in some base (e.g. binary or, as Turing does, decimal without the use of digits 8 and 9), you can map each program to a distinct integer. Thus the set of all (valid or non-valid)[1] machines is mappable to a subset of the integers, so both the set of all machines and the (smaller) set of all valid machines are countable.

1: By "valid," I mean either Turing's "circle-free" concept or the later equivalent-but-inverted "halting" concept (introduced when Turing's work was reframed as the "halting problem").
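
A small sketch of the letter-to-digit mapping: each letter of a seven-letter description alphabet is replaced by one of the digits 1 to 7, and the result is read as a decimal integer. The specific table and the sample description string are assumptions modelled on the scheme mentioned above (digits 8 and 9 unused), and the function names are hypothetical.

```python
# Assumed table: seven description letters mapped to the digits 1..7.
DIGIT_FOR = {"A": "1", "C": "2", "D": "3", "L": "4", "R": "5", "N": "6", ";": "7"}
LETTER_FOR = {digit: letter for letter, digit in DIGIT_FOR.items()}


def description_number(description):
    """Read a non-empty description string as a decimal integer."""
    return int("".join(DIGIT_FOR[ch] for ch in description))


def description_from_number(n):
    """Recover the description string from its number."""
    return "".join(LETTER_FOR[d] for d in str(n))


sample = "DADDCRDAA;"  # illustrative description, not a claim about any real machine
print(description_number(sample))
```

Since the digit 0 is never used, no leading digit is lost when the string is read as a number, so distinct descriptions always give distinct integers. That injectivity is all the countability argument needs.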

  • I think your answer and Tac-Tics's answer may actually be identical. Here's why. When you cite "the observation that Turing machines can be described by finite strings in a finite alphabet," I'm not quite sure what you mean, but the context suggests that you're positing an injective map from the set $M$ of Turing machines to the set $C^*$ of finite strings over some finite alphabet $C$. Many methods of constructing such a map will produce one whose range $P$ is a proper subset of $C^*$. – Vectornaut Mar 07 '15 at 05:03
  • Tac-Tics is talking about the same map, but thinking of it as a bijective map from $P$ to $M$. It's well known that you can arrange for $P$ to be a computable subset of $C^*$. Tac-Tics's "compiler" is a machine that tests whether a given element of $C^*$ is in $P$. – Vectornaut Mar 07 '15 at 05:03
  • @Vectornaut Agreed. My point is that you don't actually need to hypothesize a decision machine to evaluate the mapping in order to recognize that the set of machine-descriptions is a subset of the set of strings in the language. – Kyle Strand Mar 07 '15 at 05:06
  • It's true that you don't need one, but you might want one. As Tac-Tics alludes, the process of verifying that an element of $C^*$ lies in $P$ and then producing the corresponding Turing machine is very familiar to many people: it's the process of compiling a program. The reverse process, decompiling, is much more esoteric. So, I think the decision to focus on the compiling direction is very understandable from a pedagogical point of view. – Vectornaut Mar 07 '15 at 05:12
  • @Vectornaut Fair enough, but Tac-Tics's answer is phrased so as to make it seem that the compiling process is part of the countability argument. It is not, or at least ought not to be. – Kyle Strand Mar 07 '15 at 08:42
  • If things are phrased carefully, I think the compiling process really ought to be part of the argument—and in a subtle way, it's part of your argument too. You end with the line, "Thus the set of all... machines is mappable to a subset of the integers." Tac-Tics starts with the same line, but backwards, saying essentially, 'A subset of the integers (computed by the compiler) is mappable to the set of all machines.' Tac-Tics discusses the subset at length, while you mention it only in passing, but in both cases it's there. – Vectornaut Mar 07 '15 at 22:56
  • @Vectornaut That's unnecessarily constructivist, as well as subtly circular. The recognition that all possible machines have representations as natural numbers necessarily precedes the "compiler" argument: the set of potential machine-descriptions being fed to the compiler must be a super-set of the complete set of machine-descriptions in order for the compiler to correctly enumerate all possible valid machines, and there is nothing about the nature of the compiler (or its feasibility) that has any bearing on whether this condition holds. – Kyle Strand Mar 08 '15 at 00:11
  • It might work as a helpful metaphor while trying to intuitively answer the question--e.g., one could ask, "do you believe that the set of all possible inputs for a compiler is countable? Why or why not?" But even there, it's clear that the solution to the initial question has to do with these inputs, i.e., the strings in the language, not with the compiler. – Kyle Strand Mar 08 '15 at 00:15