The way I see it, talking about slowdowns in simulations of a specific Turing machine $M_0$ doesn't make much sense. I could always just run $M_0$ itself and call this a simulation, which results in no slowdown at all. I could also hardwire the code of $M_0$, and in case the input is $M_0$, use some better algorithm (as D.W. did in his answer); a sketch of this dispatching trick is given below.
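To illustrate the "hardwiring" trick, here is a minimal sketch in Python. The names `M0_ENCODING`, `fast_algorithm_for_M0`, and `step_by_step_simulation` are purely illustrative placeholders, not part of any standard construction:

```python
# Illustrative sketch only: M0_ENCODING and both helpers are placeholders.

M0_ENCODING = "<some fixed encoding of M0>"

def fast_algorithm_for_M0(x: str) -> str:
    # Here we would run a better algorithm for the problem M0 solves.
    return "answer computed directly, with no simulation overhead"

def step_by_step_simulation(encoded_machine: str, x: str) -> str:
    # Generic simulation of an arbitrary machine, paying some slowdown.
    return "answer computed by generic simulation"

def simulate(encoded_machine: str, x: str) -> str:
    # If the input machine happens to be the hardwired M0, dispatch to the
    # special-purpose routine; otherwise fall back to generic simulation.
    if encoded_machine == M0_ENCODING:
        return fast_algorithm_for_M0(x)
    return step_by_step_simulation(encoded_machine, x)
```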
The more interesting question here is (in my opinion, at least): what is the optimal slowdown achievable when simulating an arbitrary Turing machine $M$ on some input $x$, asymptotically in terms of $|x|$ and the length of the description of $M$?
We look at all possible inputs and examine the worst-case slowdown (perhaps for some specific machine $M_0$ you can do a better job, but here we consider the worst-case running time).
More formally, let $\mathcal{U}(\langle M\rangle,x)$ denote a universal Turing machine, which takes as input an encoding of a machine $M$ and some string $x$, and outputs $M(x)$, or does not halt in case the computation of $M$ on $x$ does not halt. We know that we can implement $\mathcal{U}$ in such a way that if the computation $M(x)$ requires time $T$, then $\mathcal{U}(\langle M\rangle,x)$ requires time $O\left(T\log T\right)$. Here the $O$ notation hides constants which depend on the number of states and the alphabet size of $M$ (but are independent of $|x|$). Your question then translates to whether we can implement $\mathcal{U}$ so that the computation of $\mathcal{U}(\langle M\rangle,x)$ requires only $O(T)$ time.
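To make the simulation concrete, here is a minimal sketch (in Python, with a hypothetical transition-table encoding) of a direct step-by-step simulator. Each simulated step costs work that depends on the description of $M$ but not on $|x|$, which is the intuition behind the constants hidden in the $O$ notation; the extra $\log T$ factor only shows up when this loop is itself carried out by a fixed Turing machine rather than a random-access program:

```python
from typing import Dict, Tuple

# (state, read symbol) -> (new state, written symbol, head move in {-1, 0, +1})
Transition = Dict[Tuple[str, str], Tuple[str, str, int]]

def simulate_tm(delta: Transition, start: str, halting: set, x: str,
                blank: str = "_", max_steps: int = 10**6) -> str:
    """Simulate the encoded machine on input x, step by step."""
    tape = dict(enumerate(x))       # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        if state in halting:
            return state            # reached a halting state
        sym = tape.get(head, blank)
        state, new_sym, move = delta[(state, sym)]
        tape[head] = new_sym
        head += move
    raise RuntimeError("step budget exceeded (the machine may not halt)")

# Example machine: flip 0s and 1s, then halt in state qa at the first blank.
delta = {
    ("q0", "0"): ("q0", "1", +1),
    ("q0", "1"): ("q0", "0", +1),
    ("q0", "_"): ("qa", "_", 0),
}
print(simulate_tm(delta, "q0", {"qa"}, "0110"))  # -> "qa"
```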
It seems that for single-tape machines it is not known whether this $\log T$ factor is necessary; however, for machines with $k\ge 2$ tapes we can avoid it (proved by Fürer, 1982). See this post by Kaveh for a detailed discussion and related quotes.