
One of the ways of defining the set of recursive functions is first to define a language $L$ by induction in the following way (a sketch of this grammar as a data type follows the list):

  • $\mathsf{Z}^1 \in L$;
  • $\mathsf{S}^1 \in L$;
  • $\mathsf{P}^n_k \in L$ for all $n, k$ with $n \ge 1$, $1 \le k \le n$;
  • if $F^m, G_1^n, \dotsc, G_m^n \in L$, then $\mathsf{C}^n[F^m, G_1^n, \dotsc, G_m^n] \in L$;
  • if $F^n, G^{n+2} \in L$ then $\mathsf{R}^{n+1}[F^n, G^{n+2}] \in L$;
  • if $F^{n+1} \in L$, then $\mathsf{M}^n[F^{n+1}] \in L$.
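
To fix notation for the sketches below, here is the grammar rendered as a Haskell data type. This is my own rendering, not anything from the texts: the constructor names mirror the symbols above, and the arity superscripts are not enforced by the types.

```haskell
-- The term language L; arity bookkeeping is deliberately omitted.
data Term
  = Z              -- Z^1: the constant zero
  | S              -- S^1: the successor
  | P Int Int      -- P^n_k: the k-th of n projections
  | C Term [Term]  -- C^n[F^m, G_1^n, ..., G_m^n]: composition
  | R Term Term    -- R^{n+1}[F^n, G^{n+2}]: primitive recursion
  | M Term         -- M^n[F^{n+1}]: minimalization
  deriving Show
```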

Then one defines a function (we could call it an interpretation) which associates to every $F^n \in L$ a partial function $i(F^n) : \mathbb{N}^n \rightarrow \mathbb{N}$. The interpretation is such that, intuitively, $\mathsf{Z}^1$ corresponds to the constant zero, $\mathsf{S}^1$ to the successor, $\mathsf{P}^n_k$ to the $k$-th projection, and then we have composition, primitive recursion and minimalization. One says that a partial function $f : \mathbb{N}^n \rightarrow \mathbb{N}$ is recursive if there is some $F^n \in L$ such that $f = i(F^n)$. (This way is particularly useful if one wants to deal precisely with intensional and extensional equalities.) Let's just call $i(L)$ the set of recursive functions in the sense of this definition.
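
For concreteness, here is a minimal sketch of an eager interpretation in Haskell, with an undefined value modeled by non-termination of the interpreter itself. Two conventions are my own choices, since the definitions leave them open: in $\mathsf{R}$ the recursion argument comes first, with $h(0, \vec x) = F(\vec x)$ and $h(y+1, \vec x) = G(h(y, \vec x), y, \vec x)$, and $\mathsf{M}$ returns the least $y$ such that $F(\vec x, y) = 0$.

```haskell
-- e: an eager interpretation.  "Undefined" = this code fails to terminate.
e :: Term -> [Integer] -> Integer
e Z        _   = 0
e S        [x] = x + 1
e (P _ k)  xs  = xs !! (k - 1)
e (C f gs) xs  =
  let vs = [e g xs | g <- gs]     -- the values of all the arguments...
  in  foldr seq (e f vs) vs       -- ...each one forced before f runs: eager
e (R f g) (y0 : xs) = go y0
  where go 0 = e f xs                    -- h(0, xs) = F(xs)
        go y = let prev = go (y - 1)     -- h(y, xs) = G(h(y-1, xs), y-1, xs),
               in  prev `seq` e g (prev : (y - 1) : xs)  -- forcing h(y-1, xs)
e (M f) xs = go 0
  where go y | e f (xs ++ [y]) == 0 = y  -- least zero of F(xs, y)
             | otherwise            = go (y + 1)
e _ _ = error "arity mismatch"
```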

My problem is that I can see at least two ways in which we could give the interpretation of $F^n$. The one I was taught corresponds to what I think computer scientists would call eager evaluation. But we could also define the interpretation in such a way that the evaluation is lazy. The definitions are quite long, but let me illustrate the key difference between the two: if $e$ stands for eager and $l$ stands for lazy, we would have for example that $$ e(\mathsf{C}^1 [\mathsf{P}_1^2, F^1, G^1])(x) = \left\{\begin{array}{l l} e(F^1)(x) & \text{if } e(G^1)(x) \text{ is defined,} \\ \text{undefined} & \text{otherwise.} \end{array}\right.$$ $$ l(\mathsf{C}^1 [\mathsf{P}_1^2, F^1, G^1])(x) = l(F^1)(x) \quad\text{for all } x $$ i.e., one can treat the projections in two ways, either requiring that all arguments be defined in order to return the $k$-th, or not – and consequently the same applies to other functions which don't really need all their arguments (this could be made precise, but it's very technical and not so interesting).
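
Under the same conventions, a sketch of the lazy interpretation differs only in the clause for composition, which passes the argument computations along unevaluated; Haskell's own call-by-need supplies the thunks, so a projection then demands only the argument it returns. (Following the example above, only composition is treated lazily here; $\mathsf{R}$ and $\mathsf{M}$ behave exactly as in $e$, which is another point the full definitions would have to fix.) The term `loop` below is undefined everywhere and makes the difference observable.

```haskell
-- l: a lazy interpretation; only the C clause differs from e.
l :: Term -> [Integer] -> Integer
l Z        _   = 0
l S        [x] = x + 1
l (P _ k)  xs  = xs !! (k - 1)           -- demands only the k-th argument
l (C f gs) xs  = l f [l g xs | g <- gs]  -- arguments stay unevaluated thunks
l (R f g) (y0 : xs) = go y0
  where go 0 = l f xs
        go y = let prev = go (y - 1)
               in  prev `seq` l g (prev : (y - 1) : xs)
l (M f) xs = go 0
  where go y | l f (xs ++ [y]) == 0 = y
             | otherwise            = go (y + 1)
l _ _ = error "arity mismatch"

-- loop = M^1[C^2[S, P^2_1]] searches for a zero of x + 1: undefined everywhere.
loop :: Term
loop = M (C S [P 2 1])

-- e (C (P 2 1) [S, loop]) [5]   -- diverges: the eager semantics runs loop
-- l (C (P 2 1) [S, loop]) [5]   -- = 6: the thunk for loop is never demanded
```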

It is not obvious, to me, that the recursive functions defined using the eager evaluation are exactly the same as those defined using the lazy evaluation. I can't find any reference for this, because most texts adopt the eager evaluation without saying much about their choice (and some are not quite clear about the interpretation either, perhaps because they conflate functions with symbols for functions). So, my question is,

Why is $e(L) = l(L)$?

I think that $e(L) \subseteq l(L)$ should be simple, but of course we can't just say that if $f = e(F^n)$, then $f = l(F^n)$. By contrast, the other inclusion seems quite complicated: in a sense, what is needed is a way to emulate parallel computation.
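
For whatever it is worth, here is the shape of the repair I would try for $e(L) \subseteq l(L)$, as a sketch in the toy Haskell setting above, under two extra assumptions: every composition has at least one argument (as the grammar guarantees), and the lazy $\mathsf{R}$ still inspects its recursion argument to choose between its two equations. With the conventions above, composition is the only source of unevaluated arguments, so it suffices to make every composition also add $0 = \mathrm{dom}(G_1) + \dotsb + \mathrm{dom}(G_m)$ onto its result, where $\mathrm{dom}(x) = 0$ whenever $x$ is defined and is undefined otherwise:

```haskell
-- dom: l(dom)(x) = 0 if x is defined, undefined otherwise.  Built from
-- h(0, y) = 0 and h(x+1, y) = h(x, y), with dom(x) = h(x, x); even the lazy
-- semantics must evaluate x here, to decide which equation applies.
dom :: Term
dom = C (R Z (P 3 1)) [P 1 1, P 1 1]

-- plusT: plus(b, a) = a + b, by recursion on the first argument.
plusT :: Term
plusT = R (P 1 1) (C S [P 3 1])

-- lazify F: guard every composition so that its lazy value also waits for all
-- the arguments; then l(lazify F) = e(F) (wastefully, since every argument is
-- now evaluated twice).
lazify :: Term -> Term
lazify (C f gs) = C plusT [checks, C (lazify f) gs']
  where gs'    = map lazify gs
        checks = foldr1 (\a b -> C plusT [a, b]) [C dom [g] | g <- gs']
lazify (R f g) = R (lazify f) (lazify g)
lazify (M f)   = M (lazify f)
lazify t       = t
```

With the toy terms above, `l (lazify (C (P 2 1) [S, loop])) [5]` diverges, as the eager semantics demands.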

I know that in order to give a proof one needs to see the explicit definitions of $e$ and $l$, but I hope that this problem has already been dealt with somewhere, and that someone could point me to good references. Anyway, if it's really necessary I'll copy them here.


1 Answer


When the partial recursive functions are constructed in the way you sketch, one is always assuming the eager semantics. This is easier to specify (and a somewhat carefully written text should define composition in sufficient detail that it is clear that it is eager), and, more importantly, it is easier to convince oneself that the computation of a function value in the eager semantics ought to be realizable as a mechanical process, given the usual assumptions of infinite resources, etc.

However, it is well known in computer science that lazy and eager functional languages have the same strength. What can be expressed in one can be expressed in the other.

The proof of this is by simulation: We can define a lazy semantics for the language, and then program an interpreter that implements this semantics in the eager language. Thus, if we have any specification of a function, what it will do in the lazy semantics is also realized by some program in the eager semantics, namely the interpreter, combined with the function specification using the $s$-$m$-$n$ theorem.
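
To illustrate the shape such an interpreter can take in the toy Haskell setting from the question (with the Gödel numbering and the $s$-$m$-$n$ plumbing elided): give the lazy evaluator a step budget, making it a total function, which is the kind of function the eager calculus unproblematically represents; suspended arguments become explicit closures.

```haskell
-- evalB n t args: run the (composition-)lazy semantics with a step budget n.
-- Nothing means "no verdict within this budget", so evalB is total; arguments
-- are suspended as closures and forced only where the semantics demands them.
data Cl = Num Integer | Cl Term [Cl]

evalB :: Integer -> Term -> [Cl] -> Maybe Integer
evalB n _ _ | n <= 0 = Nothing
evalB n t args = case t of
    Z       -> Just 0
    S       -> (+ 1) <$> forceB (n - 1) (head args)
    P _ k   -> forceB (n - 1) (args !! (k - 1))    -- touches only one argument
    C f gs  -> evalB (n - 1) f [Cl g args | g <- gs]
    R f g   -> do y <- forceB (n - 1) (head args)  -- the recursion argument
                  let xs = tail args
                      go 0 = evalB (n - 1) f xs
                      go i = do prev <- go (i - 1)
                                evalB (n - 1) g (Num prev : Num (i - 1) : xs)
                  go y
    M f     -> let go y | y > n     = Nothing      -- search cut off by budget
                        | otherwise =
                            do v <- evalB (n - 1) f (args ++ [Num y])
                               if v == 0 then Just y else go (y + 1)
               in go 0
  where
    forceB _ (Num v)   = Just v
    forceB m (Cl g vs) = evalB m g vs
```

With the question's toy terms, `evalB 10 (C (P 2 1) [S, loop]) [Num 5]` already returns `Just 6`, while no budget ever returns a verdict for `loop` itself.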

Conversely, we can use the lazy language to program an interpreter for the eager language. So the set of functions $\mathbb N\to\mathbb N$ that can be expressed in the two formalisms is the same.

With just a bit of ingenuity one can even make interpreters for the lazy and eager semantics whose behavior is independent of whether the interpreter itself is run under the eager or lazy semantics.

The classical reference is G. D. Plotkin, "Call-by-name, call-by-value and the $\lambda$-calculus", Theoretical Computer Science 1(2), 125–159 (1975). As the title says, it uses the lambda calculus rather than recursion equations as its formal setting. I can't offhand cite a source where this is done explicitly for recursion equations in the kind of bare-bones foundational context mathematicians like, but actually doing it would be on the level of a small early project for a graduate student in theoretical CS.

(There are plenty of texts that discuss implementation techniques for lazy evaluation at a more abstract level. Implementing them in any desired programming language, such as the recursion calculus you use here, is just a matter of programming.)


Added later: What I claim to be an early graduate-student project above is just writing the interpreter. Actually proving, down to the gory details, that it implements the lazy semantics would be a formidable project for the kind of interpreter an average graduate student would produce. Doing so would certainly be possible, and some "natural" interpreters have machine-verifiable correctness proofs. However, if the goal is to produce a concrete proof of lazy/eager equivalence, I would probably structure the interpreter with such a proof in mind.

For example, the interpreter could consist of a brute-force search for a proof that the lazy semantics entails that the result must be such-and-such. Such an interpreter would be hysterically inefficient, but verifying that it is correct would be comparable to verifying that the details in the proof of Gödel's incompleteness theorem work as the explanations claim they do.
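
In the toy setting sketched above, the budgeted evaluator makes this concrete: the "witness" is simply a budget at which the budgeted lazy evaluation commits to a value, and the interpreter is the brute-force search for one.

```haskell
-- The lazy value recovered by unbounded search over budgets: hysterically
-- inefficient, since each candidate budget reruns the evaluation from scratch.
lazyVia :: Term -> [Integer] -> Integer
lazyVia t xs = head [v | n <- [1 ..], Just v <- [evalB n t (map Num xs)]]
```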


By contrast, the other inclusion seems quite complicated: in a sense, what is needed is a way to emulate parallel computation.

Note that the usual concept of "laziness" does not include parallel computation, only postponing the evaluation of function arguments until we know they're needed. Since this is sufficient to handle your example with composition of a projection function, I've assumed that is what you meant.
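
A two-line Haskell illustration of that distinction: lazy evaluation postpones arguments but remains sequential, so it does not provide a "parallel or".

```haskell
-- (||) is lazy in its second argument, yet must always force its first:
demo1, demo2 :: Bool
demo1 = True || undefined   -- True: the second argument is never demanded
demo2 = undefined || True   -- bottom: no sequential definition gives "parallel or"
```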

However, if you're more ambitious than that and want, for example $$ l(\mathsf{C}^1 [H^2, F^1, \mathsf{P}^1_1])(x) = \begin{cases} y & \text{when }l(H^2)(z,x)=y\text{ for all } z\in\mathbb N \\ l(H^2)(l(F^1)(x), x) & \text{otherwise} \end{cases} $$ then this lazy semantics is strictly stronger than the eager semantics.

For example, such a semantics would declare the halting predicate for Turing machines to be solvable: Let $H^2(n,x)$ be the function that decides whether $x$ is a description of a Turing machine that halts in $n$ steps or less (this is primitive recursive), and let $F^1$ be a computation that doesn't terminate. Then the function above would terminate iff the Turing machine $x$ doesn't halt, so HALT and its complement would both be r.e.