7

We know that the Y-combinator is defined as: $$\text{Y}:=\lambda f.(\lambda x.f(xx))(\lambda x.f(xx))$$

Wikipedia says: $$\text{Y}:=\text{S(K(SII))(S(S(KS)K)(K(SII)))}$$

Now the question is: What logical steps can we take to convert the first definition to the second?

While it is easy to show the equivalence between the two definitions, finding how the first definition can motivate and lead to the second definition is, in my opinion, a tricky task. I have added my proof as an answer, but all other ideas and suggestions are welcome.


Note: This question was marked as a duplicate of another. However, in my opinion, that question discusses the intuition behind the $\lambda$-calculus definition of the Y-combinator, and nowhere does it discuss the combinatory representation. My question, on the other hand, asks specifically about the conversion from the $\lambda$-representation to the combinatory representation.

I believe these to be two very different questions, since the first one is “restricted” to the Y-combinator, while my question can (in essence) be generalized to proving equivalence between any lambda term and its respective combinatory form.
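For concreteness, the general conversion in question is usually done by a bracket-abstraction algorithm. Below is a minimal Python sketch of the naive $S/K/I$ translation, with an ad hoc tuple encoding of terms; the $\eta$-optimized variant of the algorithm yields shorter results, such as the Wikipedia form above.

```python
# A sketch of the naive bracket-abstraction translation from lambda terms to
# S/K/I terms. Terms are tuples: ('var', name), ('lam', name, body),
# ('app', fun, arg). The encoding is ad hoc, for illustration only.

def free_in(v, t):
    """Is variable v free in the lambda-free term t?"""
    if t[0] == 'var':
        return t[1] == v
    return free_in(v, t[1]) or free_in(v, t[2])    # 'app' case

def abstract(v, t):
    """[v] t : a combinator term e with e v = t (t must be lambda-free)."""
    if t == ('var', v):
        return ('var', 'I')                        # [v] v       = I
    if not free_in(v, t):
        return ('app', ('var', 'K'), t)            # [v] t       = K t
    return ('app',                                 # [v] (t1 t2) = S ([v]t1) ([v]t2)
            ('app', ('var', 'S'), abstract(v, t[1])),
            abstract(v, t[2]))

def translate(t):
    """Remove all lambdas, innermost first."""
    if t[0] == 'var':
        return t
    if t[0] == 'app':
        return ('app', translate(t[1]), translate(t[2]))
    return abstract(t[1], translate(t[2]))         # 'lam' case

f, x = ('var', 'f'), ('var', 'x')
S, K, I = ('var', 'S'), ('var', 'K'), ('var', 'I')
inner = ('lam', 'x', ('app', f, ('app', x, x)))    # λx. f (x x)
SII = ('app', ('app', S, I), I)
# λx. f (x x)  ↦  S (K f) (S I I), matching the derivations in the answers
assert translate(inner) == ('app', ('app', S, ('app', K, f)), SII)
```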

Soham Saha
  • 2,356
  • 3
    The first definition "motivates" the second because it is what you get if you apply the standard translation of the $\lambda$-calculus into combinatory logic. So I don't really understand what you are asking. – Rob Arthan Feb 29 '24 at 20:50
  • @RobArthan actually I have only started learning about lambda calculus for a few months. I was unaware of a "standard translation". But thanks for your insight, I will surely try to read further about the topic. – Soham Saha Mar 01 '24 at 08:13
  • Sorry if I sounded a bit harsh. You might like to look at the original work of Schönfinkel who invented combinatory logic. I think the motivation was to introduce combinators to remove the need for bound variables in logic along the same lines as the introduction of functionals (like the integration and differentiation operators) that remove the need for bound variables in analysis. – Rob Arthan Mar 01 '24 at 20:36
  • There's really no need to say sorry. I know that I am inexperienced in this field, but I would like to study further. My current understanding is that the functional structure allows us to think about what is being done at a general level, not only to a particular variable, and also allows us to pass functions as arguments. Is this idea correct? @RobArthan – Soham Saha Mar 02 '24 at 05:29
  • @RobArthan also, I am still a bit confused about why removing bound variables gives us an advantage. How is it different from writing with bound variables while permitting functions to be first-class citizens? Could you clarify how CL is better than lambda calculus in some respects (other than removing the annoying $\alpha$-conversions)? – Soham Saha Mar 02 '24 at 05:40
  • 1
    Not having bound variables makes substitution of terms for free variables completely unproblematic, whereas if you are substituting inside the scope of a $\lambda$, you need to take special measures to avoid variable capture problems. I admit that the reduction to combinatory logic can make things harder to understand (your example with $Y$ becomes even worse if you remove $I$ using $I = SKK$). I don't think either of the two formalisms is better or worse than the other: they are just two interesting ways of approaching the concept of an abstract function. – Rob Arthan Mar 02 '24 at 21:20
  • @RobArthan understood. Thanks a lot. – Soham Saha Mar 03 '24 at 02:47
  • Going from the first definition to the second is basically applying the deduction theorem. It is a standard algorithmic technique. – DanielV May 02 '25 at 05:24
  • @DanielV Actually, I was unaware of the standard conversion technique at that time, as this previous comment mentions. Still, thanks for mentioning the point. – Soham Saha May 02 '25 at 05:53

4 Answers

4

Let's define $$\text{E}=\lambda\text{x. f (x x)}$$ which leads to:

$$ \begin{align*} \text{E x}&=\text{f (x x)}\\ &=\text{f (I x (I x))}\\ &=\text{f (S I I x)}\\ &=\text{(K f x) (S I I x)}\\ &=\text{S (K f) (S I I) x}\\ &=\text{(K S f) (K f) (S I I) x}\\ &=\text{S (K S) K f (S I I) x}\\ &=\text{S (K S) K f (K (S I I) f) x}\\ &=\text{S (S (K S) K)(K (S I I)) f x}\\ \therefore \text{ E}&=\text{S (S (K S) K)(K (S I I)) f}\\ &=\text{T f}\qquad\text{(letting }\text{T}:=\text{S (S (K S) K)(K (S I I))}\text{)} \end{align*} $$

Now $\text{Y}=\lambda\text{f. E E}$, so: $$\begin{align*} \text{Y f}&=\text{E E}\\ &=\text{T f (T f)}\\ &=\text{S T T f}\\ \therefore\text{ Y}&=\text{S T T}\\ &=\text{S S I T}\\ &=\text{S S I (S (S (K S) K)(K (S I I)))}\\ &=\text{S (K (S I I)) (S (S (K S) K)(K (S I I)))} \end{align*}$$

Note: See this for why $\text{SSI}$ and $\text{S(K(SII))}$ are equivalent.
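As a sanity check, the derived $\text{T}$ can be tested by reading $\text{S}$, $\text{K}$, $\text{I}$ as Python closures, with the free variables $f$ and $x$ as small string-building objects (a sketch; $\text{S T T}$ itself cannot be applied to $f$ this way, since $\text{Y f}$ loops under strict evaluation):

```python
# S, K, I as closures; free variables build up the resulting term as a string.
S = lambda a: lambda b: lambda c: a(c)(b(c))
K = lambda a: lambda b: a
I = lambda a: a

class Var:                                   # a free variable; application
    def __init__(self, s): self.s = s        # just records the term as a string
    def __call__(self, y): return Var('(%s %s)' % (self.s, y.s))

T = S(S(K(S))(K))(K(S(I)(I)))                # T = S (S (K S) K) (K (S I I))
f, x = Var('f'), Var('x')
assert T(f)(x).s == '(f (x x))'              # T f x = f (x x), i.e. E x for E = T f
```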

Soham Saha
  • 2,356
2

Define $D = S I I$, $B = S (K S) K$ and $V = S B (K D)$. Then, your definition is $Y = S (K D) V$. Since $D x = S I I x = I x (I x) = x x$, then, when applied to $f$, this is equivalent to $$S (K D) V f = K D f (V f) = D (V f) = V f (V f) = S V V f.$$ Thus, under the η-rule, $$S (K D) V = λf·S (K D) Vf = λf·S V V f = S V V.$$ Actually, if you go with strict $SKI$ abstraction, then $S V V$ is what the article should be citing, not $S (K D) V$.

First, $$f(x x) = K f x (D x) = S (K f) D x\quad⇒\quadλx·f(x x) = λx·S (K f) D x = S (K f) D.$$ Second, $$S (K f) = K S f (K f) = S (K S) K f = B f,\quad D = K D f.$$ Therefore $$S (K f) D = B f (K D f) = S B (K D) f = V f.$$ Thus, it follows that $$λf·(λx·f (x x))(λx·f (x x)) = λf·(V f)(V f) = λf·S V V f = S V V.$$
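The finite identities here, $D\,x = x\,x$ and $V\,f\,x = f(x\,x)$, are easy to sanity-check by reading $S$, $K$, $I$ as Python closures and free variables as small string-building objects (a sketch; the full $Y = S\,V\,V$ cannot be applied to $f$ under strict evaluation):

```python
S = lambda a: lambda b: lambda c: a(c)(b(c))
K = lambda a: lambda b: a
I = lambda a: a

class Var:                                  # a free variable; application
    def __init__(self, s): self.s = s       # records the resulting string
    def __call__(self, y): return Var('(%s %s)' % (self.s, y.s))

D = S(I)(I)                                 # D x     = x x
B = S(K(S))(K)                              # B a b c = a (b c)
V = S(B)(K(D))                              # V f x   = f (x x)

f, x = Var('f'), Var('x')
assert D(x).s == '(x x)'
assert V(f)(x).s == '(f (x x))'
```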

Under the combinator engine Combo, which I put up on GitHub, the abstraction algorithm uses a wider range of combinators, including $D$ and $B$. It will yield $S \_0 \_0$, where $\_0 = C B D$, since $C a b = S a (K b)$ (under the η-rule), where $C = λxλyλz·x z y$. Therefore, $\_0 = V$. If you run Combo on $V = S B (K D)$ with the extensionality axiom (the η-rule) turned on, it will give you $C B D$.

If you run it on $S (C B D) (C B D)$ with extensionality turned on, it will block and report it as a "cyclic term", recognizing that it reduces as $Y = λf·Y f = λf·f (Y f)$, which leads to an infinite reduction. If I get around to upgrading Combo to generate, accept and process rational infinite lambda terms, then it might eventually be able to state that $Y$ has $λf·f(f(f(⋯)))$ as a normal form, establishing this result in the same finite number of steps that it currently takes to recognize the cyclic term.

NinjaDarth
  • 681
  • 3
  • 10
2

The $Y$ combinator is defined as: $$\text{Y}\ :=\ \lambda f. (\lambda x.f(xx))\ (\lambda x.f(xx))$$

To convert it to SKI combinators, first apply lambda lifting: $$ \lambda x.f(xx) = (\lambda f.\lambda x.f(xx))\ f\ =:\ H f$$ Writing in combinator notation, $$ H f x = f (x x) $$ $$ Y f = H f (H f) = f(Hf(Hf)) $$ Now we can convert the $Y$ and $H$ definitions by matching the right-hand sides of the combinator equations

$\qquad\begin{align} I a &= a \\ K a b &= a \\ S a b c &= a c (b c) \\ S I I c &= I c (I c) = c c \end{align}$

then,

$\qquad\begin{align} H f x &= f (x x) = K f x (SII x) \\ &= S(Kf)(SII)x \\ &= S(KS)Kf(K(SII)f)x \\&= S(S(KS)K)(K(SII))fx \end{align}$

$\qquad\begin{align} \ \ Y f &= Hf(Hf) = SII(Hf) \\&= K(SII)f(Hf) \\&= S(K(SII))H f \end{align}$

$\qquad\begin{align} \ \ Y f &= Hf(Hf) = SHHf \\&= SSIHf \end{align}$

Substituting the $H$ definition gives us two possibilities, after $\eta$-contraction:

$$ Y = S(K(SII))(S(S(KS)K)(K(SII))) $$ $$ Y = SSI(S(S(KS)K)(K(SII))) $$

Applying the $S$ rule, the first definition can be further contracted to

$$ Y = SS(S(S(KS)K))(K(SII)) $$

which seems to be the shortest encoding of $Y$, also when counted in $SK$-letters only, since each $I$ counts as three (the explicit $SK$-encoding of $I$ follows below).

Working with just the $SKI$ combinators can quickly become unwieldy. Adding more basic combinators helps:

$\quad\begin{align} Babc &= a(bc) = Kac(bc) = S(Ka)bc \\ Cabc &= acb = ac(Kbc) = Sa(Kb)c = S(K(Sa))Kbc \\ Wab &= abb = ab(SKab) = Sa(SKa)b = SS(SK)ab \\ &= abb = ab(Ib) = SaIb = CSIab = SS(KI)ab \\ Ub &= bb = Ibb = WIb = Ib(Ib) = SIIb \\ Ia &= a = Ka(Ka) = SKKa = SKSa \end{align}$
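The $SKI$ encodings of these extra combinators can be sanity-checked by reading $S$, $K$, $I$ as Python closures, with free variables as small string-building objects (a sketch; only the finite identities are tested):

```python
S = lambda a: lambda b: lambda c: a(c)(b(c))
K = lambda a: lambda b: a
I = lambda a: a

class Var:                                   # a free variable; application
    def __init__(self, s): self.s = s        # builds the term up as a string
    def __call__(self, y): return Var('(%s %s)' % (self.s, y.s))

a, b, c = Var('a'), Var('b'), Var('c')
assert S(K(a))(b)(c).s == '(a (b c))'        # B a b c = a (b c)  via S(Ka)b
assert S(a)(K(b))(c).s == '((a c) b)'        # C a b c = a c b    via Sa(Kb)
assert S(S)(S(K))(a)(b).s == '((a b) b)'     # W a b   = a b b    via SS(SK)
assert S(S)(K(I))(a)(b).s == '((a b) b)'     # W a b   = a b b    via SS(KI)
assert S(I)(I)(b).s == '(b b)'               # U b     = b b      via SII
assert S(K)(K)(a).s == 'a'                   # I a     = a        via SKK
```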

$U$ isn't usually included as a basic combinator, but it is arguably just as fundamental. Now we can have

$\quad\begin{align} H f x &= f (xx) = f (Ux) = BfUx = CBUfx \\ H f x &= f (xx) = Bfxx = W(Bf)x = BWBfx \\ Y f &= H f (H f) = SHHf = SSIHf = WSHf \end{align}$

and thus the previous definitions were actually

$\quad\begin{align} \ \ Y &= S(KU)(SB(KU)) = BU(CBU) \\ \ \ Y &= SSI(SB(KU)) = SSI(CBU) \\ \ \ Y &= SS(SB)(KU) \end{align}$

but now we've found yet other encodings,

$\quad\begin{align} \ \ Y &= WS(CBU) \\ \ \ Y &= BU(BWB) = SSI(BWB) = WS(BWB) \end{align}$

There's also $Y = BU(CBU) = SB(CB)U = SSCBU $, but substituting the definitions in it leads to a very long $SK$-encoding, evidently.

To recap,

$\quad\begin{align} \ \ Y &= S(K(SII))(S(S(KS)K)(K(SII))) \\ \ \ Y &= SSI(S(S(KS)K)(K(SII))) \\ \ \ Y &= SS(S(S(KS)K))(K(SII)) \end{align}$

These are 22(14)-, 18(12)-, and 15(11)-letter $SK(I)$ encodings. They are all different encodings of the same combinator definitions $Yf = Hf(Hf)$ and $Hfx=f(xx)$.
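To double-check that all three encodings behave as fixed-point combinators, one can bound-step them with a tiny normal-order SKI reducer (a sketch; since $Yf$ has no normal form, only the head of the reduct is inspected):

```python
# A sketch of a normal-order (leftmost-outermost) SKI reducer. Atoms are
# one-letter strings; application is a nested 2-tuple (fun, arg).

def app(t, *xs):
    for x in xs:                 # left-associated application
        t = (t, x)
    return t

def step(t):
    """One leftmost-outermost step; returns (term, changed)."""
    if not isinstance(t, tuple):
        return t, False
    args, h = [], t
    while isinstance(h, tuple):  # unwind the spine: t == h a1 ... an
        args.append(h[1])
        h = h[0]
    args.reverse()
    if h == 'I' and len(args) >= 1:                       # I a     -> a
        return app(args[0], *args[1:]), True
    if h == 'K' and len(args) >= 2:                       # K a b   -> a
        return app(args[0], *args[2:]), True
    if h == 'S' and len(args) >= 3:                       # S a b c -> a c (b c)
        a, b, c = args[:3]
        return app((a, c), (b, c), *args[3:]), True
    for i, a in enumerate(args):  # no head redex: reduce an argument
        a2, changed = step(a)
        if changed:
            args[i] = a2
            return app(h, *args), True
    return t, False

def head_after(t, limit=50):
    """Step until the head of the spine is the free variable f (or stuck)."""
    for _ in range(limit):
        t, changed = step(t)
        h = t
        while isinstance(h, tuple):
            h = h[0]
        if h == 'f' or not changed:
            return h
    return h

SII = app('S', 'I', 'I')
B   = app('S', app('K', 'S'), 'K')                # S(KS)K
H   = app('S', B, app('K', SII))                  # S(S(KS)K)(K(SII))
Y1  = app('S', app('K', SII), H)                  # 22(14) letters
Y2  = app('S', 'S', 'I', H)                       # 18(12) letters
Y3  = app('S', 'S', app('S', B), app('K', SII))   # 15(11) letters
for Y in (Y1, Y2, Y3):
    assert head_after(app(Y, 'f')) == 'f'         # Y f = f (...): fixed point
```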


Using combinators instead of lambda-expressions makes it much easier to come up with a definition for a fixpoint combinator, in the first place: $$ Yf = Hf(Hf) = f(Hf(Hf)) \ \ \models \ \ Hfx = f(xx) $$ $$ Y\text{′} f = HfH = f(HfH) \ \ \models \ \ Hfx = f(xfx) $$ $$ \Theta f = HHf = f(HHf) \ \ \models \ \ Hxf = f(xxf) $$

giving rise to

$$ Y = BU(CBU) = SS(S(S(KS)K))(K(SII)) $$ $$ Y\text{′} = WC(SB(C(WC))) = SSK(S(K(SS(S(SSK))))K) $$ $$ \Theta = U(B(SI)U) = SI(S(K(SI)))(SII) $$

(the $SK$-only encoding for $Y\text{′}$ is due to John Tromp, according to Wikipedia). These are 15(11)-, 12(12)-, and 17(9)-letter $SK(I)$ encodings.

It also holds that $$ \Theta = WI( I( B(SI)(WI)) ) $$ $$ Y\text{′} = WC( C( B(SI)(WC)) ) $$ but this leads to a slightly longer, 16(14)-letter encoding $$ Y\text{′} = SSK(S(K(S(S(K(SI))(SSK))))K) $$

Will Ness
  • 170
1

To demonstrate that $\textsf Y=\textsf{S(K(SII))(S(S(KS)K)(K(SII)))}$, introduce combinators so as to migrate the lambda-bound variables rightwards until the expression can be $\eta$-reduced, which will succeed precisely when $\mathsf Y$ is indeed equivalent to that combinator string.

$\qquad\begin{align} \textsf Y &:= \lambda f.(\lambda x.f(xx))(\lambda x.f(xx)) \\ &~= \lambda f.(\lambda x.f(\textsf Ix(\textsf Ix)))(\lambda x.f(\textsf Ix(\textsf Ix)))&&\mathsf I\text{-introduction}\because a=\mathsf Ia \\ &~= \lambda f.(\lambda x.f(\textsf{SII}x))(\lambda x.f(\textsf{SII}x))&&\mathsf S\text{-introduction }\because ac(bc)=\mathsf Sabc \\ &~= \lambda f.(\lambda x.\textsf{K}fx(\textsf{SII}x))(\lambda x.\textsf{K}fx(\textsf{SII}x))&&\mathsf K\text{-introduction }\because a=\mathsf Kab \\ &~= \lambda f.(\lambda x.\textsf{S(K}f\textsf{)(SII)}x)(\lambda x.\textsf{S(K}f\textsf{)(SII)}x)&&\mathsf S\text{-introduction } \\ &~= \lambda f.\textsf{S(K}f\textsf{)(SII)(S(K}f\textsf{)(SII))}&&\eta\text{-reduction} \\ &~= \lambda f.\textsf{KS}f\textsf{(K}f\textsf{)(SII)(KS}f\textsf{(K}f\textsf{)(SII))}&&\mathsf K\text{-introduction } \\ &~= \lambda f.\textsf{S(KS)K}f\textsf{(SII)(S(KS)K}f\textsf{(SII))}&&\mathsf S\text{-introduction } \\ &~= \lambda f.\textsf{S(KS)K}f\textsf{(K(SII)}f\textsf{)(S(KS)K}f\textsf{(K(SII)}f\textsf{))}&&\mathsf K\text{-introduction } \\ &~= \lambda f.\textsf{S(S(KS)K)(K(SII))}f\textsf{(S(S(KS)K)(K(SII))}f\textsf{)}&&\mathsf S\text{-introduction } \\ &~= \lambda f.\textsf{I(S(S(KS)K)(K(SII))}f\textsf{)(I(S(S(KS)K)(K(SII))}f\textsf{))}&&\mathsf I\text{-introduction } \\ &~= \lambda f.\textsf{SII(S(S(KS)K)(K(SII))}f\textsf{)}&&\mathsf S\text{-introduction} \\ &~= \lambda f.\textsf{K(SII)}f\textsf{(S(S(KS)K)(K(SII))}f\textsf{)}&&\mathsf K\text{-introduction } \\ &~= \lambda f.\textsf{S(K(SII))(S(S(KS)K)(K(SII)))}f&&\mathsf S\text{-introduction} \\ &~= \textsf{S(K(SII))(S(S(KS)K)(K(SII)))}&&\eta\text{-reduction} \end{align}$
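The key intermediate identities in this chain can be sanity-checked mechanically; here is a minimal sketch that reads $\textsf S$, $\textsf K$, $\textsf I$ as Python closures and free variables as string-building objects (the final combinator string itself cannot be applied to $f$ under strict evaluation, since $\textsf Y\,f$ loops):

```python
S = lambda a: lambda b: lambda c: a(c)(b(c))
K = lambda a: lambda b: a
I = lambda a: a

class Var:                                   # a free variable; application
    def __init__(self, s): self.s = s        # records the term as a string
    def __call__(self, y): return Var('(%s %s)' % (self.s, y.s))

f, x = Var('f'), Var('x')
SII = S(I)(I)
assert SII(x).s == '(x x)'                   # S I I x = x x
assert S(K(f))(SII)(x).s == '(f (x x))'      # S (K f) (S I I) x = f (x x)
E = S(S(K(S))(K))(K(SII))                    # S (S (K S) K) (K (S I I))
assert E(f)(x).s == '(f (x x))'              # the η-reduced inner term
```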

Alternatively, run through those steps in reverse order to demonstrate that the combinator string $\beta$-reduces to the definition for $\sf Y$.

Graham Kemp
  • 133,231