The code you posted takes a few shortcuts. The mismatch with the rules of the CPS transform comes form your code being in Haskell, which features integers (0,1,2), arithmetics (+,-), definitions by cases, and recursion, while the CPS transform only covers the pure lambda calculus, without all the bells and whistles that come with a "usual" programming language.
Anyway the idea of a CPS function is that you never return the result of the function in a direct way. Rather, you pass that result to a continuation function, usually named k, which is an additional parameter to the function. Often, it is convenient to take k as the last parameter -- this is only a convention, though, and we could choose otherwise.
For instance, the constant-zero function becomes
zero n = 0
zero_cps n k = k 0
Squaring:
square n = n*n
square_cps n k = k (n*n)
Now, in the last line I cheated a bit, since there I am using * as the regular multiplication function, instead of using a CPS variant of multiplication. Your fib_cps code does the same for + and -, which are not translated into CPS.
Let's continue with this.
Now, what if we want to CPS transform this composition?
f n = square (square n)
f_cps n k = ???
Intuitively, in f_cps we want to call square_cps, take its result, and pass that to another square_cps. But we can't do that directly, since square_cps does not return its result, it passes it to the continuation. So, we need to write something like
f_cps n k = square_cps n (\m -> ...???...)
where m is the "result" of square_cps, hence m=n*n. What do we do with m? We want to square it again.
f_cps n k = square_cps n (\m -> square_cps m (\o -> ???))
here o will be the square of the square of n, hence n*n*n*n. That is the final result, so we can pass it to k, which expects the result of f.
f_cps n k = square_cps n (\m -> square_cps m (\o -> k o))
So, in CPS style, we use this "idiom":
function_cps arg1 arg2 ... arg2 (\x -> ...)
meaning "call function with the args, let x be the result, and then proceed with ...".
Code like f1 (f2 (f3 (f4 n))) becomes
f4 n (\x4 ->
f3 x4 (\x3 ->
f2 x3 (\x2 ->
f1 x2 (\x1 ->
... use x1 here ...))))
(If you happen to be familiar with monads in Haskell, you'll see that this is very similar to a monadic do block. This is not a coincidence since CPS is strictly related to the Cont monad.)
In your fib_cps code, we need to do
fib (n-1) + fib (n-2)
This is transformed into a program which evaluates fib (n-1) first (let x be its result), then fib (n-2) second (let y be its result), and finally returns x+y.
In code
fib (n-1) (\x ->
fib (n-2) (\y ->
k (x+y)))
where the last line combines the results and passes their sum to k, the continuation of the call fib_cps n k.
You might have noticed that we needed to fix an evaluation order: first fib (n-1), then fib (n-2). This order was not present in the original code. Indeed, to transform some code to CPS, we need to be more explicit about what we want to evaluate first.
In your CPS rules, application $MN$ is transformed into
$$
[[M]](\lambda m.\ [[N]](\lambda n.\ \ldots)
$$
which forces $M$ to be evaluated first, then $N$, giving an evaluation order, similarly to what happened above.
Further, the CPS term above always evaluates $N$. This is done according to the call-by-value (CBV) semantics, where function arguments are always evaluated. So, the CBV CPS transform matches such semantics, which is used by languages like Ocaml.
Note that Haskell, instead, uses a call-by-name (CBN) semantics instead, where e.g. zero (non terminating expression) returns 0 immediately without evaluating its argument. If we wanted to use CPS completely preserving Haskell's semantics, we should use another form of CPS called the CBN CPS transform, which is subtly different.
In your posted fib_cps code, CBV CPS is used, even in Haskell, since
+ always evaluates its arguments
- we are not using a CPS version of
+
- CBV CPS is arguably a bit simpler to understand
I hope this does not sound too confusing. Continuations are quite tricky to understand.