While thinking about this other question, I realized that the following functions smoothSeq and smoothSeq'
smoothSeq :: (Integer, Integer, Integer) -> [Integer]
smoothSeq (a, b, c) = result
where
result = 1 : union timesA (union timesB timesC)
timesA = map (* a) $ result
timesB = map (* b) $ result
timesC = map (* c) $ result
smoothSeq' :: (Integer, Integer, Integer) -> [Integer]
smoothSeq' (a, b, c) = 1 : union timesA (union timesB timesC)
where
timesA = map (* a) $ smoothSeq' (a, b, c)
timesB = map (* b) $ smoothSeq' (a, b, c)
timesC = map (* c) $ smoothSeq' (a, b, c)
-- | Merge two sorted sequences discarding duplicates.
union :: (Ord a) => [a] -> [a] -> [a]
union [] ys = ys
union xs [] = xs
union (x : xs) (y : ys)
| x < y = x : union xs (y : ys)
| x > y = y : union (x : xs) ys
| otherwise = x : union xs ys
have drastically different performance characteristics:
ghci> smoothSeq (2,3,5) !! 500
944784
(0.01 secs, 311,048 bytes)
ghci> smoothSeq' (2,3,5) !! 500
944784
(11.53 secs, 3,745,885,224 bytes)
My impression is that smoothSeq is linear in memory and time (as was regularSeq) because result is shared in the recursive definition, whereas smoothSeq' is super-linear because the recursive function definition spawns a tree of computations that independently recompute multiple times the previous terms of the sequence (there is no sharing/memoization of the previous terms; similar to naive Fibonacci).
While looking around for a detailed explanation, I encountered these examples (and others)
fix f = x where x = f x
fix' f = f (fix f)
cycle xs = res where res = xs ++ res
cycle' xs = xs ++ cycle' xs
where again the non-primed version (without ' suffix) is apparently more efficient because it reuses the previous computation.
From what I can see, what differentiates the two versions is whether the recursion involves a function or data (more precisely, a function binding or a pattern binding). Is that enough to explain the difference in behavior? What is the principle behind, that dictates whether something is memoized or not? I couldn't find a definite and comprehensive answer in the Haskell 2010 Language Report, or elsewhere.
Edit: here is another simple example that I could think of:
arithSeq start step = result
where
result = start : map (+ step) result
arithSeq' start step = start : map (+ step) (arithSeq' start step)
ghci> arithSeq 10 100 !! 10000
1000010
(0.01 secs, 1,443,520 bytes)
ghci> arithSeq' 10 100 !! 10000
1000010
(1.30 secs, 5,900,741,048 bytes)
The naive recursive definition arithSeq' is way worse than arithSeq, where the recursion "happens on data".