In the book Algorithms, 4th Edition by Robert Sedgewick and Kevin Wayne, when they are analyzing quicksort (page 294), they present the sequence of transformations: $$\begin{gather*} C_N = N + 1 + (C_0 + C_1 + \dots + C_{N-2} + C_{N-1})/N + (C_{N-1} + C_{N-2} + \dots + C_0)/N\\ NC_N = N(N+1) + 2(C_0 + C_1 + \dots + C_{N-2} + C_{N-1})\\ NC_N - (N-1)C_{N-1} = 2N + 2C_{N-1}\\ C_N/(N+1) = C_{N-1}/N + 2/(N+1)\\ C_N\sim 2(N+1)(1/3 + 1/4 + \dots + 1/(N+1))\end{gather*}$$

How did they get the last transformation?

The book also says that the parenthesized quantity in the last expression is the discrete estimate of the area under the curve $2/x$ from $3$ to $N$. How is this related to quicksort?

D.W.

1 Answer

Let $\rho_N = C_N/(N+1)$. The next-to-last equation shows that $$ \rho_N = \rho_{N-1} + \frac{2}{N+1}. $$ Unrolling this recurrence down to $\rho_1$ gives $$ \rho_N = \frac{2}{N+1} + \cdots + \frac{2}{2+1} + \rho_1. $$ Multiplying by $N+1$, $$ C_N = (N+1) \left(\frac{2}{3} + \cdots + \frac{2}{N+1} + \rho_1\right). $$ Since $\rho_1$ is a constant whereas $2/3 + \cdots + 2/(N+1) \to \infty$, we deduce that $$ C_N \sim (N+1) \left(\frac{2}{3} + \cdots + \frac{2}{N+1}\right), $$ which is the last line in the book. We can estimate this expression in many ways, for example by approximating the sum by an integral (this is what the authors suggest when they mention the area under the curve $2/x$). Or we can recognize that the parenthesized sum equals $2H_{N+1} - 3$, where $H_{N+1}$ is the $(N+1)$st harmonic number, and so using the well-known estimate $H_N \sim \ln N$ (natural logarithm), $$ C_N \sim 2N\ln N. $$
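As a sanity check, here is a minimal Python sketch that evaluates the recurrence exactly and compares it with the formulas above. It assumes $C_0 = 0$ and that the first recurrence holds for every $N \ge 1$; the book's base cases may differ, which would change only the constant $\rho_1$ and not the asymptotics. The function names are made up for illustration.

```python
from fractions import Fraction
import math


def average_compares(n_max):
    """Return [C_0, ..., C_{n_max}] from C_N = N + 1 + (2/N)(C_0 + ... + C_{N-1}).

    Assumption: C_0 = 0 and the recurrence is applied for all N >= 1.
    Exact rational arithmetic avoids any rounding in the checks below.
    """
    c = [Fraction(0)]
    prefix_sum = Fraction(0)  # holds C_0 + ... + C_{N-1}
    for n in range(1, n_max + 1):
        prefix_sum += c[n - 1]
        c.append(n + 1 + 2 * prefix_sum / n)
    return c


def harmonic_tail(n):
    """Return 2/3 + 2/4 + ... + 2/(n+1) as an exact fraction."""
    return sum(Fraction(2, k) for k in range(3, n + 2))


def ratio_to_2n_ln_n(c, n):
    """Return C_n / (2 n ln n), which should drift toward 1 as n grows."""
    return float(c[n]) / (2 * n * math.log(n))


if __name__ == "__main__":
    c = average_compares(1000)
    rho = [c[n] / (n + 1) for n in range(len(c))]

    # Telescoped recurrence: rho_N = rho_{N-1} + 2/(N+1) for N >= 2.
    assert all(rho[n] - rho[n - 1] == Fraction(2, n + 1)
               for n in range(2, len(c)))

    # Unrolled form: C_N = (N+1) (2/3 + ... + 2/(N+1) + rho_1).
    assert all(c[n] == (n + 1) * (harmonic_tail(n) + rho[1])
               for n in range(1, 50))

    for n in (10, 100, 1000):
        print(n, ratio_to_2n_ln_n(c, n))
```

The printed ratio approaches 1 only slowly, because the lower-order terms of $C_N$ are linear in $N$ and so shrink relative to $2N\ln N$ only like $1/\ln N$.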

Finally, regarding the relevance to Quicksort, my guess is that $C_N$ is the average number of comparisons performed on a random array of length $N$; but you should be able to tell by reading the book.

Yuval Filmus