
A comparison sort needs $\Omega(n\log n)$ comparisons on average. However, consider this sorting algorithm:

sort(array):
    if length(array) < 2:
        return array
    unsorted ← empty_array
    i ← 0
    # pop every element smaller than its predecessor in the run;
    # what remains in array is non-decreasing, hence already sorted
    while i < length(array) - 1:
        if array[i] > array[i + 1]:
            push(unsorted, pop(array, i + 1))
        else:
            i ← i + 1
    return merge(array, sort(unsorted))

(push(array, element) puts the new element at the end of the array and increases the array’s length by 1. pop(array, index) removes the element at that index from the array, shifting all the elements at greater indices down by one and decrementing the array’s length, and returns the removed element. merge is the same as in mergesort.)
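
For concreteness, here is a direct Python transcription of the pseudocode (my own sketch; Python's built-in list operations play the roles of push and pop, and merge is written out):

import random

def merge(a, b):
    # Standard mergesort merge of two sorted lists.
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    return out + a[i:] + b[j:]

def sort(array):
    if len(array) < 2:
        return array
    unsorted = []
    i = 0
    while i < len(array) - 1:
        if array[i] > array[i + 1]:
            unsorted.append(array.pop(i + 1))
        else:
            i += 1
    return merge(array, sort(unsorted))

print(sort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]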

Instead of simply splitting the array in the middle as mergesort does, it splits it so that one of the resulting arrays doesn’t need to be sorted recursively (the elements left in array form a non-decreasing sequence). Let $n$ be the length of the array to be sorted. Setting up the recurrence for the number of comparisons gives us

$$\begin{align*} T(n) &= T(n / b) + \mathrm{splitComparisons}(n) + \mathrm{mergeComparisons}(n) \\ &= T(n / b) + (n - 1) + n \\ &= T(n / b) + 2n - 1\,, \end{align*}$$

so $f(n) = 2n - 1$ and $a = c = 1$ in the statement of the Master Theorem.
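
(For reference, the case of the Master Theorem I am invoking: if $T(n) = a\,T(n/b) + f(n)$ with constants $a \ge 1$, $b > 1$, and $f(n) = \Theta(n^c)$ for some $c > \log_b a$, then $T(n) = \Theta(f(n))$.)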

$b$ is one over the probability that an element is greater than the next element and therefore goes into the array to be sorted recursively. For example, if there's a 25% chance that array[i] > array[i + 1] (for every i), then $b = 4$. Clearly $b > 1$, since the unsorted array gets shorter with every recursive call, so $\log_b a = \log_b 1 = 0$, which is less than $c$. Then $T(n) = \Theta(f(n)) = \Theta(n)$.

But that can’t be true, so the Master Theorem must not be applicable for some reason; I suspect it is because $b$ isn't constant, but I don’t know how to prove that. The worst case of the sorting algorithm obviously requires a quadratic number of comparisons and the best case a linear number, so by analogy with bubble sort, insertion sort, etc., I’m guessing this algorithm also makes a quadratic number of comparisons on average.
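
To get a feel for the average case empirically, here is an instrumented sketch that counts comparisons on random permutations (my own sketch; the merge step is counted pessimistically as $m - 1$ comparisons for $m$ merged elements):

import random

def count_comparisons(array):
    # Iterative rendering of the algorithm above that only counts comparisons.
    comparisons = 0
    while len(array) >= 2:
        unsorted = []
        i = 0
        while i < len(array) - 1:
            comparisons += 1
            if array[i] > array[i + 1]:
                unsorted.append(array.pop(i + 1))
            else:
                i += 1
        # Merging m elements takes at most m - 1 comparisons.
        comparisons += len(array) + len(unsorted) - 1
        array = unsorted  # only the popped elements need further sorting
    return comparisons

for n in [100, 1000, 4000]:
    print(n, count_comparisons(random.sample(range(n), n)))

The counts grow much faster than $n \log n$, which is what made me suspect the average case is close to quadratic.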

Mets

3 Answers


Since you don't do any reordering while splitting, the length of array after the while loop cannot be larger than the length of the longest increasing subsequence of the input. Since that one is on average only about $2\sqrt{n}$ elements long¹, you keep too many elements in unsorted.

In particular, there is no constant $b$ such that your ansatz describes the actual number of comparisons.

Assuming that the probability distribution of the length of the longest increasing subsequence survives your splitting, a proper ansatz would be of the form

$\qquad\displaystyle C(n) = C(n - 2\sqrt{n}) + \Theta(n)$

which does not seem to solve² to $C \in O(n)$.
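
A crude numeric unrolling (my sketch, taking the $\Theta(n)$ term to be exactly $n$) illustrates this: about $\sqrt{n}$ levels are needed to exhaust the input, and the ratio $C(n)/n^{3/2}$ settles near $1/3$, hinting at $C(n) = \Theta(n^{3/2})$.

def C(n):
    # Unroll C(n) = C(n - 2*sqrt(n)) + n until n is exhausted.
    total = 0
    while n > 1:
        total += n
        n -= 2 * n ** 0.5
    return total

for n in [10**3, 10**4, 10**5, 10**6]:
    print(n, C(n) / n ** 1.5)  # ratio approaches roughly 1/3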

In fact, it's even worse since you keep the "first" increasing subsequence, not the longest (in general); consider for instance $[1,n,2,3,4,\dots,n-1]$; you keep $[1,n]$ and recurse on $[2,3,\dots,n-1]$.


  1. On the distribution of the length of the longest increasing subsequence of random permutations by J. Baik, P. Deift and K. Johansson (1999) [via Wikipedia]
  2. I have not solved it explicitly, but plots suggest as much.
Raphael

You seem to be assuming that pop takes $O(1)$ time. That assumption is wrong.

You can't implement pop in $O(1)$ time, since pop requires shifting all elements after the given index one position to the left, which takes $O(n)$ time using the obvious implementation. (There are modestly better methods, but nothing that achieves the $O(1)$ time you are implicitly assuming. If you want to claim a particular running time, it's your job to specify how you plan to implement push and pop and to justify their running times carefully.)
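
A minimal sketch of the obvious array-backed implementation (Python here, purely for illustration) makes the shifting explicit:

def pop(array, index):
    # Remove and return array[index]. Every element after index has to
    # shift one slot to the left, costing Theta(len(array) - index)
    # moves -- linear in the worst case, not O(1).
    element = array[index]
    for j in range(index, len(array) - 1):
        array[j] = array[j + 1]
    del array[-1]  # shrinking at the end is O(1)
    return element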

Consequently, the time complexity of your algorithm is not $O(n)$.

D.W.

If the input array is in decreasing order, then array[i] > array[i+1] is always satisfied, so your algorithm moves all but the first item from array to unsorted. Each call thus keeps a single element, the recursion has depth $n - 1$, and the running time is at least $\Omega(n^2)$.

Let array and array' denote the same variable before and after the while loop, respectively. Then array[i] is left in array' if and only if array[i] is (one of) the largest among array[0], ..., array[i]. Assuming the input array is a uniformly random permutation, in expectation only $O(\log n)$ items are left in array'.
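
A quick simulation (my sketch; it exploits that the loop keeps exactly the elements that are at least as large as everything kept before them) is consistent with the $O(\log n)$ bound, since the expected number of such "records" in a random permutation is $H_n \approx \ln n$:

import math
import random

def survivors(array):
    # The while loop keeps array[i] exactly when it is at least as large
    # as every element kept before it, i.e. the left-to-right maxima.
    kept = [array[0]]
    for x in array[1:]:
        if x >= kept[-1]:
            kept.append(x)
    return kept

n, trials = 10000, 300
avg = sum(len(survivors(random.sample(range(n), n))) for _ in range(trials)) / trials
print(avg, math.log(n))  # both come out around 9-10 for n = 10000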

Tianren Liu