
I have come across many sorting algorithms during my high school studies. However, I never knew which was the fastest (for a random array of integers). So my questions are:

  • Which is the fastest currently known sorting algorithm?
  • Theoretically, is it possible that there are even faster ones? So, what's the least complexity for sorting?
gen

8 Answers


In general terms, there are:

  • the $O(n^2)$ sorting algorithms, such as insertion sort, bubble sort, and selection sort, which you should typically use only in special circumstances;
  • Quicksort, which is worst-case $O(n^2)$ but quite often $O(n\log n)$, with good constants and properties, and which can be used as a general-purpose sorting procedure;
  • the $O(n\log n)$ algorithms, like merge sort and heap sort, which are also good general-purpose sorting algorithms; and
  • the $O(n)$, or linear, sorting algorithms for lists of integers, such as radix, bucket, and counting sorts, which may be suitable depending on the nature of the integers in your lists (see the sketch below).
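As a concrete illustration of the linear case, here is a minimal counting sort sketch in Python (my own example, not from the answer itself; the key range $[0, k)$ is an assumption, and the algorithm is linear only when $k = O(n)$):

# Counting sort: O(n + k) for n integers in the range [0, k).
# Linear when the key range k is O(n); impractical when k is huge.
def counting_sort(a, k):
    counts = [0] * k
    for x in a:                       # tally each key
        counts[x] += 1
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)   # emit each key 'count' times
    return out

print(counting_sort([3, 1, 4, 1, 5, 9, 2, 6], k=10))
# [1, 1, 2, 3, 4, 5, 6, 9]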

If the elements in your list are such that all you know about them is the total order relationship between them, then optimal sorting algorithms will have complexity $\Omega(n\log n)$. This is a fairly cool result and one for which you should be able to easily find details online. The linear sorting algorithms exploit further information about the structure of elements to be sorted, rather than just the total order relationship among elements.

Even more generally, optimality of a sorting algorithm depends intimately upon the assumptions you can make about the kind of lists you're going to be sorting (as well as the machine model on which the algorithm will run, which can make even otherwise poor sorting algorithms the best choice; consider bubble sort on machines with a tape for storage). The stronger your assumptions, the more corners your algorithm can cut. Under very weak assumptions about how efficiently you can determine "sortedness" of a list, the optimal worst-case complexity can even be $\Omega(n!)$.

This answer deals only with complexities. Actual running times of implementations of algorithms will depend on a large number of factors which are hard to account for in a single answer.

Patrick87

The answer, as is often the case for such questions, is "it depends". It depends upon things like (a) how large the integers are, (b) whether the input array contains integers in a random order or in a nearly-sorted order, (c) whether you need the sorting algorithm to be stable or not, (d) whether the entire list of numbers fits in memory (an in-memory sort vs. an external sort), and (e) the machine you run it on, as well as other factors.

In practice, the sorting algorithm in your language's standard library will probably be pretty good (pretty close to optimal), if you need an in-memory sort. Therefore, in practice, just use whatever sort function is provided by the standard library, and measure running time. Only if you find that (i) sorting is a large fraction of the overall running time, and (ii) the running time is unacceptable, should you bother messing around with the sorting algorithm. If those two conditions do hold, then you can look at the specific aspects of your particular domain and experiment with other fast sorting algorithms.
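For example, here is a quick way to test conditions (i) and (ii) before touching the algorithm at all (a hypothetical measurement sketch using Python's built-in sort; the data shape is an assumption, so substitute your real input):

# Time the standard library sort on data resembling your real input.
import random
import time

data = [random.randrange(10**9) for _ in range(10**6)]

start = time.perf_counter()
data.sort()              # the built-in, highly tuned general-purpose sort
elapsed = time.perf_counter() - start
print(f"sorted {len(data):,} integers in {elapsed:.3f}s")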

But realistically, in practice, the sorting algorithm is rarely a major performance bottleneck.

D.W.

Furthermore, to answer your second question:

Theoretically, is it possible that there are even faster ones?
So, what's the least complexity for sorting?

For general-purpose sorting, the complexity of the comparison-based sorting problem is $\Omega(n\log n)$. There are some algorithms that perform sorting in $O(n)$, but they all rely on making assumptions about the input, and are not general-purpose sorting algorithms.

Basically, the bound is given by the minimum number of comparisons needed to sort the array in the worst case: a binary decision tree that distinguishes all $n!$ possible orderings of the input must have at least $n!$ leaves, so its height, which is the worst-case number of comparisons, is at least $\log_2(n!) = \Theta(n\log n)$.
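To sketch the arithmetic behind that claim (a standard counting argument, paraphrased here rather than quoted from any particular source): a binary tree with $n!$ leaves has height at least $\log_2(n!)$, and

$$\log_2(n!) = \sum_{i=1}^{n} \log_2 i \;\ge\; \frac{n}{2}\log_2\frac{n}{2} \;=\; \Omega(n\log n),$$

since the largest $n/2$ terms of the sum are each at least $\log_2(n/2)$.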

You can find the formal proof of the sorting complexity lower bound online.

rla4

The fastest integer sorting algorithm, in terms of worst-case complexity, that I have come across is the one by Andersson et al. It has a worst-case running time of $O(n\log\log n)$, which is of course faster than $O(n\log n)$.

user39994

For integer sorting, the best known result seems to be $O(n\sqrt{\log\log n})$ in expectation using a randomized algorithm (or $O(n\sqrt{\log\log U})$ if given an upper bound $U$), via Han and Thorup.

thefool

I read through the other two answers at the time of writing this, and I didn't think either one answered your question appropriately. Other answers considered extraneous ideas about random distributions and space complexity, which are probably out of scope for high school studies. So here is my take.

Given an array $A$ with $n$ integer elements, you need exactly $(n-1)$ comparisons between elements just to check whether $A$ is sorted (start at the beginning of the array and compare each element against the one before it). So $(n-1)$ comparisons is the best-case running time of any sorting algorithm; in other words, the running time of any sorting algorithm is bounded below by $\Omega(n)$. If you recall radix sort or bucket sort, you will notice that their running times are $O(n)$ when the integers have bounded size (a fixed number of digits, or a bounded range). Since all sorting algorithms are bounded below by $\Omega(n)$, I would argue that both radix sort and bucket sort are, asymptotically, the fastest algorithms for sorting an array of such integers.
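To make that concrete, here is a minimal LSD radix sort sketch in Python (my own illustration, assuming non-negative integer keys); it runs in $O(n \cdot w)$ time for $w$-digit keys, which is linear when $w$ is bounded:

# LSD radix sort on non-negative integers, one decimal digit per pass.
# Each pass distributes stably by the current digit, then reconcatenates.
def radix_sort(a):
    if not a:
        return a
    digit = 1
    while digit <= max(a):
        buckets = [[] for _ in range(10)]
        for x in a:
            buckets[(x // digit) % 10].append(x)  # stable bucketing
        a = [x for bucket in buckets for x in bucket]
        digit *= 10
    return a

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]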

Additionally, in case you are not familiar with the notation: $O(n)$ is an upper bound and $\Omega(n)$ is a lower bound, so an algorithm that meets both takes roughly a constant multiple of $n$ operations to complete (it could be $2n$ or $3n-5$ operations, but not $1$ or $n^2$).


If you only allow making decisions by means of comparisons of the keys, it is well known that at least $\log_2(n!)$ comparisons are required in the worst case, to identify the permutation at hand among all the possible ones. This is an unbreakable bound.

If you allow other operations than comparisons, the trivial bound $\Omega(n)$ holds (and can be reached in special cases), as you have to read all the keys. This is unbeatable.


As you don't mention any restrictions on hardware, and given you're looking for "the fastest", I would say you should pick one of the parallel sorting algorithms, based on the available hardware and the kind of input you have.

In theory, quicksort, for example, is $O(n\log n)$. With $p$ processors, this should ideally come down to $O((n/p)\log n)$ if we run it in parallel.

To quote Wikipedia, the time complexity of optimal parallel sorting is $O(\log n)$.

In practice, for massive input sizes it would be impossible to achieve $O(\log n)$ due to scalability issues.

Here is the pseudocode for parallel merge sort; the implementation of merge() can be the same as in ordinary merge sort:

// Sort elements lo through hi (exclusive) of array A.
algorithm mergesort(A, lo, hi) is
    if lo+1 < hi then  // Two or more elements.
        mid = ⌊(lo + hi) / 2⌋
        fork mergesort(A, lo, mid)
        mergesort(A, mid, hi)
        join
        merge(A, lo, mid, hi)
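For reference, here is a runnable sketch of the same idea in Python (my own chunk-parallel variant rather than the recursive fork/join above: sort one chunk per worker in parallel, then do a k-way merge of the sorted runs):

# Chunk-parallel merge sort: sort p chunks in parallel processes,
# then k-way merge the sorted runs.
from heapq import merge
from multiprocessing import Pool, cpu_count

def parallel_mergesort(a):
    p = cpu_count()
    size = max(1, (len(a) + p - 1) // p)          # ceil(len(a) / p)
    chunks = [a[i:i + size] for i in range(0, len(a), size)]
    with Pool(p) as pool:
        runs = pool.map(sorted, chunks)           # sort chunks in parallel
    return list(merge(*runs))                     # merge the sorted runs

if __name__ == "__main__":
    import random
    data = [random.randrange(10**6) for _ in range(10**5)]
    assert parallel_mergesort(data) == sorted(data)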


Kashyap