8

So given an input of, let's say, 10 strings, how can we arrange the input so that we get the best or worst case for these two given sorts?

Heap sort:
best case - nlogn
worst case - nlogn

Quick sort:
best case - nlogn
worst case - n^2

Where I get confused on these two is:

  • heap - Since the best and worst case are the same, does the input order not matter? Will the number of comparisons and assignments always be the same? I imagine in a heap sort it may be, since the real work is done during insertion and the sorting itself only removes elements from the max/min heap. Is that why?
  • quick sort - This one I don't know for sure. I'm not sure what the best-case and worst-case situations are for this. If it's an already sorted list of 10 strings, for example, wouldn't we always have to choose the same number of pivots to complete the recursive algorithm? Any help with this explanation would really help.
David Richerby
aisdmsaidmas

3 Answers

6

heap - Since the best and worst case are the same, does the input order not matter? Will the number of comparisons and assignments always be the same? I imagine in a heap sort it may be, since the real work is done during insertion and the sorting itself only removes elements from the max/min heap. Is that why?

The number of comparisons made actually can depend on the order in which the values are given. The fact that the best and worst case are each Θ(n log n) - assuming all elements are distinct - only means that asymptotically there's no difference between them, though they can differ by a constant factor. I don't have a simple example off the top of my head, but I believe you can construct inputs whose comparison counts differ by a constant factor. Since big-O notation ignores constants, though, this difference isn't reflected in the best-case and worst-case analysis.
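
To see this concretely, here is a rough sketch (not from the answer; the function names are mine and it assumes a textbook array-based max-heap sort) that counts comparisons for two different orderings of the same distinct values:

```python
def heapsort_comparisons(values):
    """Sort a copy of `values` with an array-based max-heap sort and
    return the number of element comparisons made (illustrative only)."""
    a = list(values)
    n = len(a)
    count = 0

    def sift_down(start, end):
        nonlocal count
        root = start
        while 2 * root + 1 <= end:
            child = 2 * root + 1
            if child + 1 <= end:          # pick the larger of the two children
                count += 1
                if a[child] < a[child + 1]:
                    child += 1
            count += 1                    # compare the root against that child
            if a[root] < a[child]:
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                break

    # Build the max heap, then repeatedly move the maximum to the back.
    for start in range(n // 2 - 1, -1, -1):
        sift_down(start, n - 1)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end - 1)
    return count

n = 1024
print(heapsort_comparisons(range(n)))          # ascending input
print(heapsort_comparisons(range(n, 0, -1)))   # descending input
```

The two counts differ, but both grow proportionally to n log n; only the constant in front changes.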

quick sort - This one I don't know for sure. I'm not sure what the best-case and worst-case situations are for this. If it's an already sorted list of 10 strings, for example, wouldn't we always have to choose the same number of pivots to complete the recursive algorithm? Any help with this explanation would really help.

The number of pivots chosen is indeed the same regardless of the execution of the algorithm. However, the work done per pivot can vary based on what sort of splits you get. In the best case, the pivot chosen at each step ends up being the median element of the array. When this happens, there are (roughly) n comparisons done at the top layer of the recursion, then (roughly) n at the next layer because there are two subarrays of size n / 2, then (roughly) n at the next layer because there are four subarrays of size n / 4, etc. Since there are Θ(log n) layers and each layer does Θ(n) work, the total work done is Θ(n log n). On the other hand, consider choosing the absolute minimum of each array as a pivot. Then (roughly) n comparisons are done at the top layer, then (roughly) n - 1 in the next layer, then (roughly) n - 2 in the next, etc. The sum 1 + 2 + 3 + ... + n is Θ(n^2), hence the worst case.
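
As a rough illustration (again not the answer's code; it assumes the common Lomuto partition scheme with the last element as the pivot), counting comparisons shows an already sorted input hitting the quadratic case while a shuffled input stays close to n log n:

```python
import random

def quicksort_comparisons(values):
    """Sort a copy of `values` with quicksort (last element as pivot,
    Lomuto partition) and return the number of comparisons made."""
    a = list(values)
    count = 0

    def sort(lo, hi):
        nonlocal count
        if lo >= hi:
            return
        pivot = a[hi]
        i = lo
        for j in range(lo, hi):
            count += 1                    # one comparison per element vs. the pivot
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        sort(lo, i - 1)
        sort(i + 1, hi)

    sort(0, len(a) - 1)
    return count

# Kept small because the worst case recurses about n levels deep.
n = 500
print(quicksort_comparisons(range(n)))                    # sorted: ~n^2/2 comparisons
print(quicksort_comparisons(random.sample(range(n), n)))  # shuffled: ~1.39 * n * log2(n) on average
```

Picking the median (or a random element) as the pivot instead of the last element is what pushes typical behavior toward the balanced Θ(n log n) case.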

Hope this helps!

templatetypedef
5

Since nobody's really addressed heapSort yet:

Assuming you're using a max heap represented as an array, and placing each extracted maximum at the back of your output array (or at the back of the same array if you're sorting in place), the worst-case input for heapSort is any input that forces you to "bubble down", i.e. reheapify, every time you remove an element. This happens whenever you sort a set with no duplicates. It is still Θ(n log n), as templatetypedef said: removing a single element costs Θ(log n) in the worst case, because the heap is a complete binary tree in which every parent has two children, so its height is log n, and a bubble-down may have to traverse that entire height.

This property implies that heapSort's best case is when all elements are equal: it becomes Θ(n), since you never have to reheapify after a removal, and each reheapify would otherwise take up to log n time (the maximum height of the heap is log n). It's kind of a lousy/impractical case, though, which is why the best case for heapsort is usually quoted as Θ(n log n).
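
To see why, here is a minimal sketch (the names are made up for this illustration) of just the sift-down step: when every element is equal it stops after a constant number of comparisons, so the extraction phase is linear overall, whereas a badly placed distinct value forces it to walk the full height of the heap:

```python
def sift_down_comparisons(a, start, end):
    """Run one textbook max-heap sift-down on a[start..end] in place and
    return how many element comparisons it made (illustrative only)."""
    count = 0
    root = start
    while 2 * root + 1 <= end:
        child = 2 * root + 1
        if child + 1 <= end:              # pick the larger of the two children
            count += 1
            if a[child] < a[child + 1]:
                child += 1
        count += 1                        # compare the root against that child
        if a[root] < a[child]:
            a[root], a[child] = a[child], a[root]
            root = child
        else:
            break                         # children are not larger: stop immediately
    return count

equal = [7] * 1023
print(sift_down_comparisons(equal, 0, len(equal) - 1))    # 2: stops at the root
worst = list(range(1023))                                 # minimum sits at the root
print(sift_down_comparisons(worst, 0, len(worst) - 1))    # ~2 * log2(n): walks the full height
```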

Caleb Stanford
Mia
4
  • Quick Sort

    Worst case: $\mathcal{O}(n^2)$. Let's assume the pivot element is always the right-most element, and the input is an already sorted list with $n$ elements. Then each partitioning step produces one list with $n-1$ elements and one list with $0$ elements. Even if you choose the pivot element randomly, you can still be unlucky and always pick the maximum value of the list.

    Let $T(n)$ be the number of comparisons quicksort requires to sort a list with $n$ elements. Worst case: \begin{align} T(n) = & T(n-1) + n & \text{($T(n-1)$ recursive, $n$ to partition)}\\ = & \frac{n(n+1)}{2} \in \mathcal{O}(n^2) \end{align}

    Best case: $\mathcal{O}(n \log n)$. If the pivot element is chosen in such a way that it partitions the list evenly:

    \begin{align} T(n) = & 2 \, T\left(\frac{n}{2}\right) + n & \text{(2 times $\frac{n}{2}$ recursive, $n$ to partition)} \\ \in & \mathcal{O}(n \log n) & (\text{master theorem}) \end{align}

    Both recurrences are checked numerically in the small sketch after this list.

  • Heap Sort

    The worst-case and best-case complexities of heap sort are both $\mathcal{O}(n \log n)$, so heap sort needs $\mathcal{O}(n \log n)$ comparisons for any input array. Complexity of heap sort:

    \begin{align} & \mathcal{O}(n) & (\text{build $(1,n)$ heap}) \\ + & \sum_{i=1}^{n} \mathcal{O}(\log i - \log 1) & (\text{build $(1,i)$ heap}) \\ = & \mathcal{O}(n) + \sum_{i=1}^{n} \mathcal{O}(\log i) & (\text{logarithm quotient rule}) \\ = & \mathcal{O}(n \log n) & \left(\sum_{i=1}^{n} \log i < \sum_{i=1}^{n} \log n = n \log n\right) \end{align}
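
As a quick sanity check (a minimal sketch; the function names are mine, and the base case $T(1) = 0$ is an assumption the answer leaves implicit), both quicksort recurrences and the heap-sort comparison sum can be evaluated numerically:

```python
import math

def T_worst(n):
    """Worst-case recurrence T(n) = T(n-1) + n with assumed base case T(1) = 0."""
    return sum(range(2, n + 1))            # closed form: n(n+1)/2 - 1, i.e. Theta(n^2)

def T_best(n):
    """Best-case recurrence T(n) = 2*T(n/2) + n with assumed base case T(1) = 0."""
    return 0 if n <= 1 else 2 * T_best(n // 2) + n

def heap_sum(n):
    """The extraction-phase sum of log(i) terms from the heap sort analysis."""
    return sum(math.log2(i) for i in range(2, n + 1))

for n in (16, 256, 4096):                  # powers of two, so the n/2 splits are exact
    print(n, T_worst(n), T_best(n), round(heap_sum(n)), round(n * math.log2(n)))
```

For powers of two, T_best(n) comes out to exactly n log2(n), T_worst(n) grows quadratically, and the heap-sort sum stays below n log2(n), matching the inequality in the last line of the derivation above.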

Gaste