As I understand it, quickselect can use the median-of-medians algorithm to choose a good pivot when finding the i-th smallest item in an array, say A.

I have referred to the Median of Medians algorithm and the steps given here for deterministic selection.

This is the pseudocode as I understand the algorithm.

define partition5(arr)
    Returns middle of the array

define median_of_medians(arr, N)
    Returns the median index
    if # of elements are 5 or less just find the median
        partition5(arr)

    get n/5 slices     
    create array of slices

    fill array with ranges of slices from 0 to n in step=5

    find the median for each of slice of len 5 or less using partition5

    find the median of this median array recursively


define a swap function

define partition(ar, lo, hi, pivotIndex):
    i.e. a 3-way Dijkstra partition method
    start = lo
    move the chosen pivot (the element @ pivotIndex) to position hi
    swap(pivotIndex, hi, ar)
    pivotIndex = hi
    eq = lo

    loop range
        if element == pivot
            inc eq
        if element < pivot
            swap index, lo
            inc lo
            inc eq
    swap(lo, pivotIndex, ar)
    return lo

find the kth smallest element using
define quickSelect(arr, startIndex, endIndex, k)
    get start, end index range
    if (startIndex <= endIndex):
        find pivot using median_of_medians()
        divide using Linear-partitioning partition(ar, startIndex, endIndex, pivot)
        if ( pivotIndex == k ): return arr[pivotIndex]

        if (pivotIndex > k):
                recurse using quickSelect [startIndex, pivotIndex-1]
        else if (pivotIndex < k):
                recurse using quickSelect [pivotIndex+1, endIndex]

This is my implementation of quickSelect using median of medians, as I understood it.

from collections import deque
from math import log

def partition5(arr):
    '''Returns middle of the array'''
    return len(arr)/2

def median_of_medians(arr, n, orig_arr=None):
    '''Returns the median index'''
    if n < 6:
        # if # of elements are 5 or less just find the median
        return partition5(arr)
    orig_arr = orig_arr or arr

    # find the numbers of slices
    slot_num = n/5 + (1 if n%5 else 0)

    # create array of slices
    median_slots = ([None] * slot_num)
    # and a null aux array
    aux = [None]*slot_num

    count = 0
    # fill the aux array with extremum ranges of slices
    for slot_index in xrange(0,n,5):
        aux[count] = (slot_index,(n-slot_index)%n + 5)
        count += 1

    count = 0
    # find the median for each slice of len 5 or less
    for r in aux:
        median_slots[count] = partition5(sorted(xrange(r[0], r[1]), key = lambda x: orig_arr[x]))
        count += 1

    # print "median_slots is {}".format(median_slots)
    # find the median of this median array
    return median_of_medians(median_slots, len(median_slots), orig_arr)

def swap(findex, sindex, ar):
    ar[findex], ar[sindex] = ar[sindex], ar[findex]

def partition(ar, lo, hi, pivotIndex):
    '''3-way Dijkstra partition method'''
    start = lo
    # move the chosen pivot (the element @ pivotIndex) to position hi
    swap(pivotIndex, hi, ar)
    pivotIndex = hi
    pivot = ar[pivotIndex]
    eq = lo
    for index in xrange(lo, hi):
        if (ar[eq] == pivot):
            eq += 1
        if (ar[index] < pivot and index < pivotIndex):
            swap(index, lo, ar)
            lo += 1
            eq +=1
    swap(lo, pivotIndex, ar)
    return lo

# find the kth smallest element by comparing the returned pivot index
def quickSelectIter(arr, startIndex, endIndex, k):
    stack = deque([[startIndex,endIndex]], log(endIndex-startIndex+1, 2))
    pivot = pivotIndex=0

    while (stack):
        pop = stack.pop()
        startIndex, endIndex = pop[0], pop[1]
        if (startIndex <= endIndex):
            pivot = median_of_medians(arr, endIndex-startIndex+1)
            # divide using Linear-partitioning
            pivotIndex = partition(arr, startIndex, endIndex, pivot)
            if ( pivotIndex == k ): return arr[pivotIndex]

            if (pivotIndex > k):
                stack.appendleft([startIndex, pivotIndex-1])
            elif (pivotIndex < k):
                stack.appendleft([pivotIndex+1, endIndex])

Does this implementation achieve O(N) complexity?

The code seems to run correctly, but for some cases it gives the wrong result.

Let's say,
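For reference, the textbook argument for linear time, as far as I know, rests on three assumptions: groups of five, a pivot that really is the median of the group medians, and a partition step that is linear in the sub-array size. Under those assumptions the running time satisfies the recurrence below; whether my code meets them is a separate matter.

```latex
T(n) \le T\!\left(\lceil n/5 \rceil\right) + T\!\left(\tfrac{7n}{10} + 6\right) + O(n)
\quad\Longrightarrow\quad T(n) = O(n),
\qquad \text{since } \tfrac{1}{5} + \tfrac{7}{10} = \tfrac{9}{10} < 1 .
```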

A = [34, 40, 9, 20, 62, 89, 68, 0, 90, 83, 46, 98, 6, 41, 73, 99, 35, 82, 36, 53, 70, 27, 93, 54, 64, 52, 18, 85, 58, 69, 24, 49, 25, 30, 26, 79, 55, 1, 78, 7, 8, 28, 4, 38, 71, 10, 84, 72, 50, 29, 87, 51, 37]
l = sorted(A)
# res= median_of_medians(A, len(A))
# print A[res]
for index in xrange(len(A)):
    # the i-th element
    print quickSelectIter(A, 0, len(A)-1, index+1), l[index]

So there's a conceptual bug. What would be the fix?
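Reading the code again, a few things look suspicious. `partition5` returns `len(arr)/2`, which is a position inside the group rather than the median element or its index in the original array. The slice end `(n-slot_index)%n + 5` does not equal `slot_index + 5` in general. `median_of_medians` is never told `startIndex`, so its result cannot be an index into the current sub-range. And the test passes `index+1` as `k` while `pivotIndex == k` treats `k` as 0-based. Below is a minimal sketch of one possible fix, not a definitive implementation. It is written in Python 3, and it assumes a pivot value (not an index), inclusive `lo`/`hi` bounds, and a 0-based `k`. The names `partition3` and `select` are my own, chosen for illustration.

```python
def partition3(ar, lo, hi, pivot):
    """Three-way partition of ar[lo..hi] around the VALUE pivot.

    Returns (lt, gt) such that ar[lo:lt] < pivot, ar[lt:gt + 1] == pivot
    and ar[gt + 1:hi + 1] > pivot.
    """
    lt, i, gt = lo, lo, hi
    while i <= gt:
        if ar[i] < pivot:
            ar[lt], ar[i] = ar[i], ar[lt]
            lt += 1
            i += 1
        elif ar[i] > pivot:
            ar[i], ar[gt] = ar[gt], ar[i]
            gt -= 1  # the element swapped in from the right is still unexamined
        else:
            i += 1
    return lt, gt


def median_of_medians(ar, lo, hi):
    """Return a pivot VALUE for ar[lo..hi] (inclusive), using groups of five."""
    n = hi - lo + 1
    if n <= 5:
        return sorted(ar[lo:hi + 1])[n // 2]
    medians = []
    for start in range(lo, hi + 1, 5):
        group = sorted(ar[start:min(start + 5, hi + 1)])  # at most 5 items
        medians.append(group[len(group) // 2])
    # The median of the group medians is found with the same selection routine.
    return select(medians, len(medians) // 2)


def select(ar, k):
    """Return the k-th smallest element of ar (k is 0-based); ar is reordered."""
    lo, hi = 0, len(ar) - 1
    while True:
        if lo == hi:
            return ar[lo]
        pivot = median_of_medians(ar, lo, hi)
        lt, gt = partition3(ar, lo, hi, pivot)
        if k < lt:
            hi = lt - 1      # answer lies among the smaller elements
        elif k > gt:
            lo = gt + 1      # answer lies among the larger elements
        else:
            return pivot     # k falls inside the block equal to the pivot


data = [34, 40, 9, 20, 62, 89, 68, 0, 90, 83, 46]
assert all(select(list(data), k) == sorted(data)[k] for k in range(len(data)))
```

Passing a pivot value instead of an index avoids having to track where the pivot sits after the medians are collected into a separate list, and the three-way split keeps duplicates from stalling progress.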

user2290820