in O(n) time: Find greatest element in set where comparison is not transitive

Question

Title states the question.

The input is a list of elements that can be compared pairwise (to determine which of the two is greater). No two elements are equal.

Key points:

Comparison is not transitive (think rock paper scissors): this can be true: A > B, B > C, C > A (note this is not a valid input since there is no valid answer here, I'm only describing what "non-transitive comparison" means)
Each input array is guaranteed to have an answer.
"Greatest" means the element must be greater than every other element.
The converse property holds, i.e. A > B implies that B < A.

Example:

Input: [A,B,C,D]
A > B, B > C, C > A
D > A, D > B, D > C
Output: D

I cannot figure out a way to do this in O(n) time; my best solution is O(n^2).

I get stuck on every approach because of the fact that to be certain of an answer, the element needs to be explicitly compared to every other element, to prove it is indeed the answer (because comparison is not transitive).

This rules out the use of a heap, sorting, etc.

Ariel · Answer 1 · 2017-11-22T17:06:27.503

The standard algorithm for finding a maximum still works. Start with $a_1$ and go over the elements; whenever you see an element larger than the current maximum, update the maximum to be that element. This works because every element you skipped is smaller than at least one other element, and thus cannot be the maximum.

To be clear, by the "standard algorithm" I mean the following:

max <- a_1
for i=2 to n
   if a_i > max
      max <- a_i
output max
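
As a minimal runnable sketch of the pseudocode above (in Python; the names `find_greatest`, `beats` and `wins` are illustrative and not part of the original answer), the question's example can be encoded as a set of ordered pairs:

```python
def find_greatest(items, beats):
    """Single pass over items; beats(x, y) is True when x > y."""
    best = items[0]
    for x in items[1:]:
        if beats(x, best):
            best = x
    return best


# The example from the question: (x, y) in wins means x > y.
wins = {("A", "B"), ("B", "C"), ("C", "A"),
        ("D", "A"), ("D", "B"), ("D", "C")}

result = find_greatest(["A", "B", "C", "D"], lambda x, y: (x, y) in wins)
print(result)  # D
```

The pass makes exactly n-1 calls to `beats`, one per element after the first.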

For completeness, I will discuss here the issues raised in the comments. The setting in the above discussion is finding a maximum relative to an antisymmetric relation $<$, where $a_i$ is a maximum if for all $j\neq i$ we have $a_i>a_j$. The above algorithm works under the assumption that a maximum exists; if this is not known, one can still run it and then verify the existence of a maximum (check whether the returned element is indeed greater than all other elements; this is mentioned in Chi's comment and in Ilmari Karonen's answer).

If $<$ is not necessarily antisymmetric, then the above algorithm fails (as Emil mentioned in the comments). If $<$ is an arbitrary relation (i.e. we are relaxing both transitivity and antisymmetry), then it is not hard to show that finding a maximum in linear time is not possible. Denote by $\#a_i$ the number of times $a_i$ participated in a query; we define an adversarial relation in such a way that the maximum cannot be revealed without enough queries. Given the query $a_i >? a_j$, answer $a_i>a_j$ if $\# a_i < n-1$ and $a_i<a_j$ otherwise. If the number of queries is $o(n^2)$, then a maximum has not yet been seen, and it can be set to be any of the elements in the set.

Ilmari Karonen · Answer 2 · 2017-11-22T02:19:33.603

As Ariel notes, the standard maximum-finding algorithm given below:

def find_maximum(a):
    m = a[0]  # current candidate for the maximum
    for x in a:
        # replace the candidate whenever some element beats it
        if x > m:
            m = x
    return m

will in fact work without modification as long as:

any pair of elements can be compared, and
the input is guaranteed to contain a maximal element, i.e. an element that is pairwise greater than any other element in the input.

(The first assumption above can actually be relaxed, even without having to modify the algorithm, as long as we assume that the maximal element is comparable with every other element and that x > y is always false if the elements x and y are incomparable.)

In particular, your claim that:

[…] to be certain of an answer, the element needs to be explicitly compared to every other element (because comparison is not transitive).

is not true under the assumptions given above. In fact, to prove that the algorithm above will always find the maximal element, it's sufficient to observe that:

since the loop iterates over all the input elements, at some iteration x will be the maximal element;
since the maximal element is pairwise greater than every other element, it follows that, at the end of that iteration, m will be the maximal element; and
since no other element can be pairwise greater than the maximal element, it follows that m will not change on any of the subsequent iterations.

Therefore, at the end of the loop, m will always be the maximal element, if the input contains one.
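
One way to sanity-check this argument is to run the loop on every ordering of the question's example. The sketch below assumes a hypothetical wrapper class `Item` whose `>` operator looks up a hard-coded table `WINS`; neither name comes from the question, and `find_maximum` is repeated only to keep the snippet self-contained.

```python
from itertools import permutations

# (x, y) in WINS means x > y; this is the example from the question.
WINS = {("A", "B"), ("B", "C"), ("C", "A"),
        ("D", "A"), ("D", "B"), ("D", "C")}


class Item:
    """Illustrative wrapper so that `x > y` consults the WINS table."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, other):
        return (self.name, other.name) in WINS


def find_maximum(a):
    m = a[0]
    for x in a:
        if x > m:
            m = x
    return m


# Collect the result for all 24 input orders.
results = {find_maximum([Item(n) for n in order]).name
           for order in permutations("ABCD")}
print(results)  # {'D'}
```

Whatever candidate the loop holds before it reaches D, it switches to D at that point and never switches away again, which is exactly the three-step argument above.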

P.S. If the input is not guaranteed to contain a maximal element, then checking whether one exists will indeed require testing the candidate answer against every other element to verify that it is really maximal. However, we can still do that in O(n) time after running the maximum-finding algorithm above:

def find_maximum_if_any(a):
    # step 1: find the maximum, if one exists
    m = a[0]
    for x in a:
        if x > m: m = x

    # step 2: verify that the element we found is indeed maximal
    for x in a:
        if x > m: return None  # the input contains no maximal element
    return m  # yes, m is a maximal element

(I'm assuming here that the relation > is irreflexive, i.e. no element can be greater than itself. If that's not necessarily the case, the comparison x > m in step 2 should be replaced with x ≠ m and x > m, where ≠ denotes identity comparison. Or we could just apply the optimization noted below.)

To prove the correctness of this variation of the algorithm, consider the two possible cases:

If the input contains a maximal element, then step 1 will find it (as shown above) and step 2 will confirm it.
If the input does not contain a maximal element, then step 1 will end up picking some arbitrary element as m. It doesn't matter which element it is, since it will in any case be non-maximal, and therefore step 2 will detect that and return None.

If we stored the index of m in the input array a, we could actually optimize step 2 to only check those elements that come before m in a, since any later elements have already been compared with m in step 1. But this optimization does not change the asymptotic time complexity of the algorithm, which is still O(n).
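
A sketch of that optimization, assuming (as in the question) that every pair of distinct elements is comparable in exactly one direction, so that the elements seen after m in step 1 are already known to lose to it. The explicit `greater` callback and the function name are illustrative choices, not part of the code above:

```python
def find_maximum_if_any_fast(a, greater):
    # step 1: single pass, remembering the position k of the candidate
    m, k = a[0], 0
    for i in range(1, len(a)):
        if greater(a[i], m):
            m, k = a[i], i

    # step 2: only the elements before position k still need checking
    for x in a[:k]:
        if greater(x, m):
            return None  # the input contains no maximal element
    return m


wins = {("A", "B"), ("B", "C"), ("C", "A"),
        ("D", "A"), ("D", "B"), ("D", "C")}
greater = lambda x, y: (x, y) in wins

print(find_maximum_if_any_fast(["A", "B", "C", "D"], greater))  # D
print(find_maximum_if_any_fast(["A", "B", "C"], greater))       # None
```

In the second call the candidate after step 1 is C, and step 2 rejects it because B > C.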

Corinna · Answer 3 · 2017-11-22T14:12:24.880

As an addition to Ariel's answer about the concerns raised by Emil Jeřábek: If we allow both $A<B$ and $B<A$ to hold, then there is no O(n) algorithm:

Assume you have elements $A_1, \ldots, A_n$. In each step, your algorithm will query $A_i<A_j$ for some pair $i$ and $j$. No matter in which order you query them, there is always a relation for which you have to query all pairs before finding the maximum. The reason for this is best described by assuming you have an adversary who can change the problem while your algorithm is running.

(Note that the argument is independent of what the algorithm actually does with the information it gets about the elements, since it shows that the algorithm cannot know that an element is maximal before making $n^2$ queries.)

For each $j$, the adversary returns true for every query $A_i<A_j$ except your last query about that $j$, for which it returns false. Note that you cannot know that a given element is maximal until you have compared it to all other elements. Only for the last $j$ whose queries you complete will the adversary return true for the final query as well.

The resulting relation will always be such that there is some $j_0$ for which $A_i < A_{j_0}\forall i$ and for all other $j$ there will be some $i_j$ such that $A_i<A_j \forall i\neq i_j$ and we will not have $A_{i_j} < A_j$. The adversary chooses $j_0$ and the $i_j$s depending on your algorithm.
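
To make the adversary concrete, here is a rough sketch under my reading of the strategy above. The class name `Adversary`, the method `less` and the 0-based indices are illustrative assumptions, and self-comparisons ($i=j$) are simply disallowed:

```python
class Adversary:
    """Lazily answers queries "is A_i < A_j?" as described above."""

    def __init__(self, n):
        self.answers = {}                            # (i, j) -> fixed answer
        self.pending = {j: n - 1 for j in range(n)}  # unanswered i's per j
        self.open_columns = n                        # j's with pending queries

    def less(self, i, j):
        if i == j:
            raise ValueError("i and j must differ")
        if (i, j) in self.answers:      # repeated query: stay consistent
            return self.answers[(i, j)]
        self.pending[j] -= 1
        if self.pending[j] > 0:
            result = True               # not yet the last query about j
        else:
            self.open_columns -= 1
            # Only the last j to be completed turns out to be the maximum.
            result = self.open_columns == 0
        self.answers[(i, j)] = result
        return result


n = 4
adv = Adversary(n)
maximum = [j for j in range(n)
           if all(adv.less(i, j) for i in range(n) if i != j)]
print(maximum)  # [3]
```

Here the querying code happens to finish index 3 last, so index 3 becomes the maximum; with a different query order the adversary would pick a different $j_0$.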

I hope this is somewhat understandable. Feel free to ask in comments or edit.

The basic idea is that you cannot gather any information about the remaining elements from the ones you already know if you allow a completely arbitrary relation.

The argument still works if we disallow $A<A$. We will only save $n$ queries that way and still need $n^2-n$.

Danikov · Answer 4 · 2017-11-22T13:03:49.073

"greatest means the element must be greater than every other element" is a huge hint on how to do this in $O(n)$.

If you go through your list comparing elements, any element that "loses" a comparison can be immediately discarded: in order to be the greatest, it must be greater than ALL other elements, so a single loss disqualifies it.

Once you think of it in that manner, then it's pretty obvious that you actually can make $n-1$ comparisons and end up with the greatest element as the result of your last comparison, by discarding a loser each and every time.

This solution is enabled by a subtlety: "No element can be equal" combined with the fact that there will always be a greatest element. If we map the "wins" relationships as a directed graph, it is clear that we can reach the greatest element simply by following the wins.
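
A small sketch of this elimination view, assuming every pair is comparable in exactly one direction; the helper `eliminate` and the `beats` callback are names I made up for illustration. It pairs up whichever two candidates are at hand, which suggests that the order of comparisons does not matter:

```python
def eliminate(items, beats):
    """Repeatedly compare two remaining candidates and discard the loser."""
    pool = list(items)
    comparisons = 0
    while len(pool) > 1:
        x, y = pool.pop(), pool.pop()
        comparisons += 1
        pool.append(x if beats(x, y) else y)  # the winner stays in the pool
    return pool[0], comparisons


wins = {("A", "B"), ("B", "C"), ("C", "A"),
        ("D", "A"), ("D", "B"), ("D", "C")}
beats = lambda x, y: (x, y) in wins

print(eliminate(["A", "B", "C", "D"], beats))  # ('D', 3)
```

Each comparison removes exactly one candidate, so n elements always take n-1 comparisons.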

fade2black · Answer 5 · 2017-11-20T20:17:18.310

I assume that the relation is antisymmetric for at least a single element (which guarantees the existence of the greatest element), otherwise the task is impossible. If all elements in the finite set are comparable then the usual maximum-finding procedure works.

If some elements are not comparable then the following procedure will work:

max = nil
For i=1 to n
   if max is nil then
      max = a[i]
   else if max and a[i] are not comparable then
      max = nil
   else if max < a[i] then
      max = a[i]
End

This is how the algorithm will work on input $A,B,C,D$ with $$A > B, B > C, C > A $$ $$D > A, D > B, D > C$$ as in your post.

Initially: max = nil
$i=1:$ $\max = A$
$i=2:$ $\max = A$ (since $A > B$)
$i=3:$ $\max = C$ (since $A < C$)
$i=4:$ $\max = D$ (since $D > C$)

This algorithm works since if $m>a$ (or $a < m$) for all $a$ then $m$ is the greatest. If $m<a$ for some element $a$ then $m$ cannot be the greatest. Similarly, if $a$ and $m$ are not comparable then neither of them can be the greatest. This procedure would work even if all elements are comparable.
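
A possible Python rendering of this procedure, as a sketch only. I am assuming a hypothetical `compare(x, y)` callback that returns 1 if x > y, -1 if x < y and `None` if the two are incomparable, and that `None` itself never occurs as an element:

```python
def find_greatest_partial(items, compare):
    best = None
    for x in items:
        if best is None:
            best = x                 # start over with a fresh candidate
            continue
        outcome = compare(best, x)
        if outcome is None:
            best = None              # incomparable: neither can be greatest
        elif outcome < 0:
            best = x                 # the candidate lost to x
    return best


# Example from the question plus an extra element E that is
# incomparable with A, B and C but smaller than D.
wins = {("A", "B"), ("B", "C"), ("C", "A"),
        ("D", "A"), ("D", "B"), ("D", "C"), ("D", "E")}


def compare(x, y):
    if (x, y) in wins:
        return 1
    if (y, x) in wins:
        return -1
    return None


print(find_greatest_partial(["A", "E", "B", "C", "D"], compare))  # D
```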

score 1 · Answer 6 · answered Nov 21 '17 at 12:46

I'm going to cheat and call the number of A > B entries $n$; you need to look at every entry at least once.

That way you can loop over just the entries and remove the element that is less than the other from the set of possible solutions. With a hashset or similar this is possible in $O(n)$.
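
Roughly, assuming the input arrives as a list of (greater, lesser) pairs (the function name and this representation are my own choices for illustration):

```python
def greatest_from_entries(elements, entries):
    """One pass over the A > B entries, discarding every loser."""
    candidates = set(elements)
    for _winner, loser in entries:
        candidates.discard(loser)   # average O(1) per entry with a hash set
    return candidates


entries = [("A", "B"), ("B", "C"), ("C", "A"),
           ("D", "A"), ("D", "B"), ("D", "C")]

print(greatest_from_entries("ABCD", entries))  # {'D'}
```

If all pairwise entries are listed and a greatest element exists, exactly one candidate survives.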

score 0 · Answer 7 · edited Jan 18 '21 at 21:11

One approach is to create a hashmap $m$.

Suppose that $a > b$, then $m[b] = m[a] + 1$, i.e., $1$.

If $b > c$ then $m[c] = m[b] + 1$, i.e., $2$.

If $a > e$ then $m[e] = m[a] + 1$, i.e., $1$.

After processing all comparisons in this way, traverse the hashmap and find the element whose value is zero.

edited Jan 18 '21 at 21:11

Yuval Filmus


answered Jan 18 '21 at 06:31

Aditya Loke
