17

Is it possible to use a sorting algorithm with a non-transitive comparison, and if yes, why is transitivity listed as a requirement for sorting comparators?

Background:

  • A sorting algorithm generally sorts the elements of a list according to a comparator function C(x,y), with

    \begin{array}{ll} C(x,y) = \begin{cases} -1 & {\text{if}}\ x\prec y \\ 0 & {\text{if}}\ x\sim y \\ +1 & {\text{if}}\ x\succ y \\ \end{cases} \end{array}

    The requirements for this comparator are, as far as I understand them:

    • reflexive: $\forall x: C(x,x)=0$
    • antisymmetric: $\forall x,y: C(x,y) = - C(y,x)$
    • transitive: $\forall x,y,z, a: C(x,y)=a \land C(y,z)=a \Rightarrow C(x,z)=a$
    • C(x,y) is defined for all x and y, and the results depend only on x and y

    (These requirements are listed differently across different implementations, so I am not sure I got them all right.)

Now I am wondering about a "tolerant" comparator function that accepts numbers x, y as similar if $|x - y| \le 1$:

\begin{array}{ll} C(x,y) = \begin{cases} -1 & {\text{if}}\ x\lt y-1 \\ 0 & {\text{if}}\ |x - y| \le 1 \\ +1 & {\text{if}}\ x\gt y+1 \\ \end{cases} \end{array}

Examples: both [ 1, 2, 3, 4, 5] and [1, 4, 3, 2, 5] are correctly sorted in ascending order according to the tolerant comparator ($C(x,y) \le 0$ if x comes before y in the list)
but [1, 4, 2, 3, 5] is not, since C(4,2)=1

This tolerant comparator is reflexive and antisymmetric, but not transitive.

For example, C(1,2) = 0 and C(2,3) = 0, but C(1,3) = -1, which violates transitivity.
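A minimal Python sketch of the tolerant comparator may make the examples concrete. The names `tolerant_cmp`, `is_sorted_adjacent` and `is_sorted_all_pairs` are illustrative and not from any library. The two checkers encode two possible readings of "x comes before y in the list" (immediately before versus anywhere before); which one is intended is an assumption.

```python
from itertools import combinations

def tolerant_cmp(x, y, tol=1):
    """Return -1, 0 or +1; values within `tol` of each other compare as equal."""
    if x < y - tol:
        return -1
    if x > y + tol:
        return 1
    return 0

def is_sorted_adjacent(items, cmp=tolerant_cmp):
    """True if cmp(a, b) <= 0 for every pair of neighbouring elements."""
    return all(cmp(a, b) <= 0 for a, b in zip(items, items[1:]))

def is_sorted_all_pairs(items, cmp=tolerant_cmp):
    """True if cmp(a, b) <= 0 for every a that appears anywhere before b."""
    return all(cmp(a, b) <= 0 for a, b in combinations(items, 2))

# Transitivity fails: 1 ~ 2 and 2 ~ 3, but 1 < 3.
print(tolerant_cmp(1, 2), tolerant_cmp(2, 3), tolerant_cmp(1, 3))  # 0 0 -1

print(is_sorted_adjacent([1, 4, 3, 2, 5]))   # True
print(is_sorted_all_pairs([1, 4, 3, 2, 5]))  # False, because C(4, 2) = 1
print(is_sorted_adjacent([1, 4, 2, 3, 5]))   # False, because C(4, 2) = 1
```

With a transitive comparator the two checks always agree. Here they do not, so it matters which definition of "correctly sorted" is meant.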

Yet I cannot think of any sorting algorithm that would fail to produce a "correctly sorted" output when given this comparator and a random list.

Is transitivity therefore not required in this case? And is there a less strict version of transitivity that is required for the sorting to work?


HugoRune

8 Answers

13

You asked: Can we run a sorting algorithm, feeding it a non-transitive comparator?

The answer: Of course. You can run any algorithm with any input.

However, you know the rule: Garbage In, Garbage Out. If you run a sorting algorithm with a non-transitive comparator, you might get nonsense output. In particular, there is no guarantee that the output will be "sorted" according to your comparator. So, running a sorting algorithm with a non-transitive comparator is not likely to be useful in the way you were probably hoping for.

As a counterexample, running insertion sort on the input list $[3, 2, 1]$ using your comparator would leave the list unchanged -- yet the resulting output list is not in sorted order (according to your comparator).
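This counterexample can be reproduced with a short Python sketch. The `insertion_sort` below is a plain textbook version written for illustration (it shifts an element left only while its left neighbour compares strictly greater), and `tolerant_cmp` is a hypothetical name for the comparator from the question.

```python
def tolerant_cmp(x, y, tol=1):
    """Comparator from the question: values within `tol` compare as equal."""
    if x < y - tol:
        return -1
    if x > y + tol:
        return 1
    return 0

def insertion_sort(items, cmp):
    """Textbook insertion sort driven by a three-way comparator."""
    result = list(items)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements right until the slot for `current` is found.
        while j >= 0 and cmp(result[j], current) > 0:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result

out = insertion_sort([3, 2, 1], tolerant_cmp)
print(out)                           # [3, 2, 1]
print(tolerant_cmp(out[0], out[2]))  # 1, so 3 is placed before 1 although 3 > 1
```

Only neighbouring elements are ever compared, and each such comparison returns 0, so nothing moves.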

D.W.
5

Given a set of elements and a binary ordering relation, transitivity is required to totally order the elements. In fact, transitivity is even required to define a partial order on the elements. http://en.m.wikipedia.org/wiki/Total_order

You would need a much broader definition of what "sorted" means in order to sort elements without transitivity, and it is hard to make such a definition self-consistent. Another answer says "In particular, there is no guarantee that the output will be 'sorted' according to your comparator." But for a comparator that contains a cycle we can say something much stronger: you are guaranteed that the output is not sorted according to your comparator.

Say that you have a non-transitive comparator that tells you $a<b$, $b<c$, and $c<a$. Which element is the smallest one? No matter which one you choose, your comparator will tell you another element is smaller.
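As a small illustration, the following Python sketch enumerates every ordering of three elements under such a cyclic comparator and checks whether any of them is sorted. The names `cyclic_cmp`, `LESS_THAN` and `is_sorted` are made up for this example, and "sorted" is taken to mean that no element compares greater than any element after it.

```python
from itertools import combinations, permutations

# (x, y) in LESS_THAN means the comparator reports x < y: a < b, b < c, c < a.
LESS_THAN = {("a", "b"), ("b", "c"), ("c", "a")}

def cyclic_cmp(x, y):
    """Three-way comparator containing a cycle."""
    if x == y:
        return 0
    return -1 if (x, y) in LESS_THAN else 1

def is_sorted(items, cmp):
    """True if cmp(p, q) <= 0 for every p that appears before q."""
    return all(cmp(p, q) <= 0 for p, q in combinations(items, 2))

sorted_orders = [p for p in permutations("abc") if is_sorted(p, cyclic_cmp)]
print(sorted_orders)  # [] : none of the 6 orderings is sorted
```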

Joe
2

It sounds as though what you want is to arrange items such that all discernible rankings are correct, but items which are close might be considered "indistinguishable". It is possible to design sort algorithms which will work with such comparisons, but unless there are limits to how many comparisons may report that things are indistinguishable, there is no way to avoid having them require N(N-1)/2 comparisons. To understand why, pick some number N and any sorting algorithm that does fewer than N(N-1)/2 comparisons. Then populate a list L[0..N-1], setting each element L[I] to I/N, and "sort" it using your comparator (the minimum value will be 0 and the maximum (N-1)/N, so the largest difference will be (N-1)/N, which is less than 1, and every comparison reports "indistinguishable").

Because there are N(N-1)/2 pairs of items that could be compared, and the sort didn't do that many comparisons, there must be some pair of items that was not directly compared against each other. Replace whichever one of these ended up being sorted first by 1, and the other with -1/N, revert all items to their initial position, and repeat the sorting operation. Every single comparison operation will yield zero, just as it did the first time, so the same comparisons will be performed and items will end up in the same sequence. For the list to be correctly sorted, the "1" would have to sort after the "-1/N" (since they differ by more than one) but since the sorting algorithm would never compare those two items directly against each other, it would have no way of knowing that.
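The argument can be tried out against a real sort routine. The sketch below uses Python's built-in `sorted` with `functools.cmp_to_key` as the "sorting algorithm that does fewer than N(N-1)/2 comparisons"; `sort_recording` and `tolerant_cmp` are hypothetical helper names, and N = 8 is chosen only so that every I/N is exact in binary floating point.

```python
from functools import cmp_to_key
from itertools import combinations

def tolerant_cmp(x, y, tol=1):
    """Comparator from the question: values within `tol` compare as equal."""
    if x < y - tol:
        return -1
    if x > y + tol:
        return 1
    return 0

def sort_recording(values):
    """Sort positions 0..n-1 by value; also return which pairs were compared."""
    compared = set()
    def cmp(i, j):
        compared.add(frozenset((i, j)))
        return tolerant_cmp(values[i], values[j])
    order = sorted(range(len(values)), key=cmp_to_key(cmp))
    return order, compared

N = 8
values = [i / N for i in range(N)]
order, compared = sort_recording(values)

# Pick a pair that was never compared directly; `first` precedes `second`
# in the output order.
first, second = next((a, b) for a, b in combinations(order, 2)
                     if frozenset((a, b)) not in compared)

values[first] = 1.0       # the item that ended up earlier becomes 1
values[second] = -1 / N   # the item that ended up later becomes -1/N
order2, _ = sort_recording(values)
result = [values[i] for i in order2]

print(order2 == order)                             # same comparisons, same order
print(result.index(1.0) < result.index(-1 / N))    # 1 is placed before -1/N
```

Every comparison in the second run still returns 0, so the routine repeats exactly the same steps and leaves 1 ahead of -1/N, even though the two differ by more than one.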

supercat
1

I second the answers given so far (equality and comparison operators must fulfill the mathematical requirements, and not doing so can cause nasty bugs that are hard to spot). But I want to mention a method that sometimes solves the problem that people try to solve using an invalid comparator like the one mentioned in the OP. Assume you have a list of floating-point numbers and you consider them "sufficiently equal" when they're closer than 1e-4. Then define the equality operator and the comparator based on their rounded values:

rounded(x) = 1e-4 * Round(1e4 * x) // where Round rounds to integers
x1 :== x2 iff rounded(x1) == rounded(x2)
x1 :< x2 iff rounded(x1) < rounded(x2)
..

Such an equality operator is valid for grouping methods, and the corresponding comparator can be used for sorting. The results are reasonable.
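In Python, the rounding idea can be sketched with a sort key instead of a comparator. `rounded_key` is a hypothetical helper name, and the sample values are made up for illustration. Because every value is mapped to an integer bucket and integers are totally ordered, the resulting order is transitive.

```python
def rounded_key(x):
    """Map x to its bucket on a 1e-4 grid (an integer, so ordering is transitive)."""
    return round(1e4 * x)

values = [0.30004, 0.1, 0.30001, 0.10002, 0.2]

# sorted() is stable, so values in the same bucket keep their input order.
print(sorted(values, key=rounded_key))
# [0.1, 0.10002, 0.2, 0.30004, 0.30001]

# Equality based on the same key:
print(rounded_key(0.30004) == rounded_key(0.30001))  # True
```

One consequence of bucketing is that two values closer than 1e-4 can still land in different buckets when they sit on opposite sides of a rounding boundary, so this is not the same relation as "differs by less than 1e-4".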

When the numbers have a wide range of magnitudes you can also format each number into a string with the usual formatting options (exponential or smart with 4 significant digits or so), and define equality based on those strings:

str(x) := sprintf("%.4g", x) // 4 significant digits
x1 :== x2 iff str(x1) == str(x2)
x1 :< x2 iff str(x1) != str(x2) and x1 < x2
..

Please forgive me if these snippets are not valid pseudocode, I hope they're good enough to explain the idea.

user829755
1

Define “correctly sorted”. Here’s an attempt: An array is k-sorted if $a_i \le a_j$ whenever $i \le j \le i+k$. If $\le$ is transitive, then any 1-sorted array is (n-1)-sorted.

With the definition of $\le$ in the question, assume the array contains one element equal to zero, one element equal to 1+epsilon, and all other elements between epsilon and 1. Then there is exactly one pair of elements that don’t compare equal. If we don’t compare these two elements, then all comparisons result in “equal”, so an algorithm that doesn’t take O(n^2) steps might produce a result that is not even 2-sorted.
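The k-sorted definition translates directly into a small check. This is a sketch under the assumption that $a_i \le a_j$ means "the comparator does not report $a_i$ as greater than $a_j$"; `is_k_sorted` and `tolerant_cmp` are illustrative names.

```python
def tolerant_cmp(x, y, tol=1):
    """Comparator from the question: values within `tol` compare as equal."""
    if x < y - tol:
        return -1
    if x > y + tol:
        return 1
    return 0

def is_k_sorted(items, k, cmp=tolerant_cmp):
    """True if cmp(items[i], items[j]) <= 0 whenever i < j <= i + k."""
    n = len(items)
    return all(cmp(items[i], items[j]) <= 0
               for i in range(n)
               for j in range(i + 1, min(i + k, n - 1) + 1))

print(is_k_sorted([3, 2, 1], 1))  # True: each neighbouring pair compares equal
print(is_k_sorted([3, 2, 1], 2))  # False: C(3, 1) = 1
```

With a transitive comparator the two calls could never disagree; with the tolerant comparator a 1-sorted array need not be 2-sorted.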

gnasher729
1

It is very much possible to sort using some non-transitive comparers, but it depends on what non-transitivity means for you. For the situation A > B && B > C there are 3 possible scenarios:

  • A > C (that's transitivity)
  • A >= C (soft non transitivity)
  • A and C in any relation (hard non transitivity)

If you have a transitive comparer, you can use fast sorting algorithms that run in n*log(n). If you have a soft non-transitive comparer, you can use slower n^2 sorting algorithms. For example, bubble sort:

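using System.Collections.Generic;
using System.Linq;

// Bubble sort that also swaps neighbours the comparer reports as equal.
// Intended to be a member of some class.
private T[] BubbleSort<T>(IEnumerable<T> values, IComparer<T> comparer) {
    // Work on a copy so the input sequence is left untouched.
    T[] result = values.ToArray();
    // After each outer pass, the element at index i is in its final place.
    for(int i = result.Length - 1; i >= 0; i--) {
        for(int j = 0; j < i; j++) {
            int c = comparer.Compare(result[j], result[j + 1]);
            // Swap on "greater" and also on "equal" (c == 0).
            if(c >= 0) {
                T tmp = result[j];
                result[j] = result[j + 1];
                result[j + 1] = tmp;
            }
        }
    }
    return result;
}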

For hard nontransitivity, you cannot use any sorting algorithm and expect some reasonable outcomes.

The reason why you need an n^2 algorithm is that it actually compares every element with every other one. An n*log(n) algorithm skips comparisons by taking advantage of transitivity.

Note in the above algo, we swap the elements even for c == 0, so even when they are equal regarding the soft nontransitive comparer. That way all elements eventually get to the ones they can compare to.

Kirleck
1

Fill an array of n elements with the values n, n-1, n-2, ..., 2, 1. Then try to sort using the "straight insertion" algorithm. You'll find that each element is considered equal to the element just before it, and therefore isn't moved. The result of the "sort" is the same array.

gnasher729
-1

In general, sorting algorithms rely on the transitive property of the comparison function to correctly order elements. Transitivity ensures that if A is less than B and B is less than C, then A must be less than C. This property guarantees that the sorting algorithm can consistently compare and order elements.

However, in your case, the tolerant comparator function violates transitivity. Despite this violation, it is still possible to use a sorting algorithm with this non-transitive comparison function. One way to achieve this is by modifying the sorting algorithm to handle the non-transitive nature of the comparator.

One approach is to perform multiple passes through the array, each time considering a different subset of the elements that are within the tolerance range. The idea is to divide the array into smaller parts and sort them separately. Within each part, the transitivity violation is still present, but we only compare elements within that specific part. By doing so, we can ensure that the sorting algorithm produces a "correctly sorted" output based on the given comparator.

For example, let's consider the array [1, 4, 2, 3, 5] with the tolerant comparator. We can perform two passes. In the first pass, we consider elements within a tolerance range of 1. This would result in the following subsets: [1, 2], [4, 3], [5]. Within each subset, the tolerance is satisfied, and we can sort them independently. After sorting each subset, we combine them back together, resulting in [1, 2, 3, 4, 5], which is correctly sorted based on the given comparator.

So, to summarize, while the tolerant comparator violates transitivity, it is still possible to use a sorting algorithm by modifying it to handle the non-transitive nature. By dividing the array into subsets and sorting them separately, we can ensure a correctly sorted output based on the given comparator.