I have an algorithm that determines whether a sorted array contains two elements whose sum is zero.

ZeroSumPair(A[1..n]) // A[1..n] is sorted
    l <- 1, r <- n
    while(l < r)
        while(l < r and A[l] + A[r] > 0)
            r--
        if(A[l] + A[r] = 0)
            return true
        l++
    return false
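
For reference, here is a minimal runnable Python sketch of the pseudocode above. It rests on two assumptions that are not spelled out in the pseudocode: the inner loop condition is read as a conjunction (`and`), and the zero check additionally requires `l < r` so that a single element is never paired with itself. The function name `zero_sum_pair` is only illustrative.

```python
def zero_sum_pair(a):
    """Return True if the sorted list `a` has two distinct positions whose values sum to zero."""
    l, r = 0, len(a) - 1  # 0-based indices, unlike the 1-based pseudocode
    while l < r:
        # Move the right pointer left while the current sum is too large.
        while l < r and a[l] + a[r] > 0:
            r -= 1
        # Assumption: require l < r so one element is not counted twice.
        if l < r and a[l] + a[r] == 0:
            return True
        l += 1
    return False


if __name__ == "__main__":
    print(zero_sum_pair([-3, -1, 2, 3]))
    print(zero_sum_pair([1, 2, 3]))
```

Each pointer only ever moves in one direction, which is the intuition behind the linear running time.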

Intuitively I know that this algorithm is $O(n)$, but how do I derive that with a proof using summations, as in the CLRS book?

I've also seen "How to develop an $O(N)$ algorithm solve the 2-sum problem?", but I didn't see any formal proof there.

kuskmen

1 Answer

Consider this slightly modified algorithm where I've added an "operation counter" t. This will be incremented every time we do a comparison or assignment.

1.  ZeroSumPair(A[1..n]) // A[1..n] is sorted
2.    l <- 1, r <- n
3.    t <- 3             // 2 for the first two assignments and 1 for the initial outer while check
4.    while(l < r)
5.      t++              // 1 for the initial inner while check
6.      while(l < r and A[l] + A[r] > 0)
7.        t++            // 1 for decrementing r
8.        r--
9.        t++            // 1 for the following inner while check
10.     t++              // 1 for the if comparison
11.     if(A[l] + A[r] = 0)
12.       return true
13.     t++              // 1 for incrementing l
14.     l++
15.     t++              // 1 for the following outer while check
16.   return false

This is a bit verbose, but it will work. Now we simply need to prove that at termination we have $t = O(n)$. We can do this inductively with a loop invariant.

Let's use the following loop invariant for the loop on line 4.

$$t = 3 + 4(l-1) + 2(n-r)$$

Base Case

Initially $t = 3$, $l = 1$, and $r = n$. Thus we have:

$$3 = 3 + 4(1 - 1) + 2(n - n) = 3$$

Inductive Case

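Let $t'$, $l'$, and $r'$ be the values of $t$, $l$, and $r$ at the end of our previous iteration, and assume the invariant holds for them. During the current iteration, $t$ is incremented four times outside the inner loop (lines 5, 10, 13, and 15) and twice per inner-loop iteration (lines 7 and 9). So if the inner loop runs $k \geq 0$ times, at the end of our current iteration we have $l = l' + 1$, $r = r' - k$, and $t = t' + 4 + 2k$. Thus we have: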

$$\begin{align*} t & = t' + 4 + 2k\\ & = 3 + 4(l' - 1) + 2(n - r') + 4 + 2k\\ & = 3 + 4(l - 2) + 2(n - (r + k)) + 4 + 2k\\ & = 3 + 4(l - 1) - 4 + 2(n - r) - 2k + 4 + 2k\\ & = 3 + 4(l - 1) + 2(n - r) & \square \end{align*}$$

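Thus, we can conclude the loop invariant holds. When the loop on line 4 exits without finding a pair (the worst case), we have $l \leq n$ and $r \geq l - 1$: the inner loop stops decrementing $r$ once $l = r$, and $l$ is then incremented once more. For example, with $A = [1, 2]$ the loop ends with $l = 2$, $r = 1$, and $t = 9$. We then have:

$$\begin{align*} t & = 3 + 4(l - 1) + 2(n - r)\\ & \leq 3 + 4(l - 1) + 2(n - l + 1)\\ & = 2(n + l) + 1\\ & \leq 2(n + n) + 1\\ & = 4n + 1\\ & = O(n) & \square \end{align*}$$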
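
The invariant can also be checked empirically. The following is a minimal Python sketch of the instrumented pseudocode, assuming the inner loop condition is a conjunction (`and`). The function name `zero_sum_pair_counted` is hypothetical, and the `assert` is placed where the outer loop condition would be evaluated.

```python
def zero_sum_pair_counted(a):
    """Return (found, t), where t mirrors the operation counter in the pseudocode."""
    n = len(a)
    l, r = 1, n  # 1-based indices, as in the pseudocode
    t = 3        # two assignments plus the initial outer while check
    while True:
        # Loop invariant, checked every time the outer condition is evaluated.
        assert t == 3 + 4 * (l - 1) + 2 * (n - r)
        if not (l < r):
            break
        t += 1  # initial inner while check
        # Assumption: the inner condition is a conjunction.
        while l < r and a[l - 1] + a[r - 1] > 0:
            t += 1  # decrementing r
            r -= 1
            t += 1  # following inner while check
        t += 1  # if comparison
        if a[l - 1] + a[r - 1] == 0:
            return True, t
        t += 1  # incrementing l
        l += 1
        t += 1  # following outer while check
    return False, t


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # All-positive input, so no zero-sum pair exists and the loop runs to the end.
        found, t = zero_sum_pair_counted(list(range(1, n + 1)))
        print(n, found, t)
```

If the assertion never fires on the inputs you try, that is consistent with the inductive proof, though of course it does not replace it.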

Thus it is linear. There might be an easier way to do this, but this way makes sense to me pretty clearly.
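
Since the question asks for a summation-style argument, the same count can also be sketched that way. The symbols $m$ (the number of completed outer iterations) and $k_i$ (the number of inner-loop iterations during outer iteration $i$) are names introduced here for illustration, and the sketch assumes $n \geq 1$ and that the inner loop stops once $l = r$:

$$\begin{align*} t & = 3 + \sum_{i=1}^{m} (4 + 2k_i)\\ & = 3 + 4m + 2\sum_{i=1}^{m} k_i\\ & \leq 3 + 4(n - 1) + 2(n - 1)\\ & = 6n - 3\\ & = O(n) \end{align*}$$

Here $m \leq n - 1$ because $l$ starts at $1$, grows by one per outer iteration, and never exceeds $n$. Likewise $\sum_{i} k_i \leq n - 1$ because $r$ starts at $n$, drops by one per inner iteration, and never falls below $1$. The constant is looser than what the invariant gives, but the bound is still linear, and an early return only makes $t$ smaller.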

ryan