10

I asked a (seed) question about sums of Landau terms before, trying to gauge the dangers of abusing asymptotics notation in arithmetic, with mixed success.

Now, over here our recurrence guru JeffE does essentially this:

$\qquad \displaystyle \sum_{i=1}^n \Theta\left(\frac{1}{i}\right) = \Theta(H_n)$

While the end result is correct, I think this is wrong. Why? If we write out the constants whose existence is implied (considering only the upper bound), we have

$\qquad \displaystyle \sum_{i=1}^n c_i \cdot \frac{1}{i} \leq c \cdot H_n$.

Now how do we compute $c$ from $c_1, \dots, c_n$? The answer is, I believe, that we cannot: $c$ has to bound the sum for all $n$, but we get more $c_i$ as $n$ grows. We know nothing about them; $c_i$ may very well depend on $i$, so we cannot assume a common bound: a finite $c$ may not exist.
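To make this concrete, here is a small Python sketch (purely illustrative; the choice $c_i = i$ is a hypothetical worst case, not anything from JeffE's answer). If per-term constants may grow with $i$, each summand $c_i \cdot \frac{1}{i}$ is still "a constant times $1/i$", yet the sum leaves $\Theta(H_n)$ entirely:

```python
def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / i for i in range(1, n + 1))

# Hypothetical per-term constants c_i = i: each term c_i * (1/i) is 1
# (up to rounding), so the sum is essentially n, not Theta(H_n).
for n in (10, 100, 1000):
    s = sum(i * (1.0 / i) for i in range(1, n + 1))
    print(n, s / harmonic(n))  # this ratio grows without bound: no finite c works
```

The diverging ratio is exactly the "no finite $c$ may exist" phenomenon described above.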

In addition, there is the subtle issue of which variable goes to infinity on the left-hand side -- $i$ or $n$? Both? If $n$ (for the sake of compatibility), what is the meaning of $\Theta(1/i)$, knowing that $1 \leq i \leq n$? Does it not just mean $\Theta(1)$? If so, we cannot bound the sum better than $\Theta(n)$.

So, where does that leave us? Is it a blatant mistake? A subtle one? Or is it just the usual abuse of notation, and we should not look at $=$ signs like this one out of context? Can we formulate a (rigorously) correct rule to evaluate (certain) sums of Landau terms?

I think that the main question is: what is $i$? If we consider it constant (as it is inside the scope of the sum) we can easily build counterexamples. If it is not constant, I have no idea how to read it.

Raphael
  • 73,212
  • 30
  • 182
  • 400

3 Answers

5

Looks right to me in the following convention:

$S_n = \sum_{k=1}^{n} \Theta(1/k)$ is convenient notation for

There is an $f(x) \in \Theta(1/x)$ (as $x \to \infty$) such that

$S_n = \sum_{k=1}^{n} f(k)$.

Thus the $c_i$ (or, with the notation in this answer, $c_k$) that you get are not really dependent on $k$.

Under this interpretation, it is indeed true that $S_n = \Theta(H_n)$.

In fact, in Jeff's answer, he shows that $T(k+1) = f(k) + T(k)$ where $f \in \Theta(1/k)$, so it is consistent with the above interpretation.

The confusion seems to be arising from mentally "unrolling" the $\sum$ and presuming different functions for each occurrence of $\Theta$...
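A quick numeric sketch of this convention (the particular choice $f(x) = 3/x$ is arbitrary, just some fixed member of $\Theta(1/x)$; any other fixed member behaves the same way): because *one* function supplies all the terms, the hidden constant is shared across the whole sum.

```python
def harmonic(n):
    return sum(1.0 / k for k in range(1, n + 1))

# ONE fixed f from Theta(1/x) is used for ALL terms of the sum.
def f(x):
    return 3.0 / x

for n in (10, 100, 1000):
    S = sum(f(k) for k in range(1, n + 1))
    print(n, S / harmonic(n))  # the ratio is the single shared constant 3
```

With a single $f$ there is nothing to "unroll" into different constants, so $S_n = \Theta(H_n)$ falls out immediately.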

Aryabhata
  • 6,291
  • 2
  • 36
  • 47
1

I think I nailed the problem down. In essence: using Landau terms decouples the variable of the summand function from the sum's running variable. We still (want to) read them as identical, though, therefore the confusion.

To develop it formally, what does

$\qquad \displaystyle S_n \in \sum_{i=1}^n \Theta(f(i)) \qquad \qquad\qquad (1)$

really mean? Now I assume that these $\Theta$ send $i$ -- not $n$ -- to infinity; if we let $n \to \infty$, every such sum evaluates to $\Theta(n)$ (if the summands are independent of $n$ and therefore constant), which is clearly wrong. Here is a first giveaway that we are doing something crude: $i$ is bound (and constant) inside the sum, but we still let it go to infinity?

Translating $(1)$ (for the upper bound, the lower bound works similarly), we get

$\qquad \displaystyle \exists f_1, \dots, f_n \in \Theta(f).\ S_n \leq \sum_{i=1}^n f_i(i)$

Now it is clear that the sum-$i$ and parameter-$i$ are decoupled: we can easily define the $f_i$ so that they use $i$ as a constant. In the example from the question, we can define $f_i(j) = i \cdot \frac{1}{j} \in \Theta(1/j)$ and have

$\qquad \displaystyle \sum_{i=1}^n f_i(i) "=" \sum_{i=1}^n \Theta(1/j) = \sum_{i=1}^n \Theta(1/i)$

but the original sum clearly does not evaluate to something in $\Theta(H_n) = \Theta(\log n)$. Now exchanging $j$ for $i$ -- which is only a renaming -- in the $\Theta$ may feel strange because $i$ is not independent of $n$ and the sum, but if we object to that now, we should never have used $i$ inside the $\Theta$ in the first place (as that holds the same strangeness).
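For the skeptical, the family $f_i(j) = i \cdot \frac{1}{j}$ can be written down directly; a small Python sketch (purely illustrative) showing that every $f_i$ is a legitimate constant multiple of $1/j$, yet the diagonal sum $\sum_i f_i(i)$ is linear in $n$:

```python
# The family f_i(j) = i * (1/j): for each FIXED i, f_i is a constant
# multiple of 1/j, hence in Theta(1/j) -- but on the diagonal, f_i(i) = 1.
fs = [lambda j, i=i: i * (1.0 / j) for i in range(1, 1001)]

diag_sum = sum(fs[i - 1](i) for i in range(1, 1001))
print(diag_sum)  # ~1000.0: the diagonal sum is Theta(n), not Theta(H_n)
```

(The `i=i` default argument pins the "constant" $i$ inside each closure, mirroring how each $f_i$ uses its index as a constant.)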

Note that we did not even exploit that the $f_i$ may also depend on $n$.

To conclude, the proposed identity is bogus. We can, of course, agree on conventions for reading such sums as abbreviations of rigorous calculations. However, such conventions will be incompatible with the definition of Landau terms (together with the usual abuse of them), impossible to understand correctly without context, and misleading (for beginners, at least) -- but that is ultimately a matter of taste (and ruthlessness?).

It occurred to me that we can also write exactly what we mean and still make use of the convenience of Landau terms. We know that all summands come from one common function, implying that the asymptotic bounds use the same constants. This is lost when we put the $\Theta$ into the sum. So let us not put it in there and write

$\qquad \displaystyle \sum_{i=1}^n \frac{2i - 1}{i(i+1)} \in \Theta\left(\sum_{i=1}^n \frac{1}{i}\right) = \Theta(H_n)$

instead. Putting the $\Theta$ outside of the sum results in

  • a mathematically correct statement and
  • a simple term inside the $\Theta$ we can easily deal with (which is what we want here, right?).

So it seems to me that this is both a correct and a useful way of writing the matter down, and should therefore be preferred over using Landau symbols inside the sum when we mean them outside of it.
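As a sanity check on the example sum, a short Python sketch. Unless I am mistaken, partial fractions give $\frac{2i-1}{i(i+1)} = -\frac{1}{i} + \frac{3}{i+1}$, so the sum telescopes to the closed form $2H_n - 3 + \frac{3}{n+1}$, which is squarely $\Theta(H_n)$:

```python
def harmonic(n):
    return sum(1.0 / i for i in range(1, n + 1))

def s(n):
    return sum((2 * i - 1) / (i * (i + 1)) for i in range(1, n + 1))

# Partial fractions: (2i-1)/(i(i+1)) = -1/i + 3/(i+1),
# so s(n) telescopes to 2*H_n - 3 + 3/(n+1).
for n in (10, 100, 10000):
    print(n, s(n), 2 * harmonic(n) - 3 + 3 / (n + 1))
```

The two printed columns agree, confirming that the $\Theta$-outside-the-sum statement is not just correct but easy to verify.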

Raphael
  • 73,212
  • 30
  • 182
  • 400
-1

If each $c_i$ is a constant, then there is some $c_{max}$ such that $\forall c_i: c_i\leq c_{max}$. So clearly $$\sum c_i f(i) \leq \sum c_{max} f(i) = c_{max} \sum f(i) = O\left(\sum f(i)\right)$$ Same idea for little o.
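A quick numeric illustration of this step (the bound $c_i \in [0, 4]$ and the choice $f(i) = 1/i$ are arbitrary, just for the sketch): as long as the $c_i$ are genuinely bounded, factoring out $c_{max}$ works.

```python
import random

random.seed(0)
n = 1000
c = [random.uniform(0.0, 4.0) for _ in range(n)]  # genuinely bounded "constants"
c_max = 4.0

lhs = sum(c[i - 1] * (1.0 / i) for i in range(1, n + 1))
rhs = c_max * sum(1.0 / i for i in range(1, n + 1))
print(lhs <= rhs)  # True: bounded c_i keep the sum in O(sum of f(i))
```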

I think the problem here is that $1/i \neq \Theta(1)$. It's $o(1/n)$ (since there is no $\epsilon$ such that $\forall i: 1/i > \epsilon$), so the overall sum will be $n \cdot o(1/n) = o(1)$. And each term is $O(1)$, meaning the overall sum is $O(n)$. So no tight bounds can be found from this method.

I think your questions are:

  1. Is bounding $\sum_i^n f(i)$ by taking the little-o of each term and the big-O of each term, then multiplying by $n$, acceptable? (Answer: Yes)
  2. Is there a better method? (Answer: Not that I know of.)

Hopefully someone else can answer #2 more clearly.

EDIT: Looking over your question again, I think you are asking

$\sum_i^n \Theta(f(n)) = \Theta(nf(n))$ ?

To which the answer is yes. In this case though, each term is not $\Theta$ of anything, so that approach falls apart.

EDIT 2: You say "consider $c_i=i$, then there is no $c_{max}$". Unequivocally true. If you say that $c_i$ is a non-constant function of $i$, then it is, by definition, non-constant.

Note that if you define it this way, then $c_i i$ is not $\Theta(i)$, it's $\Theta(i^2)$. Indeed, if you define "constant" to mean "any function of $i$", then any two functions of $i$ differ by a "constant"!

Perhaps this is an easier way to think of it: we have the sequence $1,\frac{1}{2},\dots,\frac{1}{n}$. What's the smallest term in this sequence? Well, it will depend on $n$. So we can't consider the terms as constant.

(Computer scientists are often more familiar with big-O, so it might be more intuitive to ask if $1,\dots,n$ has a constant largest term.)

To provide your proof: let $f(i_{min})$ be the smallest value of $f(i)$ in the range $1,\dots,n$. Then $$\sum_i^n f(i) \geq \sum_i^n f(i_{min}) = n f(i_{min}) = n \cdot o(f(n))$$

An analogous proof can be made for the upper bound.

Lastly, you write that $H_n = o(n)$ and as proof give that $H_n = \Theta(\log n)$. This is in fact a counter-proof: if $H_n$ is "bigger" than $n$, then it can't be "smaller" than $\log n$, which is what's required for it to be $\Theta(\log n)$. So it can't be $o(n)$.

Xodarap
  • 1,538
  • 1
  • 10
  • 17