
I totally understand what big $O$ notation means. My issue is when we say $T(n)=O(f(n))$, where $T(n)$ is the running time of an algorithm on input of size $n$.

I understand its semantics. But $T(n)$ and $O(f(n))$ are two different things.

$T(n)$ is an exact number, but $O(f(n))$ is not a function that spits out a number, so technically we can't say $T(n)$ equals $O(f(n))$. If one asks you what the value of $O(f(n))$ is, what would be your answer? There is no answer.

Raphael
doubleE

10 Answers


Strictly speaking, $O(f(n))$ is a set of functions. So the value of $O(f(n))$ is simply the set of all functions that grow asymptotically not faster than $f(n)$. The notation $T(n) = O(f(n))$ is just a conventional way to write that $T(n) \in O(f(n))$.

Note that this also clarifies some caveats of the $O$ notation. For example, we write that $(1/2) n^2 + n = O(n^2)$, but we never write that $O(n^2)=(1/2)n^2 + n$. To quote Donald Knuth (The Art of Computer Programming, 1.2.11.1):

The most important consideration is the idea of one-way equalities. [...] If $\alpha(n)$ and $\beta(n)$ are formulas that involve the $O$-notation, then the notation $\alpha(n)=\beta(n)$ means that the set of functions denoted by $\alpha(n)$ is contained in the set denoted by $\beta(n)$.
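On this set reading, a membership claim can at least be sanity-checked numerically. The following is a minimal sketch of my own (the candidate witnesses $c$ and $n_0$ are assumptions, and a finite check is of course not a proof):

```python
# Numeric sanity check (not a proof): does the candidate pair (c, n0)
# witness T(n) <= c * f(n) on a finite range?
def witnesses_big_o(T, f, c, n0, n_max=10_000):
    """True iff T(n) <= c * f(n) for every integer n in [n0, n_max]."""
    return all(T(n) <= c * f(n) for n in range(n0, n_max + 1))

# Example: T(n) = n^2/2 + n is in O(n^2), witnessed by c = 1, n0 = 2,
# since n^2/2 + n <= n^2 exactly when n >= 2.
T = lambda n: n * n / 2 + n
f = lambda n: n * n
print(witnesses_big_o(T, f, c=1, n0=2))  # True
```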

Vincenzo

$O$ is a function $$\begin{align} O : (\mathbb{N}\to \mathbb{R}) &\to \mathbf{P}(\mathbb{N}\to \mathbb{R}) \\ f &\mapsto O(f) \end{align}$$ i.e. it accepts a function $f$ and yields the set of functions that share the asymptotic bound of (at most) $f$. Strictly speaking, the correct notation is thus $$ (n \mapsto T(n)) \in O(n\mapsto f(n)) $$ or, for short, $$ T \in O(f), $$ but it is customary in maths, science and CS to just use a variable somewhere in the expression to denote that you are considering functions of the argument $n$ on both sides. So $T(n) \in O(f(n))$ is quite fine as well. $T(n) = O(f(n))$ is pretty much wrong, as you suspected, but it is so commonly used that you should definitely keep in mind what people mean when they write it.

I would advise against ever writing $T(n) = O(f(n))$, but opinions differ.

leftaroundabout

Formally speaking, $O(f(n))$ is the set of functions $g$ such that $g(n)\leq k\,f(n)$ for some constant $k$ and all large enough $n$. Thus, the most pedantically accurate way of writing it would be $T(n)\in O(f(n))$. However, using $=$ instead of $\in$ is completely standard, and $T(n)=O(f(n))$ just means $T(n)\in O(f(n))$. This is essentially never ambiguous because we almost never manipulate the set $O(f(n))$.

In a sense, using equality makes $O(f(n))$ mean "some function $g$ such that $g(n)\leq k\,f(n)$ for some constant $k$ and all large enough $n$", and this means that you can write things like $f(n) = 3n + O(\log n)$. Note that this is much more precise than, e.g., $f(n)=\Theta(n)$ or $f(n)=O(n+\log n)$.
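To make the $f(n) = 3n + O(\log n)$ reading concrete, here is a small numeric sketch (the specific $f$ and the constant 12 are assumptions of mine, and checking a finite range is not a proof):

```python
import math

# A hypothetical f(n) whose lower-order terms we absorb into O(log n).
f = lambda n: 3 * n + 7 * math.log(n) + 5

# In f(n) = 3n + O(log n), the O(log n) names the remainder g(n) = f(n) - 3n.
g = lambda n: f(n) - 3 * n

# Finite sanity check (not a proof): g(n) <= 12 * log(n) for n in [3, 10^5].
print(all(g(n) <= 12 * math.log(n) for n in range(3, 100_001)))  # True
```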

David Richerby

Prologue: The big $O$ notation is a classic example of the power and ambiguity of the kind of notation, embedded in language, that the human mind loves. No matter how much confusion it has caused, it remains the notation of choice for conveying ideas that we can identify and agree on efficiently.

I totally understand what big $O$ notation means. My issue is when we say $T(n)=O(f(n))$, where $T(n)$ is the running time of an algorithm on input of size $n$.

Sorry, but you do not have an issue if you understand the meaning of big $O$ notation.

I understand its semantics. But $T(n)$ and $O(f(n))$ are two different things. $T(n)$ is an exact number, but $O(f(n))$ is not a function that spits out a number, so technically we can't say $T(n)$ equals $O(f(n))$. If one asks you what the value of $O(f(n))$ is, what would be your answer? There is no answer.

What is important is the semantics. What is important is (how) people can agree easily on (one of) its precise interpretations that describes the asymptotic behavior of the time or space complexity we are interested in. The default precise interpretation/definition of $T(n)=O(f(n))$ is, as translated from Wikipedia,

$T$ is a real or complex valued function and $f$ is a real valued function, both defined on some unbounded subset of the positive real numbers, such that $f(n)$ is strictly positive for all large enough values of $n$. Then $T(n)=O(f(n))$ means that, for all sufficiently large values of $n$, the absolute value of $T(n)$ is at most a positive constant multiple of $f(n)$. That is, there exist a positive real number $M$ and a real number $n_0$ such that

$$|T(n)|\leq M\,f(n)\quad\text{for all } n\geq n_{0}.$$

Please note this interpretation is considered the definition. All other interpretations and understandings, which may help you greatly in various ways, are secondary and corollary. Everyone (well, at least every answerer here) agrees to this interpretation/definition/semantics. As long as you can apply this interpretation, you are probably good most of the time. Relax and be comfortable. You do not want to think too much, just as you do not think too much about some of the irregularities of English or French or most natural languages. Just use the notation by that definition.
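The definition can be checked against concrete witnesses. In this sketch (my own example, not part of the definition), $T(n)=3n^2+5n$, $f(n)=n^2$, and the witnesses are $M=4$ and $n_0=5$, since $3n^2+5n \le 4n^2$ exactly when $n \ge 5$; a finite range check only illustrates this, it does not prove it:

```python
# Witnesses for the definition: T(n) = 3n^2 + 5n, f(n) = n^2,
# with M = 4 and n0 = 5, since 3n^2 + 5n <= 4n^2 exactly when n >= 5.
T = lambda n: 3 * n * n + 5 * n
f = lambda n: n * n
M, n0 = 4, 5
print(all(abs(T(n)) <= M * f(n) for n in range(n0, 10_000)))  # True
```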

$T(n)$ is an exact number, but $O(f(n))$ is not a function that spits out a number, so technically we can't say $T(n)$ equals $O(f(n))$. If one asks you what the value of $O(f(n))$ is, what would be your answer? There is no answer.

Indeed, there could be no answer, since the question is ill-posed. $T(n)$ does not mean an exact number. It is meant to stand for a function whose name is $T$ and whose formal parameter is $n$ (which is, in a sense, bound to the $n$ in $f(n)$). It is just as correct, and even more so, to write $T=O(f)$. If $T$ is the function that maps $n$ to $n^2$ and $f$ is the function that maps $n$ to $n^3$, it is also conventional to write $T(n)=O(n^3)$ or $n^2=O(n^3)$. Please also note that the definition does not say whether $O$ is a function. It does not say the left-hand side is supposed to be equal to the right-hand side at all! You are right to suspect that the equal sign here does not mean equality in its ordinary sense, where you could swap the two sides and the relation would be backed by an equivalence relation. (Another even more famous abuse of the equal sign is its use for assignment in most programming languages, instead of the more cumbersome := found in some languages.)

If we were only concerned with that one equality, $T(n)=O(f(n))$, this answer would be done. (I am starting to abuse language as well: it is not an equality, and yet it is one, since there is an equal sign in the notation and it can be construed as some kind of equality.)

However, the question actually goes on. What does it mean by, for example, $f(n)=3n+O(\log n)$? This equality is not covered by the definition above. We would like to introduce another convention, the placeholder convention. Here is the full statement of placeholder convention as stated in Wikipedia.

In more complicated usage, $O(\cdots)$ can appear in different places in an equation, even several times on each side. For example, the following are true for $n\to \infty$.

$(n+1)^{2}=n^{2}+O(n)$
$(n+O(n^{1/2}))(n+O(\log n))^{2}=n^{3}+O(n^{5/2})$
$n^{O(1)}=O(e^{n})$

The meaning of such statements is as follows: for any functions which satisfy each $O(\cdots)$ on the left side, there are some functions satisfying each $O(\cdots)$ on the right side, such that substituting all these functions into the equation makes the two sides equal. For example, the third equation above means: "For any function $f(n) = O(1)$, there is some function $g(n) = O(e^n)$ such that $n^{f(n)} = g(n)$."
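The first example can be unfolded the same way: $(n+1)^2 = n^2 + O(n)$ says the remainder $(n+1)^2 - n^2 = 2n+1$ satisfies some $O(n)$ bound, e.g. $3n$ for $n \ge 1$. A quick numeric sketch of mine (finite check only, not a proof):

```python
# (n+1)^2 = n^2 + O(n): the remainder is 2n + 1, bounded by 3n for n >= 1.
remainder = lambda n: (n + 1) ** 2 - n ** 2  # = 2n + 1
print(all(remainder(n) <= 3 * n for n in range(1, 10_000)))  # True
```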

You may want to check here for another example of placeholder convention in action.

You might have noticed by now that I have not used the set-theoretic explanation of the big $O$-notation. All I have done is just to show even without that set-theoretic explanation such as "$O(f(n))$ is a set of functions", we can still understand big $O$-notation fully and perfectly. If you find that set-theoretic explanation useful, please go ahead anyway.

You can check the section in "asymptotic notation" of CLRS for a more detailed analysis and usage pattern for the family of notations for asymptotic behavior, such as big $\Theta$, $\Omega$, small $o$, small $\omega$, multivariable usage and more. The Wikipedia entry is also a pretty good reference.

Lastly, there is some inherent ambiguity/controversy around big $O$ notation with multiple variables. You might want to think twice when you are using it.

John L.

In The Algorithm Design Manual [1], you can find a paragraph about this issue:

The Big Oh notation [including $O$, $\Omega$ and $\Theta$] provides for a rough notion of equality when comparing functions. It is somewhat jarring to see an expression like $n^2 = O(n^3)$, but its meaning can always be resolved by going back to the definitions in terms of upper and lower bounds. It is perhaps most instructive to read the " = " here as meaning "one of the functions that are". Clearly, $n^2$ is one of the functions that are $O(n^3)$.

Strictly speaking (as noted by David Richerby's comment), $\Theta$ gives you a rough notion of equality, $O$ a rough notion of less-than-or-equal-to, and $\Omega$ a rough notion of greater-than-or-equal-to.

Nonetheless, I agree with Vincenzo's answer: you can simply interpret $O(f(n))$ as a set of functions and the = symbol as a set membership symbol $\in$.


[1] Skiena, S. S. The Algorithm Design Manual (Second Edition). Springer (2008)

Mario Cervera

Usually, statements like $$f = O(g)$$ can be interpreted as $$ \text{there exists } h \in O(g) \text{ such that }f = h\,. $$

This becomes more useful in contexts like David Richerby mentions, where we write $f(n) = n^3 + O(n^2)$ to mean "there exists $g(n) \in O(n^2)$ such that $f(n) = n^3 + g(n)$."

I find this existential quantifier interpretation so useful that I am tempted to write things like

$$ f(n) \leq O(n^3) $$

which some will find an even more egregious style violation, but it is just a space-saving way of writing "there exists $C$ such that $f(n) \leq C n^3$ for all sufficiently large $n$."
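Unfolding that existential reading on a concrete example (the particular $f$ and witness $C$ below are assumptions of mine, and a finite check is not a proof):

```python
# Reading f(n) <= O(n^3) as: there exists C with f(n) <= C * n^3 for large n.
# Assumed example f with candidate witness C = 3, valid from n = 10 on,
# since 2n^3 + 100n <= 3n^3 exactly when n >= 10.
f = lambda n: 2 * n ** 3 + 100 * n
C = 3
print(all(f(n) <= C * n ** 3 for n in range(10, 10_000)))  # True
```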

usul

Many other posters have explained that the Big-O can be thought of as denoting a set of functions, and that the notation $n^2 = O(n^3)$ indicates that $n^2$ (as a function of $n$) is in the set denoted by $O(n^3)$ (again considering $n$ as the parameter). In English text, you may prefer to write "$n^2$ is in $O(n^3)$" to avoid confusion.

Although the notation can be confusing, it may help to think of the $O$ and the $=$ as parts of the same notation, that is, to treat $= O$ as if it were one symbol. It is little different from what we do when we write >= in a programming language: two symbols, adjacent, become one in our eyes.

Another tricky aspect of the notation is that the variable acting as the parameter is not explicitly identified, or bound, the way it would be in a function declaration or a lambda notation. This can be especially confusing when there are two variables involved, as in $O(mn)$, or even more so in an expression like $O(n^c)$, since it may be implied that $c$ is a constant. Then again, some algorithms have a complexity that technically varies according to two variables, while in practice one of them is fixed. Still more, there may be multiple reasonable ways of measuring the complexity of one algorithm. For example, if the input is a number, your algorithm might be $O(n)$ in the value $n$ of the number but $O(2^b)$ in the bit size $b$ of the number. (Although in complexity theory per se, the bit size is usually the right parameter.)

All this is to say that the Big-O is an informal notation, hounded by imprecision, and you often have to use other context to understand what an author is saying.

As always, you're best to avoid confusion in your own writing, and I suggest avoiding the $= O$ and using instead $\in O$ or the English "…is in…"

ezrakilty

Just to underline the point which has been made several times, allow me to quote from N. G. de Bruijn, Asymptotic Methods in Analysis:

The common interpretation of all these formulas can be expressed as follows. Any expression involving the $O$-symbol is to be considered as a class of functions. If the range $0 < x < \infty$ is considered, then $O(1) + O(x^2)$ denotes the class of all functions of the form $f(x) + g(x)$, with $f(x) = O(1)\,\,(0 < x < \infty)$, $g(x) = O(x^2)\,\,(0 < x < \infty)$. And $x^{-1}O(1) = O(1) + O(x^{-2})$ means that the class $x^{-1}O(1)$ is contained in the class $O(1) + O(x^{-2})$. Sometimes, the left-hand-side of the relation is not a class, but a single function [...]. Then the relation means that the function on the left is a member of the class on the right.

It is obvious that the sign $=$ is really the wrong sign for such relations, because it suggests symmetry, and there is no such symmetry. For example, $O(x) = O(x^2)\,\,(x \rightarrow \infty)$ is correct, but $O(x^2) = O(x)\,\,(x \rightarrow \infty)$ is false. Once this warning has been given, there is, however, not much harm in using the sign $=$, and we shall maintain it, for no other reason than that it is customary.

Donald Knuth also pointed out that mathematicians often use the $=$ sign as they use the word "is" in English. "Aristotle is a man, but a man isn’t necessarily Aristotle."

Having said that, Bender and Orszag's notation (from Advanced mathematical methods for scientists and engineers) is a lot less confusing and is worth considering. With respect to some limit, we say:

$$f(x) \sim g(x)\,\,(x \rightarrow x_0)$$

(pronounced "$f$ is asymptotic to $g$") means

$$\lim_{x \rightarrow x_0} \frac{f(x)}{g(x)} = 1$$

and:

$$f(x) \ll g(x)\,\,(x \rightarrow x_0)$$

(pronounced "$f$ is negligible compared to $g$") means

$$\lim_{x \rightarrow x_0} \frac{f(x)}{g(x)} = 0$$

But I suppose the benefit of big-oh notation is that the constant factor is left arbitrary. (And for little-oh notation, the bound holds no matter which constant factor you pick.)
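These limit definitions are easy to illustrate numerically (a sketch only; evaluating at a few points does not establish a limit):

```python
import math

# f ~ g: the ratio f/g tends to 1.
f = lambda n: n ** 2 + n
g = lambda n: n ** 2
for n in (10, 10 ** 3, 10 ** 6):
    print(f(n) / g(n))  # 1.1, 1.001, 1.000001 -- approaching 1

# f << g: the ratio tends to 0, e.g. log n is negligible compared to n.
print(math.log(10 ** 9) / 10 ** 9)  # about 2.07e-08 -- approaching 0
```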

Pseudonym

I went over this on Stack Overflow; while perhaps the most correct answer to the OP's question has already been stated above (the set-based reading, restated below as #1), here is a complete answer:

  1. sets: "$f= O(\cdot)$" means $f \in O(\cdot)$, i.e. membership in a set, e.g. $\{\frac{1}{2} x^2,\; 5 x^2-x+5,\; 2.5x,\ldots\}$, "the set of functions asymptotically bounded by $x^2$". This is the standard mathematical treatment of asymptotic notation that I am aware of. These sets are partially ordered by the subset relation, e.g. $O(x^2) \subset O(x^3) = O(x^3 + x)$ (some sets are incomparable; see the polynomial hierarchy for an interesting example).

    note. "$f= \Theta(\cdot)$" means $f \in \Theta(\cdot)$. However, note that unlike the above, this underlies an equivalence relation (obviously the naive relation $X$ defined by $f \mathrel{X} g$ iff $f \in O(g)$ is not an equivalence relation, since $f \in O(g)$ does not imply $g \in O(f)$; the trivial equivalence relation "both an element of $O(g)$" is perhaps amusing to conceptualize but mathematically uninteresting, whereas with $\Theta$ the equivalence classes partition the space of functions).

  2. wildcard expression: You can reverse-engineer the definition of $f \in \Theta(g)$: after some radius of uncaring near the origin (i.e. there exists an $x_0$ such that for all $x>x_0$...), there is a band of constant multiples of $g$ that bounds $f$ (i.e. $k_1\,g(x) \le f(x) \le k_2\,g(x)$). So we can simply replace any expression $\Theta(g)$ with the expression $k_1 g(x)+err(x)$, that is, the bound itself plus an error term we don't care about (an error term bounded by $0 \le err(x) \le k_2\,g(x)$ for $x>x_0$, and potentially unbounded for $x\le x_0$). For example, if we said $f = 2^{\Theta(x^2)}$, we could equivalently say $f(x)=2^{k_1x^2+err(x)}$ where the error term satisfies $0 \le err(x) \le k_2x^2$. We would never write this down, because it is a bit silly, but I believe it may be legitimate to think of it that way, and it preserves the notion of equality. (I am eliding the proper treatment of negative signs here, which may be important.)

    a. Modify the above as appropriate for $O$ instead of $\Theta$

ninjagecko

A more exact answer would be that when we say a function $f$ is "big $O$ of a function $g$" (e.g., $x^2 + x$ is $O(x^2)$), we are saying that $f(x) \leq C\,g(x)$ for some constants $C$ and $k$ whenever $x > k$. This means that $g$ is an upper bound for the behavior of $f$.

Example

$x^2 + 10x + 4$ is $O(x^2 + x)$, which is itself $O(x^2)$.

dadrake