
I am looking for an interpretation of Stirling's formula using information theory.

Suppose you have $n$ different objects and $n$ different labels. There are $n!$ different ways of assigning the labels to the objects, provided that no two different objects get assigned the same label. The corresponding entropy is thus $\log_2(n!)$.

This entropy is at most the sum of the entropies of the individual labels, by subadditivity of the joint entropy. Fixing an object, its label takes each of the $n$ possible values with equal probability, so each object's label has entropy $\log_2(n)$.

We thus get $\log_2(n!) \leq n\log_2(n)$.

My question is how to obtain the next term in Stirling's formula, $-n\log_2(e)$, at least intuitively and using information theory, when $n$ is large.
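For a sense of scale, a quick numerical illustration (a minimal sketch, not part of the argument): the per-object gap between the bound $n\log_2(n)$ and $\log_2(n!)$ indeed approaches $\log_2(e)\approx 1.4427$ bits.

```python
import math

# Gap per object between the bound n*log2(n) and log2(n!);
# it should approach log2(e) ~ 1.4427 bits as n grows.
for n in (10, 100, 1000, 10_000):
    log2_factorial = math.lgamma(n + 1) / math.log(2)  # log2(n!)
    bound = n * math.log2(n)
    print(n, round((bound - log2_factorial) / n, 4))
```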

Malkoun
  • 5,594

1 Answer


Just for fun: it is enough to exploit the prime number theorem $\pi(n)\sim\frac{n}{\log n}$. Legendre's theorem on $\nu_p(n!)$ ensures

$$ n! = \prod_{p\leq n} p^{\lfloor\frac{n}{p}\rfloor+\lfloor\frac{n}{p^2}\rfloor+\ldots}\leq \prod_{p\leq n}p^{\frac{n}{p-1}} $$ which is a rather crude bound, but already sufficient:

$$ \log(n!) \leq n\sum_{p\leq n}\frac{\log(p)}{p-1}=n\sum_{m\leq n}\mathbb{1}_p(m)\frac{\log m}{m-1}\stackrel{\text{SBP}}{=}\underbrace{n\,\pi(n)\frac{\log n}{n-1}}_{O(n)}+n\sum_{m\leq n-1}\pi(m)\left[\frac{\log m}{m-1}-\frac{\log(m+1)}{m}\right].$$

Here $\mathbb{1}_p$ is the characteristic function of the primes, $\pi$ is the prime-counting function and $\text{SBP}$ stands for summation by parts. In the last sum the general term behaves like $\frac{1}{m}+O\left(\frac{\log m}{m^2}\right)$, hence

$$\log(n!) = O(n)+n\sum_{m\leq n-1}\frac{1}{m} = n\log(n)+O(n).$$

Of course this is extreme cheating: historically speaking, weak forms of the PNT were derived from the asymptotics of $n!$ or $\binom{2n}{n}$, while here we performed just the opposite.

A more elementary approach is to consider

$$ \frac{4^n}{n+1}\leq\binom{2n}{n}=\frac{(2n)!}{n!^2}<4^n, $$

where the inequality on the right is trivial and the inequality on the left follows from $\binom{2n}{n}=\sum_{k=0}^{n}\binom{n}{k}^2$, $\sum_{k=0}^{n}\binom{n}{k}=2^n$ and the Cauchy-Schwarz inequality. By the theory of moments, the previous inequality holds for non-integer values of $n$ too, so by letting $L(n)=\log(n!)$ we have

$$ L(n)-2 L(n/2) < n\log(2), $$ $$ 2L(n/2)- 4L(n/4) < n\log(2), $$ $$ 4L(n/4)- 8 L(n/8) < n\log(2), $$ $$ \ldots $$

and

$$ L(n) - 2^k L\left(\frac{n}{2^k}\right) \leq kn\log(2). $$ By picking $k\approx \log_2(n)=\frac{\log(n)}{\log(2)}$, so that $\frac{n}{2^k}$ stays bounded, this gives $L(n)\leq n\log(n)+O(n)$; the matching lower bound follows in the same way from the inequality on the left, and the claim $L(n)\sim n\log n$ is proved.
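As a numerical sanity check of the steps above (just a sketch using the standard library: a naive primality test for the crude Legendre-type bound, exact integer arithmetic for the central binomial coefficient, and $\log\Gamma$ to extend $L$ to non-integer arguments):

```python
import math

def is_prime(m):
    # naive trial division; fine for the small ranges used here
    return m > 1 and all(m % d for d in range(2, int(m**0.5) + 1))

def L(x):
    # L(x) = log(x!) = log Gamma(x + 1), valid for non-integer x too
    return math.lgamma(x + 1)

for n in (10, 100, 1000):
    # crude Legendre-type bound: log(n!) <= n * sum_{p <= n} log(p)/(p-1)
    crude = n * sum(math.log(p) / (p - 1) for p in range(2, n + 1) if is_prime(p))

    # two-sided bound 4^n/(n+1) <= C(2n, n) < 4^n, checked in exact arithmetic
    central = math.comb(2 * n, n)
    two_sided = (4**n <= (n + 1) * central) and (central < 4**n)

    # telescoped inequality L(n) - 2^k L(n/2^k) <= k*n*log(2) with k ~ log2(n)
    k = round(math.log2(n))
    telescoped = L(n) - 2**k * L(n / 2**k) <= k * n * math.log(2)

    print(n, L(n) <= crude, two_sided, telescoped,
          round(L(n) / (n * math.log(n)), 3))   # this ratio creeps towards 1
```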

Jack D'Aurizio
  • 361,689
  • Thank you. I shall read your answer carefully soon. It would be nice to absorb it. But I still would like a more information-theoretic approach. Your answer is very informative though (no pun intended -- I just noticed the pun)! – Malkoun Sep 29 '19 at 17:06
  • @Malkoun: the main argument here is that asymptotics for $n!$ can be derived from asymptotics for $\binom{2n}{n}$. Since $$ \frac{1}{4^n}\binom{2n}{n}=\frac{2}{\pi}\int_{0}^{\pi/2}(\cos\theta)^{2n}\,d\theta$$ and the RHS is a moment, to prove $\binom{2n}{n}\sim\frac{4^n}{\sqrt{\pi\left(n+\frac{1}{4}\right)}}$ is fairly simple (a quick numerical check of this approximation appears after these comments). A behaviour of the $\frac{4^n}{\sqrt{\pi n}}$ kind is also justified by the central limit theorem. – Jack D'Aurizio Sep 29 '19 at 17:17
  • Here are some elementary but very accurate approximations for $\binom{2n}{n}$: https://math.stackexchange.com/a/2549434/44121 – Jack D'Aurizio Sep 29 '19 at 17:21
  • I read the first proof. That was nice. Thank you. In particular, I have learned about Legendre's theorem. – Malkoun Oct 03 '19 at 18:13
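A quick numerical look at the approximation mentioned in the comments above, $\binom{2n}{n}\approx\frac{4^n}{\sqrt{\pi\left(n+\frac{1}{4}\right)}}$ (a sketch; the ratio is computed in log-space so that large $n$ does not overflow a float):

```python
import math

# Ratio of 4^n / sqrt(pi*(n + 1/4)) to the exact C(2n, n),
# computed via log-Gamma to avoid floating-point overflow.
for n in (1, 10, 100, 10_000):
    log_central = math.lgamma(2 * n + 1) - 2 * math.lgamma(n + 1)
    log_approx = n * math.log(4) - 0.5 * math.log(math.pi * (n + 0.25))
    print(n, round(math.exp(log_approx - log_central), 8))
```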