
(surprisingly, it appears that this question has not been asked before)

Let $\pi(n)$ denote the number of primes $\leq n$. The prime number theorem states that

$$\pi(n) \sim \frac{n}{\log n} \ \text{as} \ n \to +\infty$$

After painstakingly reading through Erdős's elementary proof of this theorem, I think I understand the mechanics of it from a formal perspective. However, I still don't understand intuitively why this theorem is true. I would like some intuitive insight as to why this theorem holds.

I understand that for a result as deep as this one, even the intuition is going to contain some nitty-gritty details. It's probably not the sort of thing that you could explain to a child, for example. Nevertheless, I will ask this question regardless. There has to be some convincing argument for this theorem beyond the technical details of the proofs.

2 Answers

29

Not a full explanation, but it is too long for a comment.

Consider the Sieve of Eratosthenes.

Start with the first $n$ numbers. Remove (one less than) $\frac 12$ of them as multiples of $2$. Of the remainder, remove $\frac 13$ as multiples of $3$. Of the remainder, remove $\frac 15$ as multiples of $5$, etc. You should be left with about $$n\prod_{p\le n, \text{ prime}}\left(1 - \frac 1p\right)$$

values as primes below $n$. Now, when multiplied out

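$$\prod_{p\le n, \text{ prime}}\left(1 - \frac 1p\right) = 1 + \sum_{k \in S_n} \frac{(-1)^{\omega(k)}}{k}$$ where $S_n$ is the set of all square-free integers $> 1$ whose prime factors are $\le n$, and $\omega(k)$ is the number of prime factors of $k$. For example, $\left(1 - \frac 12\right)\left(1 - \frac 13\right) = 1 - \frac 12 - \frac 13 + \frac 16$.

It remains to estimate that $1 + \sum_{k \in S_n} \frac{(-1)^{\omega(k)}}{k}\approx \frac 1{\log n}$.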
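The sieve heuristic above can be checked numerically. The following is a minimal sketch, not part of the original argument; the helper names `primes_up_to` and `sieve_density` are made up for illustration. It compares the actual prime count with $n\prod_{p\le n}(1 - \frac 1p)$ and with $\frac n{\log n}$.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            # Cross out i*i, i*i + i, ... (smaller multiples are already gone).
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if is_prime[i]]

def sieve_density(n):
    """Product of (1 - 1/p) over all primes p <= n (illustrative helper)."""
    prod = 1.0
    for p in primes_up_to(n):
        prod *= 1.0 - 1.0 / p
    return prod

# Columns: n, actual pi(n), heuristic n * prod(1 - 1/p), and n / log(n).
for n in (10**3, 10**4, 10**5):
    pi_n = len(primes_up_to(n))
    print(n, pi_n, round(n * sieve_density(n)), round(n / math.log(n)))
```

In such a run the product column should track the $\frac 1{\log n}$ shape while undercounting by a roughly constant factor, which matches the $e^{-\gamma} \approx 0.56$ constant from Mertens' 3rd theorem quoted in the comments. The heuristic captures the order of growth, not the exact constant.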

Edit:

Since it was already chosen as the solution and Winther has kindly provided Mertens' 3rd theorem, which says just what was needed, I could just let this go. But Mertens' theorem strikes me as hardly more intuitively obvious than the prime number theorem itself, so I've been thinking about heuristic concepts to explain it.

Now for $|x| < 1$, $\frac1{1-x} = 1 + x + x^2 + \dots$ Therefore $$\frac 1{\prod\limits_{p\le n}\left(1 - \frac 1p\right)} = \prod\limits_{p\le n}\left(1 + \frac 1p + \frac 1{p^2} + \dots\right)$$

Multiplying the right side out, we get $\sum_{k \in R_n} \frac 1k$, where $R_n$ is the set of all integers whose prime factors are all $\le n$. Of these, it should be expected (I'm being heuristic here, so I can get away with that weasel-wording) that the sum for $k > n$ will be significantly smaller than for $k \le n$. Thus it seems reasonable that $$\sum_{k \in R_n} \frac 1k \sim \sum_{k \in R_n, k\le n} \frac 1k = \sum_{k=1}^n \frac 1k \sim \log n$$
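As a rough numerical check of this step, the sketch below compares the full sum $\sum_{k \in R_n} \frac 1k$ (computed through the Euler product) with the partial harmonic sum $\sum_{k=1}^n \frac 1k$. The function names are illustrative assumptions, not anything from the original answer.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if is_prime[i]]

def smooth_reciprocal_sum(n):
    """Sum of 1/k over all k whose prime factors are <= n.

    Uses the identity sum_{k in R_n} 1/k = prod_{p <= n} 1 / (1 - 1/p).
    """
    total = 1.0
    for p in primes_up_to(n):
        total *= 1.0 / (1.0 - 1.0 / p)
    return total

def harmonic(n):
    """Partial harmonic sum 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

# Columns: n, full sum over R_n, harmonic sum up to n, and their ratio.
for n in (10**2, 10**3, 10**4, 10**5):
    full = smooth_reciprocal_sum(n)
    head = harmonic(n)
    print(n, round(full, 3), round(head, 3), round(full / head, 3))
```

By Mertens' 3rd theorem the ratio in the last column should drift toward $e^{\gamma} \approx 1.78$ rather than $1$, so the terms with $k > n$ are not negligible. Both sums grow like a constant times $\log n$, which is all the heuristic needs.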

Paul Sinclair
  • :D you capture everything I think of in a nut-shell. +1 – Simply Beautiful Art Feb 15 '17 at 12:01
  • By Mertens' 3rd theorem the asymptotic behavior of $\prod_{p\leq n} \left(1 - \frac{1}{p}\right)$ is $\sim \frac{e^{-\gamma}}{\log(n)} \approx \frac{0.56}{\log(n)}$. – Winther Feb 15 '17 at 12:27
  • @Winther: Does it follow from your comment that $\prod_{p \leq n}\bigg({\dfrac{p}{p-1}}\bigg)$ is $\sim \dfrac{\log(n)}{e^{-{\gamma}}} \approx \dfrac{\log(n)}{0.56}$? – Jose Arnaldo Bebita Dris Feb 16 '22 at 14:06
  • @ARNIEBEBITA-DRIS - That is a finite product, and you've inverted each factor, so yes, that inverts the entire product, which in turn inverts the asymptotic behavior. – Paul Sinclair Feb 16 '22 at 14:28
  • @PaulSinclair But according to Mertens' 3rd theorem, the "smaller sum" isn't that small, i.e., it does not go to zero asymptotically when compared to the product term. Apparently, it is not easy to extract that natural logarithm. – Hulkster Sep 08 '23 at 01:45
  • @Husker - Note that I was not trying to prove anything here, but only aiming to provide some intuition that would suggest the results were reasonable to expect. The actual proofs are of course far beyond the trivial reasoning I gave. – Paul Sinclair Sep 08 '23 at 03:16
  • In the first part of the answer, the formula $$\Pi (1 - 1/p) = 1 - \sum_{S_k} 1/k$$ seems strange to me: shouldn't there be alternating sign on the right? – Peter Franek Sep 26 '24 at 14:30
  • @PeterFranek - not "alternating", but yes, $k$ with an even number of prime factors should have a + sign instead of -. That was 7 years ago, so I no longer recall enough to guess why this was never caught. I'll have to think a bit about it now to determine what exactly I should do to correct it. – Paul Sinclair Sep 26 '24 at 16:15
  • @PaulSinclair Can I ask another question, please? I still don't get this part: if we remove all multiples of 2 and 3, why is it true that multiples of 5 are approximately one fifth of what remains? Looking at (after removing 2 and 3) "5,7,11,13,17,19,23,25,29,31,35,37,41,43,47,49,....." OK, I can probably see that x mod 6 is always 5,1,5,1,5,1,... and from 6x+1, every fifth is divisible by 5, and from 6x+5, every fifth is divisible also. Is there an easy generalization? – Peter Franek Sep 26 '24 at 16:58
  • @PeterFranek - you are overthinking this. Let me remind you of the same thing I reminded Husker last year: I am not proving anything in this post, just providing some intuition. By density, $\frac15$ of all integers are multiples of $5$. By density, $\frac 15$ of the integers removed as multiples of $2$ and $3$ were also multiples of $5$. Thus removing the multiples of $2$ and $3$ does not change the density of multiples of $5$ in the remaining integers. – Paul Sinclair Sep 27 '24 at 01:50
  • I think the question was about idea(s) of Erodes proof, not the formula for primes. The formula was (roughly speaking) anticipated by Gauss 150 years before Erodes proof. And around 1850 Chebyshev published correct estimates as far as order is concerned. – Salcio May 17 '25 at 01:39
  • To remember whether it's $n/\ln n$ or $\ln n/n$ remember that it needs to go off to infinity. – suckling pig May 17 '25 at 07:35
  • @Salcio - "Paul Erdős", not "Erodes". But the question asks for intuition as to why the prime number theorem (first proved by Hadamard and de la Vallée Poussin) is true, not Erdős's specific proof. This is supported by their acceptance of this answer (given before I added the edit about Mertens' theorem). But the reference to Chebyshev providing correct estimates vs my barely coherent ramblings is much appreciated. – Paul Sinclair May 17 '25 at 13:47
  • @PaulSinclair - I am perfectly fine with your comments, Paul. I guess I was influenced by my own experience when reading Erdős's "elementary" proof of the prime number theorem (many years ago). That is, I could not see any idea behind all those technical manipulations. Lastly, regarding the spelling of Erdős's name - my computer highlights "your" spelling and suggests mine ... . – Salcio May 17 '25 at 16:04
  • @Salcio - I strongly suggest not allowing your computer to "correct" names.... – Paul Sinclair May 18 '25 at 05:30
5

It's often difficult to provide strong intuition for these kinds of things, except maybe by discussing numerical evidence. It might be worthwhile to discuss how Gauss originally conjectured it. I am posting this as an answer mainly because it's too long for a comment. What follows is taken from lectures by Prof. Andrew Granville for the course Distribution of Prime Numbers.

Here's an excerpt from a letter to Encke from Gauss

In 1792 or 1793 ... I turned my attention to the decreasing frequency of primes ... counting the primes in intervals of length $1000$. I soon recognized that behind all of the fluctuations, this frequency is on average inversely proportional to the [natural] logarithm...

This observation may be best phrased as

About $1$ in $\log x$ of the integers near $x$ are prime.

This suggests that a good approximation to the number of primes up to $x$ is $$\sum_{n=2}^x \frac{1}{\log n} \sim \int_2^x \frac{\text{d}t}{\log t}$$ which is denoted by $\text{Li}(x)$ and is asymptotic to $\frac x{\log x}$.

Now, we look at a comparison of Gauss’ prediction with the actual count of primes up to various values of $x\le 10^{29}$. One can't help but notice the impressive accuracy of the prediction.

By zooming out from the table (or otherwise), one might notice that the third column is approximately half the width of the second column, hence suggesting that perhaps $$\left\vert \pi(x)-\text{Li}(x)\right\vert \ll \sqrt{x}$$ for all $x\ge 2$. While this might be over-optimistic (compare with the Riemann Hypothesis), the data certainly strongly suggests that $$\frac{\pi(x)}{\text{Li}(x)}\to 1$$ as $x\to \infty$.

Finally, to emphasize the importance of rigour, one might also notice that the third column is always positive. All the data on primes ever calculated suggests that $\text{Li}(x)$ is a little larger than $\pi(x)$ for each $x\ge 8$. But this is not true: Littlewood proved that the difference changes sign infinitely often, and the first counterexample is believed to occur near $10^{316}$.
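A small-scale version of this comparison can be sketched in a few lines. This is only an illustration for modest $x$, not the data from the lectures; the names `prime_count` and `Li` are assumptions, and the integral is approximated with Simpson's rule after substituting $t = e^u$.

```python
import math

def prime_count(x):
    """Count primes <= x with the Sieve of Eratosthenes."""
    if x < 2:
        return 0
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(x) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(is_prime)

def Li(x, steps=20000):
    """Approximate the integral of dt / log(t) from 2 to x.

    With t = e^u the integrand becomes e^u / u on [log 2, log x];
    Simpson's rule is applied there (steps must be even).
    """
    a, b = math.log(2), math.log(x)
    h = (b - a) / steps
    f = lambda u: math.exp(u) / u
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Columns: x, pi(x), Li(x) - pi(x), sqrt(x).
for x in (10**3, 10**4, 10**5):
    pi_x = prime_count(x)
    print(x, pi_x, round(Li(x) - pi_x, 1), round(math.sqrt(x), 1))
```

In this range the difference $\text{Li}(x) - \pi(x)$ stays positive and well below $\sqrt x$, in line with the observations above, though such a small sample proves nothing about large $x$.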

Gary
Sayan Dutta