6

For $n = 19$, the average of the primes up to $n$, namely $\frac{2 + 3 + 5 + 7 + 11 + 13 + 17 + 19}{8} = \frac{77}{8}$, is greater than $\frac{19}{2}$, but for $n > 19$ it seems that the average is always smaller than $\frac{n}{2}$.

I confirmed that this holds for all $19 < n < 10^7$, but I have not been able to prove it for all $n$. How can I prove this?
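The kind of check described above can be sketched in Python as follows. This is only an illustration, not the code actually used for the $10^7$ verification: the function name `exceptions_up_to` and the bound `10**5` are assumptions chosen to keep the run short. The comparison is done in integers, using the fact that the average is below $n/2$ exactly when $2\sum_{p\le n} p < n\,\pi(n)$.

```python
from math import isqrt


def exceptions_up_to(limit):
    """Return every n in [2, limit] whose average of primes <= n is >= n/2."""
    # Sieve of Eratosthenes: is_prime[k] == 1 exactly when k is prime.
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))

    exceptions = []
    prime_sum = 0    # sum of the primes <= n
    prime_count = 0  # pi(n), the number of primes <= n
    for n in range(2, limit + 1):
        if is_prime[n]:
            prime_sum += n
            prime_count += 1
        # average < n/2  <=>  2 * prime_sum < n * prime_count (integers only)
        if 2 * prime_sum >= n * prime_count:
            exceptions.append(n)
    return exceptions


# Hypothetical bound, much smaller than the 10^7 mentioned above.
print(exceptions_up_to(10**5))
```

The largest value in the returned list should be $19$, matching the observation that the average drops below $n/2$ from $n = 20$ onward.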

TShiong

2 Answers

5

The inequality does hold for all sufficiently large $n$. Sketch of proof:

  • Using summation by parts (as in either one of these two existing answers), the average in question is $$ \frac1{\pi(n)}\sum_{p\le n} p = \frac1{\pi(n)} \biggl( n\pi(n) - \sum_{k<n} \pi(k) \biggr) = n - \frac1{\pi(n)} \sum_{k<n} \pi(k). $$
  • Instead of just using $\pi(x) = \dfrac x{\log x} + O\biggl( \dfrac x{\log^2 x} \biggr)$, we use the more precise $$ \pi(x) = \dfrac x{\log x} + \dfrac x{\log^2 x} +O\biggl( \dfrac x{\log^3 x} \biggr) $$ that follows from the asymptotic expansion of the logarithmic integral.
  • If we write $\displaystyle \pi(n) = \dfrac n{\log n} \biggl( 1 + \dfrac 1{\log n} + O\biggl( \dfrac1{\log^2 n} \biggr) \biggr) $ and use $(1+\varepsilon)^{-1} = 1-\varepsilon+O(\varepsilon^2)$, we see that $$ \frac1{\pi(n)} = \dfrac{\log n}n \biggl( 1 - \dfrac 1{\log n} + O\biggl( \dfrac1{\log^2 n} \biggr) \biggr). $$
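  • Since the function $\dfrac x{\log^\alpha x}$ is increasing for $x>e^\alpha$, comparing the sum with the integral (the two differ by at most the largest value of the summand, plus $O(1)$ from the initial range) shows that $\displaystyle \sum_{k<n} \frac k{\log^\alpha k} = \int_2^n \frac t{\log^\alpha t}\,dt + O(n)$.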
  • Integration by parts yields $\displaystyle \int_2^n \frac t{\log^\alpha t}\,dt = \frac{n^2}{2\log^\alpha n} + O(1) + \frac{\alpha}2 \int_2^n \frac t{\log^{\alpha+1} t}\,dt$. It follows that $\displaystyle \int_2^n \frac t{\log^2 t}\,dt = \frac{n^2}{2\log^2 n} + O\biggl( \frac{n^2}{\log^3 n} \biggr) $ and $\displaystyle\int_2^n \frac t{\log t}\,dt = \frac{n^2}{2\log n} + \frac{n^2}{4\log^2 n} + O\biggl( \frac{n^2}{\log^3 n} \biggr) $.
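The summation-by-parts identity in the first bullet is exact, so it can be sanity-checked in integer arithmetic. A minimal sketch, assuming a plain sieve is good enough for the range tested; the function name `identity_holds_up_to` is made up for this illustration:

```python
from itertools import accumulate
from math import isqrt


def identity_holds_up_to(limit):
    """Check sum_{p<=n} p == n*pi(n) - sum_{k<n} pi(k) for every 1 <= n <= limit."""
    # Sieve of Eratosthenes: is_prime[k] == 1 exactly when k is prime.
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))

    pi = list(accumulate(is_prime))  # pi[k] = number of primes <= k
    prime_sum = 0        # left-hand side: sum of the primes <= n
    pi_partial_sum = 0   # sum of pi(k) over k < n
    for n in range(1, limit + 1):
        if is_prime[n]:
            prime_sum += n
        if prime_sum != n * pi[n] - pi_partial_sum:
            return False
        pi_partial_sum += pi[n]
    return True


print(identity_holds_up_to(10**5))
```

Because both sides are integers, no floating-point tolerance is needed here; only the later asymptotic steps involve error terms.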

Putting these all together:
\begin{align*}
\frac1{\pi(n)} \sum_{p\le n} p &= n - \frac1{\pi(n)} \sum_{k<n} \biggl( \dfrac k{\log k} + \dfrac k{\log^2 k} + O\biggl( \dfrac k{\log^3 k} \biggr) \biggr) \\
&= n - \frac1{\pi(n)} \biggl( \int_2^n \frac t{\log t}\,dt + \int_2^n \frac t{\log^2 t}\,dt + O\biggl( \int_2^n \frac t{\log^3 t}\,dt + n\biggr) \biggr) \\
&= n - \dfrac{\log n}n \biggl( 1 - \dfrac 1{\log n} + O\biggl( \dfrac1{\log^2 n} \biggr) \biggr) \\
&\qquad{}\times\biggl( \biggl( \frac{n^2}{2\log n} + \frac{n^2}{4\log^2 n} \biggr) + \frac{n^2}{2\log^2 n} + O\biggl( \frac{n^2}{\log^3 n} \biggr) \biggr) \\
&= n - \biggl( \frac n2 + \frac n{4\log n} + O\biggl( \frac n{\log^2n} \biggr) \biggr) = \frac n2 - \frac n{4\log n} + O\biggl( \frac n{\log^2n} \biggr).
\end{align*}
Since the negative term $-\dfrac n{4\log n}$ eventually dominates the error term $O\biggl( \dfrac n{\log^2 n} \biggr)$, the average is smaller than $\dfrac n2$ for all sufficiently large $n$, which finishes the proof.
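The final asymptotic formula can be compared with actual values numerically. The sketch below is only an illustration under assumed choices (the function name `average_vs_main_terms` and the sample values of $n$ are not from the answer): it prints the true average, the main terms $\frac n2 - \frac n{4\log n}$, and their difference divided by $\frac n{\log^2 n}$, which the $O$-term predicts should stay bounded.

```python
from math import isqrt, log


def average_vs_main_terms(n):
    """Return (average of primes <= n, n/2 - n/(4 log n))."""
    # Sieve of Eratosthenes: is_prime[k] == 1 exactly when k is prime.
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))

    primes = [k for k in range(2, n + 1) if is_prime[k]]
    average = sum(primes) / len(primes)
    main_terms = n / 2 - n / (4 * log(n))
    return average, main_terms


# Sample values of n, chosen arbitrarily for illustration.
for n in (10**3, 10**4, 10**5, 10**6):
    average, main_terms = average_vs_main_terms(n)
    # The O-term says this ratio should remain bounded as n grows.
    scaled_error = (average - main_terms) / (n / log(n) ** 2)
    print(n, round(average, 1), round(main_terms, 1), round(scaled_error, 3))
```

A bounded third column is consistent with the expansion, though a finite table like this is of course not a proof.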

Greg Martin
  • 1
  • hey uh... when I tried this problem previously, someone told me you can't use asymptotic estimates as equations... why not? I'm not really good at math (high school student) but it seems like you're doing just that...? can you tell me why it's okay in this particular instance? –  Feb 27 '22 at 03:33
  • 3
    There are algebraic rules for manipulating $O$-terms, similar to those we learned for manipulating true equalities. With experience one learns which algebraic manipulations are valid and which are pitfalls. – Greg Martin Feb 27 '22 at 07:04
  • can you give me a link on where i can learn this? it seems really cool!! c: –  Feb 28 '22 at 16:01
  • 1
    I learned these techniques in general from Montgomery and Vaughan's book Multiplicative Number Theory I. – Greg Martin Feb 28 '22 at 18:16
2

Mandl's inequality (see for example this paper of Axler) states that $$ \frac{1}{m}\sum\limits_{k = 1}^m {p_k } < \frac{{p_m }}{2} $$ for $m\geq 9$, where $p_k$ denotes the $k$-th prime. If $n\geq 23$, then $\pi(n)\geq 9$. Thus, $$ \frac{1}{{\pi (n)}}\sum\limits_{p \le n} p = \frac{1}{{\pi (n)}}\sum\limits_{k = 1}^{\pi (n)} {p_k } < \frac{{p_{\pi (n)} }}{2} \le \frac{n}{2}. $$ For $n=20,21$ and $22$, the primes up to $n$ are the same as for $n = 19$, so the average is still $\frac{77}{8} = 9.625 < 10 \le \frac{n}{2}$.
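As a small numerical illustration of where the threshold $m \ge 9$ comes from, the following sketch lists the indices $m$ at which Mandl's inequality fails among the primes up to a bound. The function name `mandl_failures` and the bound `10**4` are assumptions for this example; the check is a finite experiment, not a substitute for the proof in the cited paper.

```python
from math import isqrt


def mandl_failures(limit):
    """Return the indices m, with p_m <= limit, where (p_1+...+p_m)/m >= p_m/2."""
    # Sieve of Eratosthenes: is_prime[k] == 1 exactly when k is prime.
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))

    failures = []
    prime_sum = 0
    m = 0
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        m += 1
        prime_sum += p
        # Mandl: prime_sum / m < p / 2  <=>  2 * prime_sum < m * p
        if 2 * prime_sum >= m * p:
            failures.append(m)
    return failures


print(mandl_failures(10**4))
```

The last failing index should be $m = 8$, which corresponds to $p_8 = 19$ and the average $\frac{77}{8}$ from the question.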

Gary