
Let $n = \prod_{i = 1}^k p_i^{a_i}$ be the prime factorization of $n$, where the $p_i$ are the distinct prime factors of $n$ and $a_i$ is the exponent of the highest power of $p_i$ that divides $n$. The sum of the natural numbers up to a positive integer $x$ is $\displaystyle \sum_{n \le x} \prod_{i = 1}^k p_i^{a_i} = \frac{x(x+1)}{2}$. One can replace the inner product with a sum and ask for the asymptotics of the non-trivial sum $\displaystyle \sum_{n \le x} \sum_{i = 1}^k p_i^{a_i}$. My experimental data for $x \le 5 \times 10^9$ indicates

$$ \sum_{n \le x} \sum_{i = 1}^k p_i^{a_i} \sim \sum_{n \le x} \sum_{i = 1}^k p_i \sim \frac{cx^2}{\log x} \tag 1 $$

for some constant $c \approx 0.86$. A heuristic justification of the asymptotic equivalence of the first two sums in $(1)$ is that, for most $n$, the largest value of $p_i^{a_i}$ occurs when $p_i$ is the largest prime factor of $n$, as explained in the answer to this question.
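For readers who want to reproduce the experiment on a small scale, here is a minimal sketch of one way to compute both sums in $(1)$ with a sieve. It is not necessarily how the data above was produced; the function name `prime_power_sums` is made up for illustration, and the approach only scales to modest $x$.

```python
from math import log

def prime_power_sums(x):
    """Return (A, B) where
    A = sum over n <= x of the sum of exact prime powers p_i^{a_i} of n,
    B = sum over n <= x of the sum of distinct primes p_i of n.
    (Hypothetical helper, plain sieve, intended for small x only.)"""
    is_composite = bytearray(x + 1)
    total_powers = 0
    total_primes = 0
    for p in range(2, x + 1):
        if is_composite[p]:
            continue
        # p is prime: visit every multiple n of p
        for n in range(p, x + 1, p):
            if n > p:
                is_composite[n] = 1
            q = p
            # grow q to the exact power of p dividing n
            while n % (q * p) == 0:
                q *= p
            total_powers += q
            total_primes += p
    return total_powers, total_primes

if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        a, b = prime_power_sums(x)
        scale = x * x / log(x)
        # ratios to x^2 / log x, to compare with the conjectured constant c
        print(x, a / scale, b / scale)
```

The two ratios printed for each $x$ are the empirical stand-ins for $c$ in $(1)$; the hand-checkable case is $x = 10$, where the sums are $50$ and $36$.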

Question: What is the asymptotic sum of the prime factors of the first $n$ numbers?

1 Answer


The leading term of the asymptotics is $\frac{\pi^2}{12}\cdot \frac{x^2}{\log x}$. The discrepancy with your empirical estimate $c \approx 0.86$ comes from the lower order terms, mainly the $O\Bigl(\frac{x^2}{\log^2 x}\Bigr)$ term. I shall only prove the leading term, but with more work further terms can be identified.

Writing $p^k \mathrel{\Vert} n$ for $p^k \mid n$, $p^{k+1}\nmid n$, changing the order of summation yields \begin{align} \sum_{n \leqslant x} \sum_{p^k \mathrel{\Vert} n} p^k &= \sum_{p^k \leqslant x} p^k\cdot \varphi(x/p^k,p) \\ &= \sum_{p^k \leqslant x} p^k\biggl(\biggl\lfloor \frac{x}{p^k}\biggr\rfloor - \biggl\lfloor \frac{x}{p^{k+1}}\biggr\rfloor\biggr)\,, \end{align} where $\varphi(y,m)$ is the number of positive integers not exceeding $y$ that are coprime to $m$.
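The exchange of summation can be sanity-checked numerically for small $x$. The sketch below compares the left-hand side (factoring every $n \leqslant x$ by trial division) with the floor-function form on the right; the helper names `direct_sum`, `primes_up_to` and `floor_sum` are illustrative and do not come from the answer.

```python
def direct_sum(x):
    """Left-hand side: sum over n <= x of the exact prime powers p^k || n."""
    total = 0
    for n in range(2, x + 1):
        m, p = n, 2
        while p * p <= m:
            if m % p == 0:
                q = 1
                while m % p == 0:
                    m //= p
                    q *= p
                total += q          # q = p^k with p^k || n
            p += 1
        if m > 1:
            total += m              # leftover prime factor with exponent 1
    return total

def primes_up_to(x):
    """Simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"[: x + 1]
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [p for p in range(2, x + 1) if sieve[p]]

def floor_sum(x):
    """Right-hand side: sum over p^k <= x of p^k (floor(x/p^k) - floor(x/p^(k+1)))."""
    total = 0
    for p in primes_up_to(x):
        q = p
        while q <= x:
            total += q * (x // q - x // (q * p))
            q *= p
    return total

if __name__ == "__main__":
    for x in (10, 100, 1000):
        print(x, direct_sum(x), floor_sum(x))
```

For $x = 10$ both sides equal $50$: the prime powers of $2$ contribute $6 + 4 + 8$, those of $3$ contribute $6 + 9$, and $5$ and $7$ contribute $10$ and $7$.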

For $2 \leqslant k \leqslant \frac{\log x}{\log 2}$, the contribution to the total is easily seen to be $O(x^{3/2})$ (each term in the sum is $\leqslant x$, and there are only $\sum_{k \geqslant 2} \pi(x^{1/k}) = o(\sqrt{x})$ terms), so we need only consider $k = 1$.

Now \begin{equation} \sum_{p \leqslant x} p\biggl\lfloor \frac{x}{p^2}\biggr\rfloor \leqslant x\sum_{p \leqslant x} \frac{1}{p} = O(x\log\log x)\,, \end{equation} hence the only relevant bit is \begin{align} \sum_{p \leqslant x} p\biggl\lfloor \frac{x}{p}\biggr\rfloor &= \sum_{p\cdot m \leqslant x} p\cdot 1 \\ &= \sum_{m \leqslant \sqrt{x}} S\biggl(\frac{x}{m}\biggr) + \sum_{p \leqslant \sqrt{x}} p\biggl\lfloor \frac{x}{p}\biggr\rfloor - \lfloor \sqrt{x}\rfloor S(\sqrt{x}) \\ &= \sum_{m \leqslant \sqrt{x}} S\biggl(\frac{x}{m}\biggr) + x\pi(\sqrt{x}) + O\bigl(S(\sqrt{x})\bigr) - \lfloor \sqrt{x}\rfloor S(\sqrt{x}) \\ &= \sum_{m \leqslant \sqrt{x}} S\biggl(\frac{x}{m}\biggr) +O(x^{3/2}) \end{align} where $S(y) = \sum_{p \leqslant y} p$, using the trivial estimates $\pi(y) \leqslant y$ and $S(y) \leqslant y\cdot\pi(y) = O(y^2)$.
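The second line of the last display is the usual hyperbola-type splitting of the pairs $(p, m)$ with $pm \leqslant x$: count those with $m \leqslant \sqrt{x}$, add those with $p \leqslant \sqrt{x}$, and subtract those with both. As a quick check, the sketch below verifies that identity exactly for integer $x$; the names `prime_list`, `S` and `hyperbola_sides` are assumptions made for this example.

```python
from math import isqrt

def prime_list(x):
    """Primes up to x by trial division against smaller primes (small x only)."""
    primes = []
    for n in range(2, x + 1):
        if all(n % p for p in primes if p * p <= n):
            primes.append(n)
    return primes

def S(y, primes):
    """S(y) = sum of the primes p <= y."""
    return sum(p for p in primes if p <= y)

def hyperbola_sides(x):
    """Return (lhs, rhs) of
    sum_{p<=x} p*floor(x/p)
      = sum_{m<=sqrt x} S(x/m) + sum_{p<=sqrt x} p*floor(x/p) - floor(sqrt x)*S(sqrt x)."""
    primes = prime_list(x)
    r = isqrt(x)                                   # floor(sqrt(x))
    lhs = sum(p * (x // p) for p in primes)
    by_m = sum(S(x // m, primes) for m in range(1, r + 1))
    by_p = sum(p * (x // p) for p in primes if p <= r)
    overlap = r * S(r, primes)                     # pairs with p <= r and m <= r
    return lhs, by_m + by_p - overlap

if __name__ == "__main__":
    for x in (10, 100, 2000):
        print(x, hyperbola_sides(x))
```

For $x = 10$ the three pieces are $32$, $19$ and $15$, and $32 + 19 - 15 = 36$ agrees with $\sum_{p \leqslant 10} p\lfloor 10/p\rfloor$.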

For the remaining sum we need \begin{equation} \DeclareMathOperator{\Li}{Li} S(y) = \Li(y^2) + O\biggl(\frac{y^2}{\log^A y}\biggr) \tag{$\ast$} \end{equation} for every $A > 0$ (if we want an arbitrary number of terms; for $k$ terms, any $A > k$ suffices). This follows from $\pi(x) = \Li(x) + O(x/\log^A x)$ by partial summation: \begin{align} \sum_{p \leqslant x} p &= x\pi(x) - \int_2^x \pi(u)\,du \\ &= x\Li(x) - \int_2^x \Li(u)\,du + O\Biggl(\frac{x^2}{\log^A x} + \int_2^x \frac{u}{\log^A u}\,du\Biggr) \\ &= \int_2^x \frac{u}{\log u}\,du + O\biggl(\frac{x^2}{\log^A x}\biggr) \\ &= \int_2^x \frac{2u}{\log (u^2)}\,du + O\biggl(\frac{x^2}{\log^A x}\biggr) \\ &= \Li(x^2) - \Li(4) + O\biggl(\frac{x^2}{\log^A x}\biggr)\,. \end{align} Since $\frac{1}{2}\log x \leqslant \log \frac{x}{m} \leqslant \log x$ for $1 \leqslant m \leqslant \sqrt{x}$, \begin{equation} \sum_{m \leqslant \sqrt{x}} \frac{x^2}{m^2\log^A \frac{x}{m}} \leqslant \frac{2^Ax^2}{\log^A x}\sum_{m \leqslant \sqrt{x}} \frac{1}{m^2} = O\biggl(\frac{x^2}{\log^A x}\biggr)\,. \end{equation} Then using the asymptotic expansion \begin{equation} \Li(y) = \sum_{k = 1}^{\ell} \frac{y\cdot (k-1)!}{\log^{k} y} + O\biggl(\frac{y}{\log^{\ell+1} y}\biggr) \end{equation} with $y = \frac{x^2}{m^2}$ and arbitrary $\ell$ yields \begin{align} \sum_{m \leqslant \sqrt{x}} S\biggl(\frac{x}{m}\biggr) &= \sum_{m \leqslant \sqrt{x}} \Biggl(\sum_{k = 1}^{\ell} \frac{x^2(k-1)!}{2^k m^2 \log^k \frac{x}{m}} + O\biggl(\frac{x^2}{m^2\log^{\ell + 1} \frac{x}{m}}\biggr)\Biggr) + O\biggl(\frac{x^2}{\log^A x}\biggr) \\ &= \sum_{k = 1}^{\ell} \frac{x^2(k-1)!}{2^k} \sum_{m\leqslant \sqrt{x}} \frac{1}{m^2\log^k \frac{x}{m}} + O\biggl(\frac{x^2}{\log^{\min \{A,\ell+1\}} x}\biggr)\,. 
\end{align} Now \begin{equation} \frac{1}{\log^k \frac{x}{m}} = \frac{1}{\log^k x\bigl(1 - \frac{\log m}{\log x}\bigr)^k} = \sum_{\nu = 0}^{r} (-1)^{\nu}\binom{-k}{\nu} \frac{\log^{\nu} m}{\log^{k + \nu} x} + O\biggl(\frac{\log^{r+1}m}{\log^{k+r+1} x}\biggr)\,, \end{equation} $(-1)^{\nu}\binom{-k}{\nu} = \binom{k-1+\nu}{\nu}$ and \begin{equation} \sum_{m \leqslant z} \frac{\log^{\nu} m}{m^2} = (-1)^{\nu}\zeta^{(\nu)}(2) + O\biggl(\frac{\log^{\nu} z}{z}\biggr)\,, \end{equation} whence by choosing $A = \ell + 1$ we obtain \begin{align} \sum_{m \leqslant \sqrt{x}} S\biggl(\frac{x}{m}\biggr) &= \sum_{k = 1}^{\ell} \frac{x^2(k-1)!}{2^k} \Biggl(\sum_{\nu = 0}^{\ell-k} \frac{(k-1+\nu)!}{\nu!(k-1)!\log^{k + \nu} x} \sum_{m \leqslant \sqrt{x}} \frac{\log^{\nu} m}{m^2} + O\bigl(\log^{-\ell-1} x\bigr)\Biggr) \\ &\qquad\qquad + O\biggl(\frac{x^2}{\log^{\ell+1} x}\biggr) \\ &= \sum_{k = 1}^{\ell}\sum_{\nu =0}^{\ell-k} \frac{x^2(k-1+\nu)!}{2^k\nu!\log^{k+\nu} x}\biggl((-1)^{\nu}\zeta^{(\nu)}(2) + O\biggl(\frac{\log^{\nu} \sqrt{x}}{\sqrt{x}}\biggr)\biggr) + O\biggl(\frac{x^2}{\log^{\ell + 1} x}\biggr) \\ &= \sum_{\mu = 1}^{\ell} \frac{x^2}{\log^{\mu} x} \sum_{\nu = 0}^{\mu-1} \frac{(-1)^{\nu}\zeta^{(\nu)}(2)(\mu-1)!}{2^{\mu-\nu}\nu!} + O\biggl(\frac{x^2}{\log^{\ell + 1} x}\biggr)\,. \end{align} Expanding the first few inner sums leads to the asymptotics \begin{align} \frac{\zeta(2)}{2}\cdot \frac{x^2}{\log x} &+ \biggl(\frac{\zeta(2)}{4} - \frac{\zeta'(2)}{2}\biggr) \frac{x^2}{\log^2 x} + \biggl(\frac{\zeta(2)}{4} - \frac{\zeta'(2)}{2} + \frac{\zeta''(2)}{2}\biggr)\frac{x^2}{\log^3 x} \\ &+ \biggl(\frac{3\zeta(2)}{8} - \frac{3\zeta'(2)}{4} + \frac{3\zeta''(2)}{4} - \frac{\zeta'''(2)}{2}\biggr) \frac{x^2}{\log^4 x} \\ &+ \biggl(\frac{3\zeta(2)}{4} - \frac{3\zeta'(2)}{2} + \frac{3\zeta''(2)}{2} - \zeta'''(2) + \frac{\zeta^{(4)}(2)}{2}\biggr)\frac{x^2}{\log^5 x} + O\biggl(\frac{x^2}{\log^6 x}\biggr)\,. \end{align}
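To see how the first two coefficients compare with the empirical $c \approx 0.86$, they can be evaluated numerically. This is a minimal sketch using only the standard library: $\zeta(2) = \pi^2/6$ is exact, while $\zeta'(2) = -\sum_{m \geqslant 2} \log m / m^2$ is approximated by a truncated sum plus an integral tail estimate (an assumption of this sketch, not part of the proof). The names `zeta_prime_at_2` and `effective_constant` are hypothetical.

```python
from math import log, pi

def zeta_prime_at_2(terms=100_000):
    """Approximate zeta'(2) = -sum_{m>=2} log(m)/m^2.
    The tail beyond `terms` is estimated by the integral of log(t)/t^2
    from N to infinity, (log N + 1)/N, minus the Euler-Maclaurin half-term."""
    n = terms
    head = sum(log(m) / (m * m) for m in range(2, n + 1))
    tail = (log(n) + 1) / n - log(n) / (2 * n * n)
    return -(head + tail)

ZETA_2 = pi ** 2 / 6
C1 = ZETA_2 / 2                              # coefficient of x^2 / log x
C2 = ZETA_2 / 4 - zeta_prime_at_2() / 2      # coefficient of x^2 / log^2 x

def effective_constant(x):
    """Two-term approximation to (sum) / (x^2 / log x), i.e. C1 + C2 / log x."""
    return C1 + C2 / log(x)

if __name__ == "__main__":
    print("C1 =", C1)
    print("C2 =", C2)
    print("two-term constant at x = 5e9:", effective_constant(5e9))
```

With only these two terms the value at $x = 5 \times 10^9$ comes out at roughly $0.862$, which is consistent with the claim that the $x^2/\log^2 x$ term accounts for most of the gap between $\pi^2/12 \approx 0.8225$ and the observed $0.86$.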

Dermot Craddock