Explanation of all the incomplete details on Apery's theorem proof following van der Poorten article.

Question

I'm trying to fully understand Apéry's proof of the irrationality of $\zeta(3)$, and after looking for a good source I ended up reading van der Poorten's article A Proof that Euler missed...

I think the paper is really well explained but, at some points, it omits too many details (at least for me), so I'm going to state all my doubts as clearly as I can. I hope that, together with the other posts listed below and some good answers, this can become a useful source for anyone looking for details related to this paper.

Section 3

Posts with some omitted details in section 3.

Convergence of the sum $\sum_{k=1}^{N} \frac{(-1)^k}{(2k^3) \binom{N+k}{k} \binom {N}{k}}$

Proof that $\sum_{k=1}^\infty\frac{a_1a_2\cdots a_{k-1}}{(x+a_1)\cdots(x+a_k)}=\frac{1}{x}$

Section 4

Section 4 defines the main sequence of the proof, namely, for $k \leq n$

$$c_{n,k}=\sum_{m=1}^n \dfrac{1}{m^3}+\sum_{m=1}^k \dfrac{(-1)^{m-1}}{2 m^3 {n\choose m}{n+m\choose m}}$$

Using the answer to the first post mentioned above, it is easy to see that it converges to $\zeta(3)$ uniformly in $k$, that is:

Given $\varepsilon >0$ there exists an $n_0 \in \mathbb{N}$ such that if $n \geq n_0$ then $$\left| c_{n,k} - \zeta(3) \right| \leq \varepsilon \hspace{2mm} \forall k \leq n$$

Now, if we take $k=n$, we have that $c_{n,n}$ converges to $\zeta(3)$, and in fact we get the series discussed in section 3. After that, however, the paper states that this series does not converge fast enough to prove the irrationality of $\zeta(3)$.

To explain that, he proves a lemma stating that $2[1,2, \cdots, n]^3 {n+k \choose k} c_{n,k}$ is an integer, where $[1,2, \cdots, n]=\mathrm{lcm}(1, 2, \cdots, n)$. So, for some sequence of integers $z_{n,k}$, we can write

$$c_{n,k}=\dfrac{z_{n,k}}{2[1,2, \cdots, n]^3 {n+k \choose k}}$$

It is then stated that given $\varepsilon>0$ for $n$ large enough

$$[1,2, \cdots, n] \leq e^{n(1+\varepsilon)}$$

(which can be rigorously proven using the prime number theorem and the sketch given below the assertion) and from here, he states that this sequence has too large a denominator to prove the irrationality.
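As a rough numerical sanity check of this bound (not a proof), the following Python sketch computes $\log[1,2,\cdots,n]/n$, which the prime number theorem says tends to $1$. The function name `log_lcm_over_n` is just an illustrative choice, and `math.lcm` assumes Python 3.9 or later.

```python
import math


def log_lcm_over_n(n: int) -> float:
    """Return log(lcm(1, ..., n)) / n for n >= 1."""
    running_lcm = 1
    for m in range(1, n + 1):
        running_lcm = math.lcm(running_lcm, m)
    # math.log accepts arbitrarily large Python integers.
    return math.log(running_lcm) / n


if __name__ == "__main__":
    for n in (10, 100, 1000, 5000):
        print(n, round(log_lcm_over_n(n), 4))
```

The printed values should drift towards $1$, which is consistent with $[1,2,\cdots,n] \leq e^{n(1+\varepsilon)}$ for large $n$, although the convergence is slow.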

So, my first doubt:

Doubt 1: How can we explain in detail, based on what is said above, that the sequence $c_{n,k}$ is not enough to prove the irrationality of $\zeta(3)$? (Solved)

After that, it explains Apéry's process to accelerate the convergence of the series, which consists of applying several transformations to the sequence $c_{n,k}$ until we get two sequences (the second one has no name in the paper, so I call it $e_{n,k}^{(i)}$, by analogy with the first):

$$d_{n,k}^{(5)}=\sum_{h=0}^{k}\sum_{l=0}^h {k \choose h} {n \choose h} {h \choose l} {n \choose l} {2n-l \choose n} c_{n,n-l}$$ $$e_{n,k}^{(5)}=\sum_{h=0}^{k}\sum_{l=0}^h {k \choose h} {n \choose h} {h \choose l} {n \choose l} {2n-l \choose n}$$

Now, my doubts here are:

Doubt 2: Why is it true that, throughout this process, the quotient $d_{n,n}^{(i)}/e_{n,n}^{(i)}$ still converges to $\zeta(3)$? (Solved)

I don't know how to handle the limit when we divide by a sum.

Doubt 3: How do we get that $2[1,2, \cdots, n]^3 d_{n,k}^{(i)}$ is still an integer? (Solved)

Doubt 4: Why does this process, in an intuitive way, accelerate the convergence of the sequence?

Section 5

In this section, it takes $a_n=d_{n,n}^{(5)}$, $b_n=e_{n,n}^{(5)}$ and, assuming they satisfy the recursion stated at the beginning of the paper, proves the irrationality of $\zeta(3)$.

After some manipulations, it is shown that

$$\zeta(3) - \dfrac{a_n}{b_n}= \sum_{k=n+1}^\infty \dfrac{6}{k^3b_kb_{k-1}}$$

so (correct me if I'm wrong in the next reasoning)

$$\zeta(3) - \dfrac{a_n}{b_n} \leq \dfrac{1}{b_n^2}\sum_{k=n+1}^\infty \dfrac{6}{k^3} \leq \dfrac{1}{b_n^2}\sum_{k=1}^\infty \dfrac{6}{k^3}$$

and we get that $\zeta(3) - \dfrac{a_n}{b_n}=O(b_n^{-2})$.

From here on, I don't really follow much of the section.

Doubt 5: How can we prove, based on the equation stated for $b_n$, that $b_n=O(\alpha^n)$?

Doubt 6: How can we prove that $q_n=O(\alpha^n e^{3n})$?

I thought this one was because $[1, 2, \cdots, n]=O(e^n)$ but, as was pointed out to me here, that's not true, so I don't know how he gets that relation.

Doubt 7: How do we get these two equalities $\zeta(3) - \frac{p_n}{q_n}=O(\alpha^{-2n})=O(q_n^{-(1+\delta)})$ with $\delta=(\log(\alpha)-3)/(\log(\alpha)+3)$?

Sections 6 and 8

In section 6, after defining $a_n$ and $b_n$, it states that it is easy to prove that their quotient converges to $\zeta(3)$ but, as I asked in doubt 2, I don't know how to treat the quotient of two sums to get what we want. I suppose the same answer as for doubt 2 will be useful here, so I won't state it again. At the beginning of section 8, it shows the relation between this $b_n$ and $e_{n,n}^{(5)}$, but I don't know why the following equalities are true:

Doubt 8: $\sum_{k=0}^{n}\sum_{l=0}^k {n \choose k}^2 {n \choose l} {k \choose l} {2n-l \choose n}=\sum_{k=0}^n {n \choose k}^2{2n-k \choose n}^2$

Doubt 9: $\sum_{k=0}^{n}\sum_{l=0}^k {n \choose k}^2 {n \choose l} {k \choose l} {2n-l \choose n} c_{n,n-l}=\sum_{k=0}^n {n \choose k}^2{2n-k \choose n}^2 c_{n,n-k}$
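The first of these equalities (Doubt 8) can at least be sanity-checked numerically. The sketch below compares both sides for small $n$ and also tests the inner identity $\sum_{k=l}^{n}{n \choose k}^2{k \choose l}={n \choose l}{2n-l \choose n}$, which is what one would need after swapping the order of summation. Using that inner identity is my assumption about a possible route, not something taken from the paper, and the function names are illustrative.

```python
from math import comb


def double_sum(n: int) -> int:
    """Left-hand side of Doubt 8."""
    return sum(
        comb(n, k) ** 2 * comb(n, l) * comb(k, l) * comb(2 * n - l, n)
        for k in range(n + 1)
        for l in range(k + 1)
    )


def single_sum(n: int) -> int:
    """Right-hand side of Doubt 8."""
    return sum(comb(n, k) ** 2 * comb(2 * n - k, n) ** 2 for k in range(n + 1))


def inner_identity_holds(n: int, l: int) -> bool:
    """Check sum_k C(n,k)^2 C(k,l) == C(n,l) C(2n-l,n) for fixed n and l."""
    left = sum(comb(n, k) ** 2 * comb(k, l) for k in range(l, n + 1))
    return left == comb(n, l) * comb(2 * n - l, n)


if __name__ == "__main__":
    for n in range(6):
        print(n, double_sum(n), single_sum(n))
```

A finite check like this proves nothing, but it does suggest that the inner identity (a Vandermonde-type convolution) is the step worth proving by hand.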

Now, the following doubts are all combinatorial. Maybe I'm missing some properties of binomial coefficients, but after pages full of expansions, I couldn't get the following equalities.

Given $B_{n,k}=4(2n+1)(k(2k+1)-(2n+1)^2){n \choose k}^2{n+k \choose k}^2$

Doubt 10: $B_{n,k}-B_{n,k-1}=(n+1)^3{n+1 \choose k}^2{n+1+k \choose k}^2-(34n^3+51n^2+27n+5){n \choose k}^2{n+k \choose k}^2+n^3{n-1 \choose k}^2{n-1+k \choose k}^2$
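Before attempting the algebra, the identity in Doubt 10 can be tested with exact integer arithmetic. The sketch below assumes the convention $B_{n,-1}=0$ (so that the case $k=0$ makes sense) and checks $1 \leq n \leq 12$, $0 \leq k \leq n+1$; the helper names `B`, `w` and `rhs` are illustrative.

```python
from math import comb


def w(n: int, k: int) -> int:
    """The weight C(n,k)^2 * C(n+k,k)^2 (zero when k > n)."""
    return comb(n, k) ** 2 * comb(n + k, k) ** 2


def B(n: int, k: int) -> int:
    """B_{n,k} as defined above, with the assumed convention B_{n,-1} = 0."""
    if k < 0:
        return 0
    return 4 * (2 * n + 1) * (k * (2 * k + 1) - (2 * n + 1) ** 2) * w(n, k)


def rhs(n: int, k: int) -> int:
    """Right-hand side of Doubt 10 (requires n >= 1)."""
    middle = 34 * n**3 + 51 * n**2 + 27 * n + 5
    return (n + 1) ** 3 * w(n + 1, k) - middle * w(n, k) + n**3 * w(n - 1, k)


if __name__ == "__main__":
    ok = all(
        B(n, k) - B(n, k - 1) == rhs(n, k)
        for n in range(1, 13)
        for k in range(n + 2)
    )
    print("Doubt 10 identity holds on the tested range:", ok)
```

Since $B_{n,n+1}=0$ as well, summing the identity over $0 \leq k \leq n+1$ makes the left-hand side telescope to zero, which is presumably how the recurrence for $\sum_k {n \choose k}^2{n+k \choose k}^2$ comes out.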

Now, I know how to derive that

$$c_{n,k}-c_{n-1,k}=\dfrac{1}{n^3}+\sum_{m=1}^k \dfrac{(-1)^m (m-1)!^2(n-m-1)!}{(n+m)!} \hspace{2mm}(*)$$

but I don't get the other subsequent equalities although the paper says it is clear.

Doubt 11: $$(*)=\dfrac{1}{n^3}+\sum_{m=1}^k\left( \dfrac{(-1)^m m!^2(n-m-1)!}{n^2(n+m)!}-\dfrac{(-1)^{m-1} (m-1)!^2(n-m)!}{n^2(n+m-1)!}\right)=\dfrac{(-1)^k k!^2(n-k-1)!}{n^2(n+k)!}$$

After all these definitions, it now defines $$A_{n,k}=B_{n,k}c_{n,k}+\dfrac{5(2n+1)(-1)^{k-1}k}{n(n+1)}{n \choose k}{n+k \choose k}$$ and states that (9) is equal to $A_{n,k}-A_{n,k-1}$.

Doubt 12: How is (9) equal to $A_{n,k}-A_{n,k-1}$?

And, to my dismay, even assuming the last equality and after three pages full of equalities, I've been unable to prove that $a_n$ satisfies the desired recursion.

Doubt 13: How can we show, using all the previous definitions, that $a_n$ satisfies the recurrence relation stated at the beginning of the paper?

I know there are a lot of doubts, but I've been working on this for 3 weeks now and I haven't been able to resolve any of these questions myself. I think the post can be very useful to anyone interested in this proof, because the details I'm trying to prove are not easy for a student reading the paper. It would be great if anyone interested could contribute so that we can answer all of them and have a really complete post about Apéry's proof. I will continue working on it anyway and, if I get some answers, I'll update the post.

I don't know if this is relevant, but another proof of Apéry's theorem was proposed in this paper: Wadim Zudilin, Apéry's theorem. Thirty years after, Int. J. Math. Comput. Sci. 4 (2009), no. 1 pp 9–19. (arXiv:math/0202159, with the title An elementary proof of Apéry's theorem) — J.-E. Pin, Mar 24 '20 at 10:11
@rgvalenciaalbornoz In the end I managed to write out the proof in full detail except for one small point, which is more or less doubt 5. What I couldn't prove by myself or find anywhere with a "simple" proof is that $b_n \sim A \alpha^n n^{-3/2}$ where $\alpha$ is the largest root of the polynomial $x^2-34x+1$ and $A$ is some positive constant. These asymptotics are here (http://archive.numdam.org/article/STNG_1977-1978__6__A6_0.pdf) on page 6. — Eparoh, Aug 06 '21 at 11:24
@rgvalenciaalbornoz By the way, by your user name I can see that you are from Spain, so if you would like I can share in private a work where all the details in the proof are well written (in Spanish). — Eparoh, Aug 06 '21 at 11:27
Indeed, that's tricky. The best account (also brief) of these asymptotics that I have seen is in Elaydi's book "An Introduction to Difference Equations", page 379. I've also answered here how to partly avoid this calculation and still reach the final result. — rgvalenciaalbornoz, Aug 08 '21 at 00:00

Thomas Bloom · Answer 1 · 2020-03-18T16:55:00.987

There are many questions here. I'll answer the first for now, and then hopefully return later to fill in some more of the gaps. This addresses Doubt 1. (EDIT: Also addressed Doubts 2 and 3.)

We define a sequence

$$ c_{n,k} = \sum_{m=1}^n\frac{1}{m^3} + \sum_{m=1}^k \frac{(-1)^{m-1}}{2m^3\binom{n}{m}\binom{n+m}{m}}.$$

As you say, $c_{n,n}\to \zeta(3)$ as $n\to\infty$, but this convergence is not fast enough to show that $\zeta(3)$ is irrational. Here van der Poorten has in mind the criterion for irrationality introduced at the beginning of the article:

If there is $\delta>0$ and a sequence $p_n/q_n$ of rational numbers such that $$ 0< \lvert \beta-\frac{p_n}{q_n}\rvert <\frac{1}{q_n^{1+\delta}}$$ then $\beta$ is irrational.

So let's say we want to show that $\zeta(3)$ is irrational using this criterion. We need to find some sequence $p_n/q_n$ which approximates $\zeta(3)$. Let $\epsilon_n = \zeta(3)-p_n/q_n$. To be able to apply this criterion we need to know that $\epsilon_n$ is small and $q_n$ is small also. More precisely, we need to have the estimate $\lvert \epsilon_n\rvert< q_n^{-1-\delta}$ for some absolute constant $\delta>0$.

First let's see why the trivial sequence $p_n/q_n=\sum_{m=1}^n \frac{1}{m^3}$ doesn't work. Obviously this converges to $\zeta(3)$. But $\epsilon_n \asymp \frac{1}{n^2}$, and the denominator $q_n$ grows like $[1,\ldots,n]^3\approx e^{3n}$. And there is certainly no constant $\delta>0$ such that $n^{-2} < (e^{3n})^{-1-\delta}$ is true for all large $n$.

So instead we use the sequence $c_{n,n}$. As you say, van der Poorten proves the estimate $$ q_n \ll [1,2,\ldots,n]^3\binom{2n}{n} \approx e^{3n}4^n.$$

The denominators here are actually larger than those for the trivial sequence above. But the advantage is that the speed of convergence is much faster now.

As noted on page 201 we have $$ c_{n,k}-c_{n-1,k} = \frac{(-1)^k(k!)^2(n-k-1)!}{n^2(n+k)!}$$ and trivially $$ c_{n,k} - c_{n,k-1} = \frac{(-1)^{k-1}(k!)^2(n-k)!}{2k^3(n+k)!}.$$
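Both difference formulas can be checked with exact rational arithmetic. This is only a sanity check for small $n$, and the function names are illustrative. The first formula is tested for $0\leq k\leq n-1$ (so that $c_{n-1,k}$ is defined) and the second for $1\leq k\leq n$.

```python
from fractions import Fraction
from math import comb, factorial


def c(n: int, k: int) -> Fraction:
    """c_{n,k} as defined above, for 0 <= k <= n."""
    cubes = sum(Fraction(1, m**3) for m in range(1, n + 1))
    correction = sum(
        Fraction((-1) ** (m - 1), 2 * m**3 * comb(n, m) * comb(n + m, m))
        for m in range(1, k + 1)
    )
    return cubes + correction


def step_in_n(n: int, k: int) -> Fraction:
    """Claimed value of c_{n,k} - c_{n-1,k}, for 0 <= k <= n - 1."""
    return Fraction(
        (-1) ** k * factorial(k) ** 2 * factorial(n - k - 1),
        n**2 * factorial(n + k),
    )


def step_in_k(n: int, k: int) -> Fraction:
    """Claimed value of c_{n,k} - c_{n,k-1}, for 1 <= k <= n."""
    return Fraction(
        (-1) ** (k - 1) * factorial(k) ** 2 * factorial(n - k),
        2 * k**3 * factorial(n + k),
    )


if __name__ == "__main__":
    print(c(2, 1) - c(1, 1), step_in_n(2, 1))
    print(c(2, 2) - c(2, 1), step_in_k(2, 2))
```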

Therefore

$$ c_{n,n}-c_{n-1,n-1} = (c_{n,n}-c_{n,n-1})+(c_{n,n-1}-c_{n-1,n-1}) = \frac{(-1)^{n-1}}{\binom{2n}{n}}\left( \frac{1}{2n^3}+\frac{2}{n^3}\right)=\frac{5(-1)^{n-1}}{2n^3\binom{2n}{n}}$$

and $\epsilon_n-\epsilon_{n-1}$ is the negative of this. It follows by induction (since $\epsilon_n\to 0$ and $\epsilon_0=\zeta(3)$) that

$$ \epsilon_n = \zeta(3) - c_{n,n} = \sum_{k=n+1}^\infty\frac{5(-1)^{k-1}}{2k^3\binom{2k}{k}}.$$

The denominators here grow rapidly - indeed, $\binom{2n}{n}\asymp 4^n/n^{1/2}$. It follows that $\lvert \epsilon_n\rvert \ll 4^{-n}$.
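The estimate $\binom{2n}{n}\asymp 4^n/n^{1/2}$ follows from Stirling's formula, which in fact gives the constant $1/\sqrt{\pi}$. A small numerical illustration (the function name is just for this sketch):

```python
from math import comb, pi, sqrt


def normalized_central_binomial(n: int) -> float:
    """Return C(2n, n) * sqrt(n) / 4^n, which should approach 1/sqrt(pi)."""
    # Integer true division keeps this accurate even when 4**n is huge.
    return comb(2 * n, n) / 4**n * sqrt(n)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, normalized_central_binomial(n), 1 / sqrt(pi))
```

Since the normalized quantity stays between two positive constants, each term of the series for $\epsilon_n$ is $\ll 4^{-k}$, and summing the geometric tail gives the bound $\ll 4^{-n}$.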

Thus the speed of convergence is much faster, and is now exponential in $n$, not polynomial in $n$, so we are much closer to being able to prove irrationality. The sequence $c_{n,n}$ does not converge fast enough, however, since $4 < e^3\cdot 4$, so we fail to have $\lvert \epsilon_n\rvert < q_n^{-1-\delta}$ for some $\delta>0$ as required.

Thus the need to 'amplify' the convergence. The key part of Apery's proof is to speed up the convergence without increasing the denominators too much. With the refined sequence Apery uses (as explained by van der Poorten) we now have $q_n\approx (e^3\alpha)^n$ and $\epsilon_n \approx \alpha^{-2n}$, where $\alpha=(1+\sqrt{2})^4$. This is sufficient to work for irrationality since now $\alpha^2 > e^3\alpha$.
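To make these numbers concrete, here is a small sketch that evaluates $\alpha$, checks $\alpha^2>e^3\alpha$ (equivalently $\log\alpha>3$), and computes the first few $a_n/b_n$ exactly. It uses the sums $b_n=\sum_k\binom{n}{k}^2\binom{2n-k}{n}^2$ and $a_n=\sum_k\binom{n}{k}^2\binom{2n-k}{n}^2c_{n,n-k}$ from the question, and the $\delta$ quoted in Doubt 7. The constant `ZETA3` is the known decimal value of $\zeta(3)$, hard-coded here as an assumption of the sketch.

```python
from fractions import Fraction
from math import comb, exp, log, sqrt

ZETA3 = 1.2020569031595942  # known numerical value of zeta(3), for comparison only

alpha = (1 + sqrt(2)) ** 4  # equals 17 + 12*sqrt(2), larger root of x^2 - 34x + 1
delta = (log(alpha) - 3) / (log(alpha) + 3)


def c(n: int, k: int) -> Fraction:
    """c_{n,k} for 0 <= k <= n."""
    cubes = sum(Fraction(1, m**3) for m in range(1, n + 1))
    correction = sum(
        Fraction((-1) ** (m - 1), 2 * m**3 * comb(n, m) * comb(n + m, m))
        for m in range(1, k + 1)
    )
    return cubes + correction


def b(n: int) -> int:
    return sum(comb(n, k) ** 2 * comb(2 * n - k, n) ** 2 for k in range(n + 1))


def a(n: int) -> Fraction:
    return sum(
        comb(n, k) ** 2 * comb(2 * n - k, n) ** 2 * c(n, n - k) for k in range(n + 1)
    )


if __name__ == "__main__":
    print("alpha =", alpha, " log(alpha) =", log(alpha), " delta =", delta)
    print("alpha^2 > e^3 * alpha:", alpha**2 > exp(3) * alpha)
    for n in range(1, 7):
        print(n, b(n), ZETA3 - float(a(n) / b(n)))
```

The printed errors shrink by a factor of roughly $\alpha^{2}\approx 10^{3}$ per step, in line with $\epsilon_n\approx\alpha^{-2n}$, until they reach floating-point precision.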

EDIT:

Doubt 2 concerns the behaviour of the accelerated series. Let $d_{n,k}^{(0)} = c_{n,k}\binom{n+k}{k}$ and $e_{n,k}^{(0)}=\binom{n+k}{k}$. Then it is true that $d_{n,k}^{(0)}/e_{n,k}^{(0)}\to \zeta(3)$ as $n\to \infty$ uniformly in $k$, for all $k\leq n$. The accelerated version is built up in stages:

$$ d_{n,k}^{(1)} = d^{(0)}_{n,n-k}$$ $$ d_{n,k}^{(2)} = \binom{n}{k}d^{(1)}_{n,k}$$ $$ d_{n,k}^{(3)} = \sum_{l=0}^k \binom{k}{l}d_{n,l}^{(2)} $$ $$ d_{n,k}^{(4)} = \binom{n}{k}d^{(3)}_{n,k} $$ $$ d_{n,k}^{(5)} = \sum_{l=0}^k \binom{k}{l}d_{n,l}^{(4)},$$

and similarly for $e_{n,k}$. Note that there are only three distinct transformations involved here. I claim that each transformation preserves the property that $d_{n,k}/e_{n,k}\to \zeta(3)$ as $n\to \infty$, uniformly in $k$, for all $k\leq n$. The first such transformation is clear since the convergence is uniform in $k$. The second transformation is also clear since it's just a scalar multiple of each sequence. The third preserves the limit by uniformity.

Indeed, suppose that $d_{n,k}/e_{n,k}=\zeta(3)(1+ \delta_{n,k})$ where $\lvert \delta_{n,k}\rvert \leq \epsilon_n$ for all $k\leq n$ and $\epsilon_n\to 0$ as $n\to \infty$. Then applying the transformation with the sum yields the new ratio

$$ \frac{ \sum_{l=0}^k \binom{k}{l}d_{n,l} }{\sum_{l=0}^k\binom{k}{l}e_{n,l}} = \zeta(3)\left(1+\frac{ \sum_{l=0}^k \binom{k}{l}\delta_{n,l}e_{n,l}}{\sum_{l=0}^k \binom{k}{l}e_{n,l}}\right).$$ The second term inside the brackets is at most $\epsilon_n$ (note that the sequence $e_{n,k}$ remains positive) and we have uniform convergence as before.

Doubt 3 is also easily dispatched at this point. van der Poorten proves that $2[1,\ldots,n]^3d_{n,k}^{(0)}\in\mathbb{Z}$ for all $k\leq n$. Since each transformation is a linear transformation with integer coefficients this property remains true in each transformation (note that we never try to change the $n$ subscript, only the behaviour in terms of $k$).
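The starting integrality claim, $2[1,\ldots,n]^3d_{n,k}^{(0)}\in\mathbb{Z}$ with $d_{n,k}^{(0)}=c_{n,k}\binom{n+k}{k}$, is easy to test exactly for small $n$. The sketch below does only that; it is a check, not a replacement for van der Poorten's lemma. The function names are illustrative and `math.lcm` assumes Python 3.9 or later.

```python
from fractions import Fraction
from math import comb, lcm


def c(n: int, k: int) -> Fraction:
    """c_{n,k} for 0 <= k <= n."""
    cubes = sum(Fraction(1, m**3) for m in range(1, n + 1))
    correction = sum(
        Fraction((-1) ** (m - 1), 2 * m**3 * comb(n, m) * comb(n + m, m))
        for m in range(1, k + 1)
    )
    return cubes + correction


def scaled_d0(n: int, k: int) -> Fraction:
    """Return 2 * lcm(1..n)^3 * C(n+k,k) * c_{n,k}."""
    L = lcm(*range(1, n + 1)) if n >= 1 else 1
    return 2 * L**3 * comb(n + k, k) * c(n, k)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, [scaled_d0(n, k) for k in range(n + 1)])
```

Every printed value should have denominator 1. The later stages $d^{(1)},\ldots,d^{(5)}$ are integer linear combinations of these values at the same $n$, so they inherit the same property.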

Unfortunately I don't have a good answer to Doubt 4 myself! I think, to me, this is part of the magic and ingenuity behind Apery's proof, as expressed by van der Poorten. We have seen that the denominators don't really change under these transformations. The fact that the speed of convergence has jumped so dramatically is essentially because the (diagonal terms of the) transformed sequences obey the remarkable recurrence which is (3) in van der Poorten's article, the key point being that the recurrence is only of 'polynomial' complexity, moreover only of 'degree' 3.

It is the simplicity of this recurrence which leads to such rapid convergence (see the left-hand column of page 199). Note that the sequence of $c_{n,n}$ does not satisfy any such simple 'polynomial' recurrence. It is really incredible that the transformed diagonal sequence $d_{n,n}^{(5)}$ does, as I think van der Poorten manages to convey quite well in this article.

Thanks for such a detailed answer! I think I've understood almost everything you have posted but I'm not sure about some notation. What do the symbols $ \asymp$ and $\ll$ mean exactly? And, what do you mean exactly by $\approx$? It would be great if you could also answer some of my other doubts in such a great way. — Eparoh, Mar 18 '20 at 13:55
The $\asymp$ and $\ll$ are sometimes known as Vinogradov's notation, where $f\ll g$ means the same as $f=O(g)$, and $f\asymp g$ means $f\ll g\ll f$. The use of $\approx$ is deliberately vague, since this is only a heuristic sketch - I suppose in this context $\approx C^n$ means $=C^{(1+o(1))n}$, for example. — Thomas Bloom, Mar 18 '20 at 14:00
Also thanks Gerry Myerson for correcting the name, don't know what I was thinking! — Thomas Bloom, Mar 18 '20 at 14:01
Thanks again, doubt 2 and 3 are fully solved for me now. For doubt 1 I still have some questions. Here, $q_n \ll [1,2,\ldots,n]^3\binom{2n}{n} \approx e^{3n}4^n$, how do we know that $q_n \ll [1,2,\ldots,n]^3\binom{2n}{n}$? I mean, I know that $q_n$ is a divisor of $[1,2,\ldots,n]^3\binom{2n}{n}$ for every $n$ but, how do we get that, for some constant $C>0$ and large enough $n$ is $q_n \leq C [1,2,\ldots,n]^3\binom{2n}{n}$. And, why is $\binom{2n}{n} \approx 4^n$ if $\binom{2n}{n} / 4^n$ goes to zero? Lastly, how do we prove $\binom{2n}{n}\asymp 4^n/n^{1/2}$? — Eparoh, Mar 18 '20 at 16:47
And, in this expression $\epsilon_n = \zeta(3) - c_{n,n} = \sum_{k=n+1}^\infty\frac{(-1)^n(2n-1)}{2n^3\binom{2n}{n}}$ I think that we have to replace $n$ by $k$ in every term inside the series. Also, I don't quite get how you conclude that $\lvert \epsilon_n\rvert \ll 4^{-n}$ from that expression. As you can see, my greatest weakness in understanding the proof is all the asymptotic behaviour. If you know of some good book or lectures on these topics I would appreciate it very much. — Eparoh, Mar 18 '20 at 16:51
Since $q_n$ is a divisor of $[1,\ldots,n]^3\binom{2n}{n}$ it is trivial that $q_n\leq [1,\ldots,n]^3\binom{2n}{n}$. As I said, in this context $\approx$ ignores polynomial factors, so $\binom{2n}{n}\approx 4^n/n^{1/2}\approx 4^n$. The fact that $\binom{2n}{n}\asymp 4^n/n^{1/2}$ is a standard binomial coefficient estimate, and follows from Stirling's formula, for example. — Thomas Bloom, Mar 18 '20 at 16:52
Yes, the $n$ should be $k$ in the series, I'll fix that. The $\ll 4^{-n}$ is because we can estimate the series, using $\binom{2k}{k} \gg 4^k/k^{1/2}$, by $\ll \sum_{k>n}4^{-k}\ll 4^{-n}$. — Thomas Bloom, Mar 18 '20 at 16:54
I still don't get one thing for $c_{n,n}$. If $q_n \ll e^{3n}4^n$ and $\lvert \epsilon_n\rvert \ll 4^{-n}$ then we have that for $n$ big enough it is $1/q_n^{1+\delta} \geq C \frac{1}{(e^{3n}4^n)^{1+\delta}}$ and $\lvert \epsilon_n\rvert \leq K 4^{-n}$, so we cannot end up with $\lvert \epsilon_n\rvert \geq \frac{1}{q_n^{1+\delta}}$ from those inequalities. How can we do that? — Eparoh, Mar 23 '20 at 09:54
Right, technically, to show definitely that it won't work you actually need the inequalities the other way round, and show that $q_n \gg e^{3n}4^n$ and $\epsilon_n \gg 4^{-n}$. I believe that this is possible, but it's not obvious from the above sketch. The point of the sketch (and van der Poorten's comments) is not really to prove definitively that the sequence converges too slowly to prove irrationality, but rather to show that the type of estimates one gets are not enough to prove irrationality. — Thomas Bloom, Mar 23 '20 at 13:24
I've studied it carefully and I think you made a mistake in the calculation of $c_{n,n}-c_{n-1,n-1}$. $c_{n,n}$ is in fact the partial sums of the series (1) at the beginning of the article. Maybe with this easier series it is possible to bound it below with the $4^{-n}$. — Eparoh, Mar 24 '20 at 11:16
It's certainly possible I made a mistake in my manipulations somewhere, but even if $c_{n,n}$ can in fact be shown to be equal to the partial sum of (1) (it is not by definition), it doesn't change the growth of the series. Possibly this makes it easier to bound below, but still not trivial, I think. Nonetheless, a proof of the lower bound $4^{-n}$ is not necessary for this heuristic, since it's only meant to be a rough sketch of why the 'easier' approach fails to prove irrationality. — Thomas Bloom, Mar 24 '20 at 11:20
