How to approximate an expression containing very large numbers?

Question

Ok so I have today found an expression that wolfram alpha breaks on. I think so anyhow.

I would like to evaluate (very roughly)

$$1-(\frac{1}{2^{1021}})^{6*10^{10}}\frac{2^{1021}!}{(2^{1021}-6*10^{10})!}$$

Wolfram Alpha answers that this expression is $1$ which is wrong and is the first expressions I've seen it stumble on (quite badly too).

The reason why it's wrong is that unless I made a mistake somewhere this is (roughly) the chance of getting a collision when choosing $6*10^{10}$ primes of length $2^{1024}$ uniformly at random. Since you should get a probability of one half after choosing roughly $2^{510}$ primes you should be getting something very close to zero after choosing "only" 60 billion. I have a few ideas how to approximate the result but the factorials are giving me trouble.

Any ideas on how to do this will be welcomed. Obviously any answer that can bound the number within a couple orders of magnitude would be great.

As an aside: how do you know that this is the first expression that Wolfram Alpha has miscalculated for you? — Rob Arthan, Aug 10 '16 at 20:26
@RobArthan Well to be perfectly honest I don't, I don't tend to check every digit when I compute things. On the other hand I do tend to sanity check answers so I'm fairly confident it is the first one. — DRF, Aug 10 '16 at 20:30
Thanks for the info. It's interesting to hear opinions about the quality of results from computer algebra systems. — Rob Arthan, Aug 10 '16 at 20:55
You could write a Java program to calculate this for you using the BigDecimal class. — kamoroso94, Aug 10 '16 at 21:19
@kamoroso94 I'm not absolutely sure, but I would bet a 100$ that without some serious optimization I would end up with the same answer Wolfram Alpha got, I mean it's not really written by idiots. — DRF, Aug 10 '16 at 21:25
@DRF Yeah, but Wolfram has many people making requests to their servers and you're communicating over a network, with a limit on the CPU time for your calculation. If you do it yourself, you can wait as long as you like without dealing with those issues. — kamoroso94, Aug 10 '16 at 21:27
@kamoroso94 Hmm fair enough. It might be pertinent I'm actually a subscriber so while the time is still limited it's not super low. Also Wolfram gets that result quite quickly. I don't know what their rounding/cutoff place is but if you look at the numbers you're going to be getting they will be super annoyingly big. No matter how I order the computation off the top of my head I'm seeing roughly 60gigs of space for the numbers. — DRF, Aug 10 '16 at 21:30
60GB of space for the numbers? $2^1021!$ would use $\frac{1021×1022}{2}=521731$ bits or about $65$KB. — kamoroso94, Aug 10 '16 at 21:45
FYI, I got WolframAlpha to compute something that resulted in essentially $^{45}10;$ ($10$ tetrated $45)$ with this, although the computation was probably very trivial since each additional factorial essentially exponentiates what you had before. — Dave L. Renfro, Aug 10 '16 at 21:48
@kamoroso94 $2^{1021}!$ multiplies $2^{1021}$ factors together which on average each have a length of 1021 bits. so it's not $\frac{1021\times 1022}{2}$ but rather $\frac{2^{1021}\times 1021}{2}$ — DRF, Mar 10 '17 at 06:46
@RobArthan: Check out this amusing example of how and why WA gets limits wrong. The reason is that WA is basically a product of programmers who do not really know the mathematics behind algorithms and heuristics that they implement, so they always get it wrong. — user21820, Mar 10 '17 at 10:48

score 3 · Accepted Answer · answered Aug 11 '16 at 00:12

Using $n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^n$, $\ln n! \approx n\ln(n) - n + \tfrac12\ln(2\pi n)$ and $\ln(1+x) \approx x-\tfrac12x^2$ for small $x$, and some liberal use of these and other approximations, we might get something like

$\dfrac{m!}{(m-k)!} $ $\approx \exp\left(m\ln(m) - m + \tfrac12\ln(2\pi m) - (m-k)\ln(m-k) + (m-k) - \tfrac12\ln(2\pi (m-k)) \right)$ $=\exp\left(k\ln(m)+\left(m-k+\tfrac12\right)\ln\left(1+\tfrac{k}{m-k} \right) -k\right)$ $\approx m^k\exp\left(\left(m-k+\tfrac12\right)\left(\tfrac{k}{m-k}-\tfrac12\left(\tfrac{k}{m-k}\right)^2 \right) -k \right) \qquad$ $= m^k\exp\left( -\tfrac12 \tfrac{k^2}{m-k} +\tfrac12 \tfrac{k}{m-k}-\tfrac14\tfrac{k^2}{(m-k)^2} \right) \approx m^k\exp\left( -\tfrac12 \tfrac{k^2}{m-k} \right)$

so $1-m^{-k}\tfrac{m!}{(m-k)!} \approx \tfrac12 \tfrac{k^2}{m-k} $ when $k^2 \ll m$, in which case we may as well use $$1-m^{-k}\dfrac{m!}{(m-k)!} \approx \dfrac{k^2}{2m}.$$

As an example with $m=10^6$ and $k=200$, we would have $\tfrac{k^2}{2m} =0.02$ while in fact $1-m^{-k}\tfrac{m!}{(m-k)!} \approx 0.0197046$.

In your case with setting $m=2^{1021}$ and $k=6\times 10^{10}$, we would have $\tfrac{k^2}{2m} \approx 8 \times 10^{-287}$ which, as you suspected, is close to $0$.

r.e.s. · Answer 2 · 2023-05-05T17:35:23.423

2

We want $1-f(n,m)$, where $f(n,m) = \frac{n!}{(n-m)!} n^{-m}$, for $n = 2^{1021},\ \ m=6\cdot10^{10}$.

Using SageMath to first compute $\log(f(n,m))$:

R = RealField(prec=2000)  # 2000-bit floating-point precision
def log_factorial(n):
    x = n + 1
    return (x - 1/2)*log(x) - x + log(2*pi)/2 + 1/(12*x)
def log_f(n,m): return R(log_factorial(n) - log_factorial(n-m) -m*log(n))
n = 2^1021
m = 6*10^10
print(f"    log_f(n,m) = {log_f(n,m):+.6e}")
print(f"1-e^log_f(n,m) = {1-e^log_f(n,m):+.6e}")

with output

    log_f(n,m) = -8.010266e-287
1-e^log_f(n,m) = +8.010266e-287

This is (to seven significant figures) the same numerical value obtained in the accepted answer.

edited May 05 '23 at 17:35

answered Aug 10 '16 at 21:23

r.e.s.

15,537

So... very very very close to one? – Pedro Aug 10 '16 at 21:35
I like the try and +1 to you but I feel we should be getting a bound in at least double digits of 0's beyond the decimal point. Still good try. – DRF Aug 10 '16 at 21:41
@PedroTamaroff - No, the final line of output is $\quad 1-\exp(\log(f(n,m)))=1-f(n,m)\approx 0$. – r.e.s. Aug 10 '16 at 21:43
@DRF - I've edited to show the results of increasing the precision (to 200 bits) for the real number representations in Sage. I think the number of terms used in the computation of the log-factorial function is adequate for this. – r.e.s. Aug 10 '16 at 21:53

How to approximate an expression containing very large numbers?

2 Answers2

Linked