
I was investigating Fibonacci numbers and looking for the first occurrence of the substring "787" within the sequence. I found that the 49th Fibonacci number, 7778742049, is the first to contain "787".
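A minimal sketch of the search described above, assuming the indexing $F_1 = F_2 = 1$; the function name `first_fibonacci_containing` and the `limit` parameter are illustrative choices, not part of the original computation.

```python
from typing import Optional


def first_fibonacci_containing(pattern: str, limit: int = 10_000) -> Optional[int]:
    """Return the smallest n <= limit such that the decimal digits of F_n
    contain `pattern`, or None if no such n is found (F_1 = F_2 = 1)."""
    a, b = 1, 1  # a = F_1, b = F_2
    for n in range(1, limit + 1):
        if pattern in str(a):
            return n
        a, b = b, a + b  # advance to (F_{n+1}, F_{n+2})
    return None


print(first_fibonacci_containing("787"))  # 49, since F_49 = 7778742049
```

The default `limit` keeps $F_n$ at roughly 2,100 digits, which stays under the default integer-to-string conversion limit in recent Python versions.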

This led me to wonder:

Is it possible to determine the probability that a randomly chosen Fibonacci number contains the substring "787"? If so, what would be the appropriate probabilistic approach to estimate or calculate this probability?

Any insights or references would be appreciated!

  • First of all, you can't talk about (uniformly chosen) random elements of a countable set. You can only talk about density of a subset in an order, which is the limit of a probability. The probability would be "I select a random integer $K$ from $1$ to $n.$ Determine the probability, $p_n,$ that $F_K$ has a given digit pattern." Then you compute the limit of $p_n$ as $n\to\infty.$ It is common to interpret that limit as a probability, but there are problems with that interpretation. – Thomas Andrews Feb 27 '25 at 17:52
  • Second, digit patterns really have little to do with a number's algebraic properties, because base $10$ notation is an arbitrary choice of notations, not just of the base, $10,$ but of the choice to use a base notation at all. So it is likely this is the same density as the density of all natural numbers, so the density of $j$ digits being in a number would be $1/10^j.$ There might be a reason for otherwise: Maybe due to the relationship between Fibonacci and $\sqrt5$ and the importance of $5$ in base $10,$ but it seems unlikely. – Thomas Andrews Feb 27 '25 at 17:58
  • Third: It is, however, very tricky to show such things, in general. I strongly doubt that it is known. – Thomas Andrews Feb 27 '25 at 18:00
  • Whoops, I'm wrong about the density in all natural numbers. The density in all integers is $1,$ not $1/10^j.$ For example, in a random $1,000,000$ digit natural number, the probability is very high that you will get every $3$-digit sequence. – Thomas Andrews Feb 27 '25 at 18:06
  • Quite different from your question, but related. Just the standard fact that any sequence of numbers appears at the beginning of some Fibonacci number; not addressing the distribution of the first occurrence at all. – Jyrki Lahtonen Feb 27 '25 at 18:39
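The quantity $p_n$ from the first comment can be computed directly for small $n$. The sketch below assumes $F_1 = F_2 = 1$, and the helper name `pattern_density` is hypothetical. It only reports empirical values and says nothing rigorous about the limit.

```python
def pattern_density(pattern: str, n: int) -> float:
    """p_n: fraction of indices K in 1..n for which F_K contains `pattern`."""
    a, b = 1, 1  # a = F_1, b = F_2
    hits = 0
    for _ in range(n):
        if pattern in str(a):
            hits += 1
        a, b = b, a + b
    return hits / n


# F_5000 has about 1,045 digits; on Python 3.11+ the default int-to-str
# limit is 4,300 digits, so keep n below roughly 20,000 unless it is raised.
for n in (50, 500, 5000):
    print(n, pattern_density("787", n))
```

If the reasoning in the fourth comment carries over to Fibonacci numbers, one would expect these values to drift toward $1$ as $n$ grows, but that is a heuristic expectation rather than something the code establishes.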

1 Answer


Formalizing what is meant by "a random Fibonacci number" is difficult, but one can ask for a generalization of your computation: can we give an estimate on the smallest $n$ such that $F_n$ contains a given sequence of digits?

Heuristically (so I will not prove anything) we can argue as follows. Let's just assume that the digits of $F_n$ are random, with no predictable pattern (there will be some patterns in the last few digits but let's ignore this). This is very similar to the heuristic behind the use of pseudorandom number generators such as the linear congruential generator or linear-feedback shift register.

More precisely, we will adopt the heuristic probabilistic model that $F_n$ is a string of $\ell_n = \lfloor \log_{10} F_n \rfloor + 1$ digits, each chosen independently and uniformly at random from $\{ 0, \dots, 9 \}$. Under this model the expected number of occurrences of any given sequence $w = d_1 \dots d_k$ of $k$ digits in $F_n$ is

$$\frac{\ell_n - k + 1}{10^k} \approx \frac{\ell_n}{10^k}$$

meaning the expected number of occurrences of any given sequence of $k$ digits in the numbers $F_1, \dots, F_n$ is approximately

$$\sum_{i=1}^n \frac{\ell_i}{10^k} \approx \frac{1}{10^k} \sum_{i=1}^n \log_{10} F_i$$

which we can estimate as follows. By Binet's formula $F_n$ grows asymptotically like $\varphi^n$ where $\varphi = 1.618 \dots$ is the golden ratio; this gives

$$\log_{10} F_n \approx n \log_{10} \varphi = (0.209 \dots)\, n$$

and substituting this gives that the expected number of occurrences of a sequence of $k$ digits in $F_1, \dots, F_n$ is approximately (we've dropped many error terms here, but this is just to give the rough idea, and this is the correct dominant term):

$$\frac{1}{10^k} \sum_{i=1}^n i \log_{10} \varphi \approx \boxed{ \frac{\log_{10} \varphi}{10^k} \frac{n^2}{2} }.$$
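Two ingredients of this estimate can be checked numerically. The sketch below first verifies the per-number expectation $(\ell - k + 1)/10^k$ exactly by enumerating every digit string of a small length $\ell$, then compares the boxed dominant term with the actual number of occurrences in $F_1, \dots, F_n$. All function names are illustrative, overlapping occurrences are counted, and $F_1 = F_2 = 1$ is assumed.

```python
import math
from fractions import Fraction
from itertools import product

LOG10_PHI = math.log10((1 + math.sqrt(5)) / 2)


def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of `pattern` in `text`, overlapping ones included."""
    k = len(pattern)
    return sum(text[i:i + k] == pattern for i in range(len(text) - k + 1))


def mean_occurrences(pattern: str, length: int) -> Fraction:
    """Exact mean count over all 10**length digit strings of that length."""
    total = sum(count_occurrences("".join(digits), pattern)
                for digits in product("0123456789", repeat=length))
    return Fraction(total, 10 ** length)


def observed_total(pattern: str, n: int) -> int:
    """Occurrences of `pattern` summed over F_1, ..., F_n."""
    a, b = 1, 1  # a = F_1, b = F_2
    total = 0
    for _ in range(n):
        total += count_occurrences(str(a), pattern)
        a, b = b, a + b
    return total


def predicted_total(k: int, n: int) -> float:
    """Boxed dominant term: log10(phi) / 10**k * n**2 / 2."""
    return LOG10_PHI * n * n / (2 * 10 ** k)


# Step 1: uniform-digit model with length 5 and k = 3: (5 - 3 + 1) / 10**3.
print(mean_occurrences("787", 5))  # 3/1000

# Step 2: real Fibonacci digits against the dominant term.
for n in (100, 300, 1000):
    print(n, observed_total("787", n), round(predicted_total(3, n), 2))
```

The first check is exact because it relies only on linearity of expectation. The second is a comparison for a single pattern, so some scatter around the prediction is to be expected.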

Now we can estimate the first appearance of our sequence of $k$ digits $w$ by setting this expected value equal to $1$. This gives

$$\boxed{ n \approx \sqrt{ \frac{2 \cdot 10^k}{\log_{10} \varphi} } }.$$
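To get a feel for how this threshold scales with the pattern length $k$, the boxed formula can be tabulated. The function name `first_occurrence_estimate` is an illustrative choice, and the numbers are only as good as the heuristic behind them.

```python
import math


def first_occurrence_estimate(k: int) -> float:
    """Heuristic index n at which a fixed k-digit pattern is first expected:
    sqrt(2 * 10**k / log10(phi))."""
    phi = (1 + math.sqrt(5)) / 2
    return math.sqrt(2 * 10 ** k / math.log10(phi))


for k in range(1, 7):
    print(k, round(first_occurrence_estimate(k), 1))
```

Each extra digit in the pattern multiplies the estimate by $\sqrt{10} \approx 3.16$, so the first index grows like $10^{k/2}$ rather than $10^k$.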

To check this numerically against your calculation we have $w = 787$ so $k = 3$, which gives

$$n \approx \sqrt{ \frac{2000}{\log_{10} \varphi} } = 97.8 \dots $$

which overestimates the observed value $n = 49$ by a factor of about $2$, but is at least the right order of magnitude. This estimate could be improved by being more careful about the expected-value heuristic at the end; precise information is known about how long you have to wait for a given string to show up in a sequence of random letters.
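A single pattern is a noisy test of the heuristic, so it may be more informative to look at the first index for every $k$-digit pattern at once. The following sketch does that for $k = 3$; the names `fibonacci_strings` and `first_indices` and the search bound `limit = 2000` are assumptions made for illustration, and patterns not found within the bound are simply omitted.

```python
import math
from statistics import mean, median


def fibonacci_strings(limit: int) -> list:
    """Decimal strings of F_1, ..., F_limit (F_1 = F_2 = 1)."""
    out, a, b = [], 1, 1
    for _ in range(limit):
        out.append(str(a))
        a, b = b, a + b
    return out


def first_indices(k: int, limit: int = 2000) -> dict:
    """Map each k-digit pattern (leading zeros allowed) to the first n <= limit
    whose F_n contains it; patterns not found within the limit are skipped."""
    fibs = fibonacci_strings(limit)
    result = {}
    for value in range(10 ** k):
        pattern = f"{value:0{k}d}"
        for n, digits in enumerate(fibs, start=1):
            if pattern in digits:
                result[pattern] = n
                break
    return result


k = 3
found = first_indices(k)
phi = (1 + math.sqrt(5)) / 2
estimate = math.sqrt(2 * 10 ** k / math.log10(phi))

print("patterns found:", len(found))
print("mean first index:", round(mean(found.values()), 1))
print("median first index:", median(found.values()))
print("heuristic estimate:", round(estimate, 1))
print("first index for 787:", found["787"])
```

Comparing the mean and median with the heuristic value gives a rough sense of whether $49$ is typical for a three-digit pattern or on the early side.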

Qiaochu Yuan