Closed formula for probability of n-digit numbers containing three consecutive sixes

Question

I'm trying to find a closed formula $f(n)$ for the probability of choosing a number with $n$ digits that contains at least three consecutive sixes. Ideally, the formula should not depend on $f(n-1)$. Here's what I've figured out so far:

The total number of $n$-digit numbers (excluding numbers starting with 0) is $9 \cdot 10^{n-1}$.

By classifying all the different cases and counting them, I've worked out the probabilities for small $n$:

For $n = 3$: $f(3) = \frac{1}{900}$
For $n = 4$: $f(4) = \frac{18}{9000}$
For $n = 5$: $f(5) = \frac{261}{90000}$
For $n = 6$: $f(6) = \frac{3420}{900000}$
For $n = 7$: $f(7) = \frac{42291}{9000000}$
For $n = 8$: $f(8) = \frac{503748}{90000000}$

My approach involves categorizing the favorable outcomes based on the number of digits before and after the '666' sequence. For example, for $n = 8$:

$zxxxy666$ (81000 cases)
$zxxy666y$, $zxy666yx$, $zy666yxx$ (218700 cases)
$k666yxxx$ (72000 cases)
$666yxxxx$ (90000 cases)

Where $k \in \{1,\ldots,9\} \setminus \{6\}$, $z \in \{1,\ldots,9\}$, $y \in \{0,\ldots,9\} \setminus \{6\}$, and $x \in \{0,\ldots,9\}$.

I also subtract cases where '666' appears twice:

$666y6666$, $6666y666$ (18 cases)
$k666y666$, $666yy666$, $666y666y$ (234 cases)

And I add this to the total number of cases for $n = 7$ before removing duplicates, which is $42300$. This gives $503748$ total favorable outcomes for $n = 8$, which I've also verified with code and it's correct.

However, I'm stuck at $n = 9$. These are the cases I've considered:

$n = 8$ cases ($504000$ cases)
$zxxxxy666$ (810000 cases)
$zxxxy666y$, $zxxy666yx$, $zxy666yxx$, $zy666yxxx$ (291600 cases)
$k666yxxxx$ (720000 cases)
$666yxxxxx$ (900000 cases)
$666y66666$, $6666y6666$, $66666y666$ (-27 cases)
$k666y6666$, $k6666y666$ (-144 cases)
$666yy6666$, $666y6666y$, $6666yy666$, $6666y666y$ (-324 cases)
$zy666y666$ (-729 cases)
$k666y666y$, $k666yy666$ (-648 cases)
$666yxy666$, $666y666yx$ (-1620 cases)
$666yy666y$ (-729 cases)

My calculations yield $3221379$ favorable outcomes, but the correct answer should be $5845131$, which I found via counting all the cases between $100000000$ and $999999999$ with code. I can't identify where my counting goes wrong. Besides, while this approach produced correct results up to $n = 8$, I'm sure there must be some easier way.

I've also found the A255373 sequence on OEIS, which starts very similar, but it's slightly different. Counting all the cases with code, I obtain $503748$, $5845131$ after $42291$, and not $503757$, $5845383$. This is the code I've used, it's pretty straightforward:

k = 0; Do[If[StringContainsQ[ToString[n], "666"], k++], {n, 10000000, 99999999}]; Print[k]
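For readers without Mathematica, the same brute-force count can be sketched in Python. This is only an illustrative equivalent of the one-liner above, not the code that was actually run; the function name `count_with_666` is made up for this example.

```python
def count_with_666(n: int) -> int:
    """Count n-digit numbers (no leading zero) whose decimal form contains '666'."""
    # Enumerate every n-digit number and test its decimal string directly.
    return sum(1 for m in range(10 ** (n - 1), 10 ** n) if "666" in str(m))


# Small cases only: the loop is exponential in n, so this is a sanity check,
# not a practical way to reach large n.
for n in range(3, 7):
    print(n, count_with_666(n))
```

The counts for small $n$ can be compared against the numerators listed earlier ($1$, $18$, $261$, $3420$).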

What I'm looking for:

Help identifying the error in my counting for $n = 9$
Guidance on finding a general closed formula $f(n)$ for this probability
Insights on how to approach this type of combinatorial problem systematically

I hope it's clear, it's not easy to summarize the process I've used. Any help or hints would be greatly appreciated!

Edit: The recursive approach is probably the simplest solution. If anyone's curious what a closed form derived from the recursive approach would look like, in the form $p(n)=1-\frac{a}{b}$, here you go:

$p(n)=1-\frac{\left(\left(1468 (-22)^{2/3}+176 \sqrt[3]{264077-2045 \sqrt{33}}-22^{2/3} \sqrt[3]{49094315 \sqrt{33}-3176121307}\right) \left(3-\frac{1}{6} \left(1+i \sqrt{3}\right) \sqrt[3]{1215-81 \sqrt{33}}-\frac{1}{2} \left(1-i \sqrt{3}\right) \sqrt[3]{3 \left(15+\sqrt{33}\right)}\right)^n+\left(1468\ 22^{2/3}+\sqrt[3]{264077-2045 \sqrt{33}} \left(176+\sqrt[3]{5809694-44990 \sqrt{33}}\right)\right) \left(3+\frac{1}{3} \sqrt[3]{1215-81 \sqrt{33}}+\sqrt[3]{3 \left(15+\sqrt{33}\right)}\right)^n+\left(176 \sqrt[3]{264077-2045 \sqrt{33}}+\sqrt[3]{-1} 22^{2/3} \left(\sqrt[3]{49094315 \sqrt{33}-3176121307}-1468\right)\right) \left(3-\frac{1}{6} \left(1-i \sqrt{3}\right) \sqrt[3]{1215-81 \sqrt{33}}-\frac{1}{2} \left(1+i \sqrt{3}\right) \sqrt[3]{3 \left(15+\sqrt{33}\right)}\right)^n\right)}{2^{n} 5^{n-1} 2673 \sqrt[3]{ 264077-2045 \sqrt{33}}}$

The number of integers of length $n$ without a $666$ can be computed by recognizing that such an integer must begin with one of $X, 6Y, 66Y$ where $X\in \{1, \cdots, 9\}$ and $Y\in \{0, \cdots, 9\}$. Thus $a_n=9a_{n-1}+10a_{n-2}+10a_{n-3}$. Alas, $x^3=9x^2+10x+10$ does not have pleasant roots, so closed formulas won't be much use. Approximations are available though. — lulu, Jul 12 '24 at 23:18
Note: neglected to point out that neither $X$ nor $Y$ can be $6$. Thus the recursion should have been $a_n=8a_{n-1}+9a_{n-2}+9a_{n-3}$. The main conclusion is unchanged, however, as the associated cubic also fails to have pleasant roots. Worth remarking that it has two roots of small norm, and one big root, so for large $n$ only powers of the big root matter. That makes approximations possible. — lulu, Jul 12 '24 at 23:45

Henry · Accepted Answer · 2024-07-13T13:43:26.173

Here is an attempt at a systematic approach. Let

$a(n)$ be the number of $n$-digit numbers with no leading zero and at least one 666 pattern
$b(n)$ be the number of $n$-digit numbers possibly with a leading zero and at least one 666 pattern
$c(n)$ be the number of $n$-digit numbers possibly with a leading zero and no 666 pattern
$d(n)$ be the number of $n$-digit numbers possibly with a leading zero and no 666 pattern ending with a non-6
$e(n)$ be the number of $n$-digit numbers possibly with a leading zero and no 666 pattern ending with a single 6
$f(n)$ be the number of $n$-digit numbers possibly with a leading zero and no 666 pattern ending with a double 6

You can then say

$a(n)=b(n)-b(n-1)$
$b(n)=10^n - c(n)$
$c(n)=d(n)+e(n)+f(n)$
$d(n)=9\big(d(n-1)+e(n-1)+f(n-1)\big)$
$e(n)=d(n-1)$
$f(n)=e(n-1)$

If you apply this, starting with $c(0)=d(0)=1$ and $a(0)=b(0)=e(0)=f(0)=0$ then you will get $a(9)=5845131$ as you found and $a(10)=66520530$.
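As a rough check, the six coupled sequences can be iterated directly. The sketch below is one possible transcription of the relations above into Python; the function name `henry_tables` and the list-based layout are assumptions made for illustration.

```python
def henry_tables(n_max: int):
    """Iterate the coupled recurrences for a, b, c, d, e, f up to n_max.

    Index n of each list holds the value for n digits; index 0 holds the
    starting values c(0) = d(0) = 1 and a(0) = b(0) = e(0) = f(0) = 0.
    """
    a, b, c, d, e, f = [0], [0], [1], [1], [0], [0]
    for n in range(1, n_max + 1):
        d.append(9 * (d[n - 1] + e[n - 1] + f[n - 1]))  # append a non-6
        e.append(d[n - 1])                               # append a 6 after a non-6
        f.append(e[n - 1])                               # append a 6 after a single 6
        c.append(d[n] + e[n] + f[n])                     # no 666, leading zero allowed
        b.append(10 ** n - c[n])                         # at least one 666
        a.append(b[n] - b[n - 1])                        # drop leading-zero strings
    return a, b, c


a, b, c = henry_tables(10)
print(a[3:11])
```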

You can manipulate these to give (at least after the initial terms)

$d(n)=9d(n-1)+9d(n-2)+9d(n-3)$ starting with $d(1)=9, d(2)=90, d(3)=900$
$c(n)=9c(n-1)+9c(n-2)+9c(n-3)$ starting with $c(1)=10, c(2)=100, c(3)=999$
$b(n)=19b(n-1)-81b(n-2)-81b(n-3)-90b(n-4)$ starting with $b(1)=b(2)=0, b(3)=1, b(4)=19$
$a(n)=19a(n-1)-81a(n-2)-81a(n-3)-90a(n-4)$ starting with $a(1)=a(2)=0, a(3)=1, a(4)=18$

and it is the last of these which you seem to want.
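The fourth-order recurrence for $a(n)$ can be run on its own, as in the following minimal sketch (the function name `a_by_order4` is hypothetical and the starting values are the ones quoted above):

```python
def a_by_order4(n_max: int) -> dict:
    """Iterate a(n) = 19a(n-1) - 81a(n-2) - 81a(n-3) - 90a(n-4)."""
    a = {1: 0, 2: 0, 3: 1, 4: 18}  # starting values quoted in the answer
    for n in range(5, n_max + 1):
        a[n] = 19 * a[n - 1] - 81 * a[n - 2] - 81 * a[n - 3] - 90 * a[n - 4]
    return a


a = a_by_order4(10)
# Probability for n-digit numbers is a(n) / (9 * 10**(n-1)).
print(a[9], a[9] / (9 * 10 ** 8))
```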

There is a closed form involving powers of complex numbers, but it would not be pretty. An approximation is $$a(n) \approx 0.9\times 10^n-0.9014500448\times9.9909755900493^n$$ where $9.9909755900493$ represents the real root of $x^3-9x^2-9x-9=0$. This would have suggested $a(9) \approx 5845130.99$, about $0.01$ away from the correct integer.
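The approximation can be evaluated numerically along the following lines. In this sketch the real root is found by Newton's method (an arbitrary choice of root finder), the coefficient $0.9014500448$ is simply taken from the formula above rather than derived, and the names `real_root`, `COEFF` and `a_approx` are invented for illustration.

```python
def real_root(x: float = 10.0, steps: int = 50) -> float:
    """Newton iteration for the real root of x^3 - 9x^2 - 9x - 9 = 0."""
    for _ in range(steps):
        p = x ** 3 - 9 * x ** 2 - 9 * x - 9
        dp = 3 * x ** 2 - 18 * x - 9
        x -= p / dp
    return x


COEFF = 0.9014500448  # constant quoted in the answer, not recomputed here


def a_approx(n: int) -> float:
    """Dominant-root approximation a(n) ~ 0.9*10^n - COEFF * root^n."""
    return 0.9 * 10 ** n - COEFF * real_root() ** n


print(real_root(), a_approx(9))
```

Because the coefficient is only given to ten digits, the approximation should be expected to drift from the exact integer as $n$ grows.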

I see, I tried RSolve on that. I guess recurrence is good enough anyway. — PianothShaveck, Jul 13 '24 at 00:26

score 2 · Answer 2 · answered Jul 13 '24 at 01:28

The problem can be attacked by using Inclusion-Exclusion, with Stars and Bars used internally. First, briefly see this self-answer question which uses the methodology to attack a similar question, and also provides links to other problems that were conquered by the same methodology.

I am not suggesting that you read the self-answer question in detail. Instead, I am suggesting that you merely browse the question to get the idea of what is going on.

For simplicity, I will assume that the leftmost digit of the $~n$-digit number is permitted to equal $~0.$

Let $~n~$ be some fixed positive integer $~\geq 4.$

Let $~N~$ denote the number of such $~n$-digit numbers that do not contain any occurrence of $~3~$ consecutive 6's.

Then, since the desired probability is

$$1 - \frac{N}{10^n},$$

the entire problem is reduced to providing a closed form formula for $~N,~$ as a function of $~n.$

Let $~S~$ denote the collection of all possible $~n$-digit numbers.

For $~k \in \{1,2,\cdots,n-2\},~$ let $~S_k~$ denote the subset of $~S~$ that specifically contains $~3~$ consecutive 6's, starting in position $~k.~$ Here, I am reading the positions from left to right.

For example, any element in $~S_1~$ will be an $~n$-digit number that contains $~3~$ consecutive 6's in the leftmost $~3~$ digit positions, and may or may not also contain one or more other occurrences of $~3~$ consecutive 6's.

Then, the desired computation is

$$N = | ~S ~| - | ~S_1 \cup S_2 \cup \cdots \cup S_{n-2} ~|. \tag1 $$

$\underline{\text{General Considerations For Inclusion-Exclusion}}$

Let $~T_0~$ denote $~| ~S ~| \implies $

$$T_0 = 10^n.~$$

Let $~T_1~$ denote $~\displaystyle \sum_{1 \leq i_1 \leq n-2} | ~S_{i_1} ~|.$
That is, $~T_1~$ represents the sum of $~\displaystyle \binom{n-2}{1} ~$ terms.

By considerations of symmetry,

$$T_1 = (n-2) \times 10^{n-3}.$$

For $~r \in \{2,3,\cdots,n-2\},~$
let $~T_r~$ denote $~\displaystyle \sum_{1 \leq i_1 < i_2 < \cdots < i_r \leq n-2} | ~S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_r} ~|.$
That is, $~T_r~$ represents the sum of $~\displaystyle \binom{n-2}{r} ~$ terms.

Then, in accordance with Inclusion-Exclusion theory,

$$N = \sum_{r = 0}^{n-2} (-1)^r T_r.$$

However, considerations of symmetry, that were useful when computing $~T_1,~$ break down when computing $~T_r ~: ~r \geq 2.$

So, the entire problem is reduced to using Stars and Bars internally to provide a closed form formula for $~T_r ~: ~r \geq 2.$

The remainder of this answer will:

Analytically develop a helper function $~f(n,r,w,o)~$ and explain how this function may be used to compute $~T_r ~: ~r \geq 2.$
In this section, the variables $~n,r,w,o~$ will be specified and discussed.
Provide upper and lower bounds on the variables $~n,r,w,o~$ where appropriate.
Summarize all of the results, with a (very convoluted) closed form formula.

$\underline{\text{Helper Function} ~f(n,r,w,o)}$

This section will partition the $~\displaystyle \binom{n-2}{r}~$ intersections represented by the term $~T_r ~: ~r \geq 2~$ into categories. The purpose of the function $~f(n,r,w,o)~$ is to enumerate how many $~n$-digit numbers pertain to each category.

At the start of this section, for illustrative purposes, I will assume that $~n = 20,~$ and $~r = 5.~$ Then, later in this section, I will discuss the general case of $~n \geq 4, ~r \in \{2,3,\cdots, n-2\}.$

Consider the following tableau:

i_1 - - i_2 - - i_3 - - i_4 - - i_5 - - - - -

In the above tableau, the positions of $~i_1, \cdots, i_5,~$ are $~1, 4, 7, 10, 13,~$ respectively. This represents the intersection of the subsets $~S_{1} \cap S_4 \cap S_7 \cap S_{10} \cap S_{13},~$ which represents one of the intersections pertinent to the computation of $~T_5.$

The $~5~$ subsets involved create $~(5 +1)~$ islands between the subsets. Reading the islands from left to right, let $~x_1, \cdots, x_6,~$ denote the respective sizes of these islands. Then, you have that the ordered $~6$-tuplet $~(x_1, \cdots, x_6) = (0,2,2,2,2,5). ~$ Note that since $~5~$ of the $~(20-2)~$ digit positions are taken by the positions of $~i_1, \cdots, i_5,~$ you must have that $~x_1 + \cdots + x_6 = (20 - 2) - 5 = 13.$

Then, the number of distinct ordered $~6$-tuplets $~(x_1,\cdots,x_6)~$ equals the number of solutions to

$x_1 + \cdots + x_6 = 13.$
$x_1, \cdots, x_6 \in \Bbb{Z_{\geq 0}}.$

By basic Stars and Bars theory, the number of solutions is
$\displaystyle \binom{13 + [6-1]}{6-1} = \binom{18}{5}.$

So, with $~n = 20, ~$ each possible intersection of $~5~$ subsets that is pertinent to the computation of $~T_5~$ is uniquely represented by one of the satisfying ordered $~6$-tuplets $~(x_1, \cdots,x_6).$

Speaking more generally, what is needed is some method of partitioning the $~\displaystyle \binom{n-2}{r} ~$ pertinent intersections into mutually exclusive categories, where each category may be analytically diagnosed.

To do this, the variables $~x_1~$ and $x_{r+1}~$ may be ignored. For the other $~(r - 1)~$ variables, I am going to let $~w~$ denote the number of such variables that are $~\geq 2~$ and let $~o~$ denote the number of such variables that are $~= 1.~$ Then, it is to be understood that exactly $~(r - 1) - w - o~$ of these variables are equal to $~0.$

So, the categories will be based on the values of the variables $~n,r,w,o.$

The helper function $~f(n,r,w,o)~$ will enumerate how many $~n$-digit numbers are possible that pertain to the category represented by the specific values of the variables $~n, ~r, ~w,~$ and $~o.~$ The following procedure will be used to compute $~f(n,r,w,o):$

Step 1
Enumerate the number of possible ways that you can choose $~w~$ variables from $~x_2, \cdots, x_r,~$ and then choose $~o~$ variables from the remaining $~(r-1-w)~$ variables.
Step 2
Assume (reading the variables from left to right, starting with variable $~x_2$) that the first $~w~$ variables are $~\geq 2,~$ and the next $~o~$ variables are $~= 1.~$ Then, under this assumption, identify the number of intersections that correspond to this assumption.

The intention is that the product of the computations in Step 1 and Step 2 will determine how many of the $~\displaystyle \binom{n-2}{r}~$ intersections correspond to the category represented by the specific set of values for $~n, ~r, ~w, ~$ and $~o.$
Step 3
For any intersection of $~r~$ of the subsets of $~S_1, S_2, \cdots, S_{n-2},~$ that pertains to the category represented by the specific values of the variables $~n, ~r, ~w, ~$ and $~o, ~$ compute the number of $~n$-digit numbers that are represented by this intersection.
Step 4
Take the combined product of the computations in each of Steps 1 through 3. The result will be the function $~f(n,r,w,o).$

The Step 1 computation is:

$$\binom{r-1}{w} \times \binom{r-1-w}{o}.$$

The analysis for the Step 2 computation requires discussion. In the basic Stars and Bars enumeration, you start with $~(r + 1)~$ variables, whose sum is $~(n-2 - r).~$ Eliminating the $~(r - 1 - w - o) ~$ variables that are $~= 0~$ reduces the number of variables to $~(2 + w + o),~$ and leaves the sum of $~(n - 2 - r)~$ unchanged.

Then, eliminating the $~o~$ variables that are $~= 1~$ reduces the number of variables to $~(w + 2),~$ and reduces the sum to $~(n - 2 - r - o).$ So, at this point, the Step 2 computation is represented by the number of solutions to

$x_1 + y_2 + \cdots + y_{w+1} + x_{r+1} = (n - 2 - r - o).$
$x_1, x_{r+1} \in \Bbb{Z_{\geq 0}}.$
$y_2, \cdots, y_{w+1} \in \Bbb{Z_{\geq 2}}.$

Now employ the further change of variables $~z_i = y_i - 2 ~: ~i \in \{2,3,\cdots,w+1\}.$
This leaves the number of variables unchanged at $~(w + 2)~$ and reduces the sum to $~(n - 2 - r - 2w - o).$ Therefore, the Step 2 computation is represented by the number of solutions to

$x_1 + z_2 + \cdots + z_{w+1} + x_{r+1} = (n - 2 - r - 2w - o).$
$x_1, x_{r+1} \in \Bbb{Z_{\geq 0}}.$
$z_2, \cdots, z_{w+1} \in \Bbb{Z_{\geq 0}}.$

By basic Stars and Bars theory, the Step 2 computation is therefore

$$\displaystyle \binom{[n - 2 - r - 2w - o] + [w + 1]}{w + 1} = \binom{n - 1 - r - w - o}{w+1}.$$

The analysis for the Step 3 computation also requires discussion. First, suppose that $~w = (r-1),~$ which implies that $~o = 0, ~(r - 1 - w - o) = 0. ~$ Then, each pair of consecutive subsets $~S_{i_k}~$ and $~S_{i_{k+1}}~$ in the intersection has at least two digit positions between them. Therefore, you can regard each such pair of subsets as fully uncompressed. Then, of the $~n~$ digit positions, exactly $~(3r)~$ of them are forced to equal $~6.~$

Now, assume instead that there are $~o~$ of the $~(r - 1) ~$ variables that are $~= 1.~$ Then, you have $~o~$ partially compressed pairs of subsets that (together) use $~5~$ digit positions instead of $~6~$ digit positions. This implies that you now have $~(3r - o)~$ digit positions that are forced to equal $~6.~$

Now, further assume that you have some fully compressed pairs of subsets because $~(r - 1 - w - o) \neq 0.~$ Similar to the analysis in the previous paragraph, this implies that exactly $~[ ~(3r - o) - 2(r - 1 - w - o) ~] = [ ~r + 2 + 2w + o ~] ~$ of the digit positions are forced to equal $~6.~$

So, with respect to the $~n~$ digit positions, you have exactly

$[ ~r + 2 + 2w + o ~]~$ digit positions that are forced to equal $~6.~$
$[ ~n ~] - [ ~r + 2 + 2w + o ~] = [ ~n - 2 - r - 2w - o ~]~$ digit positions that are unspecified.

Therefore, the computation for Step 3 is

$$10^{ ~[n - 2 - r - 2w - o] ~}.$$

Therefore, by Step 4,

$$f(n,r,w,o) = \binom{r-1}{w} \times \binom{r-1-w}{o} \times \binom{n - 1 - r - w - o}{w+1}$$

$$\times 10^{ ~[n - 2 - r - 2w - o] ~}.$$

The next section in this answer will specify the allowable ranges for each of the variables $~n, ~r, ~w,~$ and $~o.$

So, by the analysis in this section, for this specific value of $~n~$ and $~r,~$

$$T_r = \sum_{w ~\text{in range}} \left\{ \sum_{o ~\text{in range}} ~[ ~f(n,r,w,o) ~] \right\}.$$

$\underline{\text{Lower And Upper Bounds For} ~n, ~r, ~w, ~\text{And} ~o}$

You have that

$\displaystyle f(n,r,w,o) = \binom{r-1}{w} \times \binom{r-1-w}{o} \times \binom{n - 1 - r - w - o}{w+1}$

$\displaystyle \times 10^{ ~[n - 2 - r - 2w - o] ~}.$
$n \in \Bbb{Z_{\geq 4}}.$
With respect to the appropriate range of $~r,~$ so that $~f(n,r,w,o)~$ may be applied, the lower bound for $~r~$ is $~2 \leq r.$

My experience with math problems of this nature has taught me that the easiest way to compute the upper bound for $~r,~$ and the lower/upper bounds for $~w~$ and $~o,~$ is to consider that for any pertinent expression of the form $~\displaystyle \binom{p}{q} ~: ~p,q \in \Bbb{Z},~$ you must have $~q \geq 0, ~p \geq q. ~$ Further, the expression $~[ ~n - 2 - r - 2w - o ~] ~$ must be a non-negative integer.

Therefore, the first take on the bounding constraints is that

$0 \leq w \leq r-1.$
$0 \leq o \leq r-1-w.$
$n - 2 - r - 2w - o \geq 0.$

The upper bound for $~r~$ is achieved when $~0 = w,o.$ So, the range of $~r~$ is

$$r \in \Bbb{Z}, ~2 \leq r \leq n-2.$$

Both of the variables $~w~$ and $~o~$ can be as small as $~0,~$ since this merely represents having the intersection $~\{ ~S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_r} ~\}~$ fully compressed.

The upper bound for $~w~$ may be determined by setting $~o = 0.~$

Therefore, the range of $~w~$ is

$$w \in \Bbb{Z}, ~0 \leq w \leq \min\left\{ ~r-1, ~\left\lfloor ~\frac{n-2-r}{2} ~\right\rfloor ~\right\}.$$

Similarly, the range of $~o~$ is

$$o \in \Bbb{Z}, ~0 \leq o \leq \min\left\{ ~r-1-w, ~n - 2 - r - 2w ~\right\}.$$

$\underline{\text{Closed Form Formula For The Problem}}$

It is assumed that $~n~$ is some fixed positive integer $~\geq 4,~$ and that the leftmost digit of an $~n$-digit number is permitted to equal $~0.$

The desired computation is

$$1 - \dfrac{N}{10^n} ~: ~N = \sum_{r=0}^{n-2} (-1)^r T_r.$$

$$T_0 = 10^n.$$

$$T_1 = (n - 2) \times 10^{n-3}.$$

The remainder of this section assumes that $~r \in \Bbb{Z}, ~2 \leq r \leq n-2.$

The helper function $~f(n,r,w,o)~$ is defined as

$$f(n,r,w,o) = \binom{r-1}{w} \times \binom{r-1-w}{o} \times \binom{n - 1 - r - w - o}{w+1}$$

$$\times 10^{ ~[n - 2 - r - 2w - o] ~}.$$

The ranges for the variables $~w~$ and $~o~$ are

$$w \in \Bbb{Z}, ~0 \leq w \leq \min\left\{ ~r-1, ~\left\lfloor ~\frac{n-2-r}{2} ~\right\rfloor ~\right\}.$$

$$o \in \Bbb{Z}, ~0 \leq o \leq \min\left\{ ~r-1-w, ~n - 2 - r - 2w ~\right\}.$$

Then

$$T_r = \sum_{w ~\text{in range}} \left\{ \sum_{o ~\text{in range}} ~[ ~f(n,r,w,o) ~] \right\}.$$
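The formula above can be transcribed into a short program and compared against a simple recurrence. The following is a sketch under stated assumptions: leading zeros are permitted, as in this answer; the comparison recurrence $c(n)=9c(n-1)+9c(n-2)+9c(n-3)$ with $c(0)=1, c(1)=10, c(2)=100$ is borrowed from the accepted answer; and all function names are made up for illustration.

```python
from math import comb


def helper(n: int, r: int, w: int, o: int) -> int:
    """f(n, r, w, o): strings in all r-fold intersections of one category."""
    return (comb(r - 1, w) * comb(r - 1 - w, o)
            * comb(n - 1 - r - w - o, w + 1)
            * 10 ** (n - 2 - r - 2 * w - o))


def T(n: int, r: int) -> int:
    """Sum of |intersection| over all r-subsets of S_1, ..., S_{n-2}."""
    if r == 0:
        return 10 ** n
    if r == 1:
        return (n - 2) * 10 ** (n - 3)
    total = 0
    for w in range(min(r - 1, (n - 2 - r) // 2) + 1):
        for o in range(min(r - 1 - w, n - 2 - r - 2 * w) + 1):
            total += helper(n, r, w, o)
    return total


def N_inclusion_exclusion(n: int) -> int:
    """Number of length-n digit strings with no 666 (leading zero allowed)."""
    return sum((-1) ** r * T(n, r) for r in range(n - 1))


def N_recurrence(n: int) -> int:
    """Same count via c(n) = 9(c(n-1) + c(n-2) + c(n-3)), used as a cross-check."""
    c = [1, 10, 100]
    while len(c) <= n:
        c.append(9 * (c[-1] + c[-2] + c[-3]))
    return c[n]


for n in range(4, 10):
    print(n, N_inclusion_exclusion(n), N_recurrence(n))
```

The desired probability, with leading zeros allowed, is then `1 - N_inclusion_exclusion(n) / 10**n`.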

Nic · Answer 3 · 2024-09-20T10:50:33.177

Let us define $g(n)$ to be the number of $n$-digit numbers that contain at least 3 consecutive sixes. Then, to find a recursive pattern let us find $g(n+1)$ in terms of $g(n)$. Let us start by noting to turn a $n$-digit number into an $n+1$ digit number, we will multiply by 10 and add $k$ for $k \in \{0,...,9\}$. If our $n$-digit number does contain three consecutive sixes, then our $n+1$ digit number will as well. This means that we have: $$ g(n+1) = 10*g(n) + ... $$ The $...$ is from when the $n$-digit number does not contain at least three consecutive sixes but the $n+1$ digit number does. This will only be the case when the 666 is in the rightmost spot of the $n+1$ digit number and hence the $n$-digit number will end in $x66$ where $x\neq 6$. We also need that the digits before this do not have a 666 in them. Hence this number is the total number of $n-3$ digit number minus the number of $n-3$ digit numbers with 666, times 9 (because of the freedom of $x$) i.e. $$ (9*10^{n-4} - g(n-3))*9 $$ I note that we need $n \ge 4$ for this to work. This gives us: $$ g(n+1) = 10*g(n) + (9*10^{n-4} - g(n-3))*9 $$ Then, dividing by $9*10^n$ we get: $$ \frac{g(n+1)}{9*10^n} = \frac{g(n)}{9*10^{n-1}} + \left(\frac{9*10^{n-4}}{9*10^n} - \frac{g(n-3)}{9*10^n}\right)*9$$ Then the function you want, $f(n)$ can be found by: $$ f(n) = \frac{g(n)}{9*10^{n-1}} $$ This gives us: $$ f(n+1) = f(n) + (1 - f(n-3)) * \frac{9}{10^4} $$ From here, we will use $9/10^4 =: \lambda$. Now, using $n+1 \mapsto n$ we get: $$ f(n) = f(n-1) + (1 - f(n-4)) *\lambda $$

Unrolling the first term with this same recursion relation, we get: $$ f(n) = f(4) + \left((n-4) - \sum_{k=3}^{n-4} f(k)\right)*\lambda$$ (the sum starts at $k=3$ because $f(1)=f(2)=0$). For $n<7$ the sum is empty, so: $$ f(4\le n<7) = f(4) + (n-4)*\lambda$$ If we want to go to first order in $\lambda$ for $n\ge 7$, we replace each $f(k)$ with $k\ge 4$ in the sum by $f(4)$, which gives: $$ f(n\ge 7) \approx f(4) + (n-4)*\lambda - (n-7)*f(4)*\lambda - f(3)*\lambda$$ For simplicity I will assume $f(3) \approx f(4)$ just to eliminate one term: $$f(n\ge 7) \approx f(4) + \big(n-4 - (n-6)*f(4)\big)*\lambda$$ We could continue expanding in terms of $\lambda$ to get a more accurate answer.
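The recursion for $g$ and the telescoped sum for $f$ can be checked with exact rational arithmetic. This is a minimal sketch: the starting values $g(3)=1$ and $g(4)=18$ are taken from the question, and the function names are hypothetical.

```python
from fractions import Fraction

LAMBDA = Fraction(9, 10 ** 4)


def g_table(n_max: int) -> dict:
    """g(n+1) = 10 g(n) + 9 (9*10^(n-4) - g(n-3)), valid for n >= 4."""
    g = {1: 0, 2: 0, 3: 1, 4: 18}  # starting values taken from the question
    for n in range(4, n_max):
        g[n + 1] = 10 * g[n] + 9 * (9 * 10 ** (n - 4) - g[n - 3])
    return g


def f_exact(n: int, g: dict) -> Fraction:
    """Probability f(n) = g(n) / (9 * 10^(n-1))."""
    return Fraction(g[n], 9 * 10 ** (n - 1))


def f_telescoped(n: int, g: dict) -> Fraction:
    """f(4) + ((n-4) - sum_{k=3}^{n-4} f(k)) * lambda."""
    partial = sum(f_exact(k, g) for k in range(3, n - 3))
    return f_exact(4, g) + ((n - 4) - partial) * LAMBDA


g = g_table(12)
for n in range(5, 10):
    print(n, g[n], float(f_exact(n, g)), f_telescoped(n, g) == f_exact(n, g))
```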

score 1 · Answer 4 · answered Jul 13 '24 at 19:53

My other answer used Inclusion-Exclusion to provide a very convoluted closed form formula. This answer provides a sanity-check against the other answer.

Similar to my other answer, in this answer, I am also assuming that the leftmost digit of an $~n$-digit number is permitted to equal $~0.$

If the original poster is interested, they could write a computer program against the closed form formula in my other answer, to compute the number of $~n$-digit numbers that do not contain $~3~$ consecutive 6's. They could (for example) have the computer program compute the corresponding values for each $~n \in \{4,5,6,\cdots,30\}.$

Then, they could write a separate computer program that applies the recursion formulas in this answer and again computes the corresponding values for each $~n \in \{4,5,6,\cdots,30\}.$

Then, they could compare the two results.

The answer of Nic uses recursion to enumerate the number of $~n$-digit numbers that contain at least one occurrence of $~3~$ consecutive 6's. This answer uses recursion from the opposite perspective.

Let $~N~$ denote the number of $~n$-digit numbers that do not contain any occurrence of $~3~$ consecutive 6's.

Then, since the desired probability is

$$1 - \frac{N}{10^n},$$

the entire problem may be reduced to using recursion to enumerate $~N.$

In the discussion below, I will use the symbol X to represent any element of $~\{0,1,2,3,4,5,7,8,9\}.~$ That is, X represents any digit other than $~6.$

Let:

$f(n,0)~$ denote the number of $~n$-digit numbers that do not contain any occurrence of $~3~$ consecutive 6's, and that end in X.
$f(n,1)~$ denote the number of $~n$-digit numbers that do not contain any occurrence of $~3~$ consecutive 6's, and that end in X6.
$f(n,2)~$ denote the number of $~n$-digit numbers that do not contain any occurrence of $~3~$ consecutive 6's, and that end in X66.
$f(n)~$ denote the number of $~n$-digit numbers that do not contain any occurrence of $~3~$ consecutive 6's.

Then, you have the following recursion formulas, valid for $~n \in \Bbb{Z_{\geq 3}}.$

$f(n) = f(n,0) + f(n,1) + f(n,2).$
$f(n+1,0) = 9 \times f(n).$
That is, you have $~9~$ choices for which digit to append to the end of the $~n$-digit number.
$f(n+1,1) = f(n,0).$
That is, you must start with an $~n$-digit number that ends in $~X,~$ and specifically append a $~6~$ to the end of the number.
$f(n+1,2) = f(n,1).$
That is, you must start with an $~n$-digit number that ends in $~X6,~$ and specifically append a $~6~$ to the end of the number.

Using the above formulas, and manually computing the entries that correspond to $~n=3~$ leads to the following table:

\begin{array}{| r | r | r | r | r |} \hline n & f(n,0) & f(n,1) & f(n,2) & f(n) \\ \hline 3 & 900 & 90 & 9 & 999 \\ \hline 4 & 8991 & 900 & 90 & 9981 \\ \hline 5 & 89829 & 8991 & 900 & 99720 \\ \hline \end{array}
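The table can be extended mechanically. Below is a minimal sketch of the three-state recursion, seeded with the manually computed row for $n=3$; the function name `no_666_table` is an assumption for illustration, and leading zeros are allowed as in this answer.

```python
def no_666_table(n_max: int) -> dict:
    """Rows n -> (f(n,0), f(n,1), f(n,2)) for strings with no 666."""
    rows = {3: (900, 90, 9)}  # seed row from the table above
    for n in range(3, n_max):
        f0, f1, f2 = rows[n]
        # Append a non-6 to anything; append a 6 to a string ending in X or X6.
        rows[n + 1] = (9 * (f0 + f1 + f2), f0, f1)
    return rows


rows = no_666_table(10)
for n, (f0, f1, f2) in rows.items():
    total = f0 + f1 + f2
    print(n, f0, f1, f2, total, 1 - total / 10 ** n)
```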

Markus Scheuer · Answer 5 · 2024-07-15T06:04:54.380

We use a generating function approach to derive a formula for the wanted sequence of probabilities. In order to do so we reformulate the problem a bit. We consider an alphabet $\mathcal{V}=\{0,1,2,3,4,5,6,7,8,9\}$ and we are looking for the numbers $a_n, n\geq 0$ of valid words of length $n$ built from this alphabet. A word is valid if it contains $666$ and does not start with $0$. Since there are $9\cdot 10^{n-1}$ words of length $n\geq 1$ which do not start with $0$, the wanted sequence of probabilities is then \begin{align*} \left(\frac{a_n}{9\cdot 10^{n-1}}\right)_{n\geq 1} \end{align*}

We start with deriving a generating function \begin{align*} \color{blue}{B(z)=\sum_{n=0}^{\infty}b_nz^n} \end{align*} where $b_n$ denotes the number of words of length $n$ built from $\mathcal{V}$ which do not contain $666$. The number of words of length $n$ which do contain $666$ is consequently \begin{align*} 10^n-b_n \end{align*} A generating function which produces $10^n$ at the $n$-th position is the geometric series \begin{align*} \frac{1}{1-10z}=1+10z+100z^2+1\,000z^3+\cdots \end{align*} Denoting with $[z^n]$ the coefficient of $z^n$ of a series we write the number $10^n-b_n$ of words of length $n$ which contain $666$ as \begin{align*} [z^n]\left(\frac{1}{1-10z}-B(z)\right)\tag{1} \end{align*} We do not want to count words which start with $0$. We can manage this by subtracting from (1) all words of length $n-1$ which contain $666$. We obtain \begin{align*} \color{blue}{a_n}&=\left([z^n]-[z^{n-1}]\right)\left(\frac{1}{1-10z}-B(z)\right)\\ &=\left([z^n]-[z^{n}]z\right)\left(\frac{1}{1-10z}-B(z)\right)\\ &\,\,\color{blue}{=[z^n]\left(\frac{1-z}{1-10z}-(1-z)B(z)\right)}\tag{2} \end{align*}

Calculation of $B(z)$:

This calculation is based upon the Goulden-Jackson Cluster Method. We consider the set of words of length $n\geq 0$ built from the alphabet $\mathcal{V}=\{0,1,\ldots,9\}$ and the set $$B=\{666\}$$ of bad words, which are not allowed to be part of the words we are looking for. We derive a generating function $B(z)$ with the coefficient of $z^n$ being the number of wanted words of length $n$. According to the paper (p.7) the generating function $B(z)$ is \begin{align*} \color{blue}{B(z)=\frac{1}{1-dz-\text{weight}(\mathcal{C})}}\tag{3} \end{align*} with $d=|\mathcal{V}|=10$, the size of the alphabet and $\mathcal{C}$ is the weight-numerator of bad words with \begin{align*} \color{blue}{\text{weight}(\mathcal{C})=\text{weight}(\mathcal{C}[666])}\tag{4} \end{align*}

We calculate according to the paper \begin{align*} \text{weight}(\mathcal{C}[666])&=-z^3-z^2\cdot\text{weight}(\mathcal{C}[666])-z\cdot\text{weight}(\mathcal{C}[666])\\ &\qquad\qquad\qquad\qquad 6{\color{blue}{\underline{66}}}6\qquad\qquad\qquad 66{\color{blue}{\underline{6}}}66\\ \end{align*}

The additional terms on the right-hand side take into account overlaps, which are indicated by the blue letters of the overlapping bad words in the line below. This is in fact an application of the inclusion-exclusion principle in terms of generating functions. Solving this equation for $\text{weight}(\mathcal{C}[666])$ and using (4), we obtain \begin{align*} \text{weight}(\mathcal{C})=\frac{-z^3}{1+z+z^2}\tag{5} \end{align*}

We obtain from (3) and (5) \begin{align*} \color{blue}{B(z)}&=\frac{1}{1-dz-\text{weight}(\mathcal{C})}\\ &=\frac{1}{1-10z+\frac{z^3}{1+z+z^2}}\\ &=\color{blue}{\frac{1+z+z^2}{1-9z\left(1+z+z^2\right)}}\tag{6}\\ \end{align*}

We derive a generating function $A(z)$ for the wanted words from (2) and (6) as \begin{align*} \color{blue}{A(z)}&=\sum_{n=0}^{\infty}a_nz^n\\ &=\frac{1-z}{1-10z}-(1-z)B(z)\\ &=\frac{1-z}{1-10z}-(1-z)\frac{1+z+z^2}{1-9z\left(1+z+z^2\right)}\\ &\,\,\color{blue}{=\frac{1-z}{1-10z}-\frac{1-z^3}{1-9z\left(1+z+z^2\right)}}\tag{7}\\ &=z^3+18z^4+261z^5+3\,420z^6+42\,291z^7\\ &\qquad\quad+503\,748z^8+5\,845\,131z^9+\cdots \end{align*} where the last line was calculated with some help of WolframAlpha.
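The series expansion in (7) can also be reproduced without a computer algebra system. The sketch below expands each rational function by long division of power series with integer coefficients; the helper name `series_div` is hypothetical and it assumes the denominator has constant term $1$.

```python
def series_div(num: list, den: list, terms: int) -> list:
    """First `terms` power-series coefficients of num(z)/den(z), with den[0] == 1."""
    out = []
    for n in range(terms):
        s = num[n] if n < len(num) else 0
        # Solve den * out = num for the coefficient of z^n.
        for k in range(1, min(n, len(den) - 1) + 1):
            s -= den[k] * out[n - k]
        out.append(s)
    return out


TERMS = 10
first = series_div([1, -1], [1, -10], TERMS)                 # (1 - z) / (1 - 10z)
second = series_div([1, 0, 0, -1], [1, -9, -9, -9], TERMS)   # (1 - z^3) / (1 - 9z(1 + z + z^2))
A = [x - y for x, y in zip(first, second)]
print(A)
```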

Calculation of coefficients $a_n$:

We calculate $a_n$ by expanding the geometric series $\frac{1}{1-9z(1+z+z^2)}$. We obtain \begin{align*} \color{blue}{c_n}&:=[z^n]\frac{1}{1-9z(1+z+z^2)}\\ &=[z^n]\sum_{j=0}^{\infty}(9z)^j\left(1+z+z^2\right)^j\tag{8.1}\\ &=\sum_{j=0}^n9^j[z^{n-j}]\left(1+z+z^2\right)^j\tag{8.2}\\ &=\sum_{j=0}^{n}9^{n-j}[z^j]\left(1+z+z^2\right)^{n-j}\tag{8.3}\\ &=\sum_{j=0}^{n}9^{n-j}[z^j]\sum_{k=0}^{n-j}\binom{n-j}{k}\left(z+z^2\right)^k\\ &=\sum_{j=0}^{n}9^{n-j}\sum_{k=0}^{n-j}\binom{n-j}{k}[z^{j-k}](1+z)^k\tag{8.4}\\ &\,\,\color{blue}{=\sum_{j=0}^{n}9^{n-j}\sum_{k=\lfloor j/2\rfloor}^{n-j}\binom{n-j}{k}\binom{k}{j-k}}\tag{8.5} \end{align*} We finally get from (7) and (8.5) \begin{align*} {\color{blue}{a_n}}&\color{blue}{=9\times 10^{n-1}-c_n+c_{n-3}\qquad\qquad n\geq 3}\\ \color{blue}{a_0}&\color{blue}{=a_1=a_2=0} \end{align*}

Comment:

In (8.1) we apply the geometric series expansion.
In (8.2) we apply the rule $[z^{p-q}]T(z)=[z^p]z^qT(z)$. We also set the upper limit of the series to $n$ since other terms do not contribute.
In (8.3) we change the order of summation $j\to n-j$ and perform a binomial expansion in the next step.
In (8.4) we apply the rule from (8.2) again and select finally the coefficient of $z^{j-k}$.
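Formula (8.5) and the resulting expression for $a_n$ can be evaluated directly, as in the sketch below. The function names `c` and `a` mirror the symbols $c_n$ and $a_n$; the upper limit of the inner sum is capped at $k \le j$ only to keep the second binomial well defined in code, which drops terms that are zero anyway.

```python
from math import comb


def c(n: int) -> int:
    """c_n = [z^n] 1 / (1 - 9z(1 + z + z^2)), via the double sum (8.5)."""
    if n < 0:
        return 0
    total = 0
    for j in range(n + 1):
        # Terms with k > j vanish, so the inner range stops at min(j, n - j).
        for k in range(j // 2, min(j, n - j) + 1):
            total += 9 ** (n - j) * comb(n - j, k) * comb(k, j - k)
    return total


def a(n: int) -> int:
    """a_n = 9 * 10^(n-1) - c_n + c_(n-3) for n >= 3, and 0 otherwise."""
    if n < 3:
        return 0
    return 9 * 10 ** (n - 1) - c(n) + c(n - 3)


for n in range(3, 10):
    print(n, a(n))
```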

Pretty interesting approach. — PianothShaveck, Jul 15 '24 at 16:04
