
What is the expected number of die rolls to get 6 given that die values do not decrease?

For example, $1, 1, 2, 6$ is accepted and $1, 3, 2, 6$ is not.

amWhy
  • So $5,5,6$ is a good sequence? – lulu May 13 '20 at 11:34
  • So the probability of $3,1$ for example, is $0$? – Alex May 13 '20 at 11:36
  • @fraulifang: OK, so if I toss 5, does that mean that the probability of tossing 5 is $\frac{1}{2}$ (because values 1-4 are discarded)? – Alex May 13 '20 at 11:42
  • I suspect the answer is $\frac{197}{60} \approx 3.2833\ldots$. With a two sided die, I think it is $2$, with a three sided die $2.5$, with a four sided die $2.8333\ldots$, and with a five sided die $3.0833\ldots$ – Henry May 13 '20 at 11:52
  • The linked question gives a description – Henry May 13 '20 at 12:10
  • Are we quite sure the linked question is a duplicate? Conditioning is not the same as simply excluding all the bad possibilities from the sample space. That results in well known "paradoxes", as here. Same problem arises here, does it not? As the length increases the probability that the sequence doesn't eventually decrease drops to $0$ rapidly (which would not be true if we removed all the lower balls). – lulu May 13 '20 at 13:18
  • Oh, I agree the questions are different, but the bias is the same. Because you aren't discarding the lower rolls (as you would do in the linked question) you are heavily biased toward very short rolls. Imagine you had a trillion sided die...then each toss is lower than the predecessor with probability $\frac 12$ (roughly) so it is extremely unlikely you'll get a long non-decreasing sequence. Which would not be true if you discarded the lower rolls. – lulu May 13 '20 at 13:25
  • @joriki I think this problem is different. It's like the "paradox" illustrated here. By discarding the possibility of lower rolls you are encouraging long sequences to form. – lulu May 13 '20 at 13:45
  • @lulu - the accepted answer to the linked question has the sequence $0, 2, \frac{5}{2}, \frac{17}{6}, \frac{37}{12}, \frac{197}{60}, \frac{69}{20}, \dotsc$ and I think it is the $\frac{197}{60}$ which is the answer here – Henry May 13 '20 at 13:46
  • @Henry But am I wrong about the bias? It seems to me that dropping the lower terms changes things dramatically. – lulu May 13 '20 at 13:47
  • @lulu - I suppose it depends on the rules of the game. I was taking Alex's approach which seems to involve not counting smaller rolls but not invalidating the whole sequence. I accept that a conditional probability approach which invalidates sequences might produce a different answer – Henry May 13 '20 at 13:50
  • @joriki to take an extreme example, suppose you had an $N$ sided die for a very large $N$, and that you are requiring all the tosses prior to the $N$ to be the same. In your method, having initially thrown an $m$, presumably $\neq N$, you would discard everything but $N,m$ and get an answer of $3$, yes? In reality, however, sequences of the form $m^aN$ are effectively impossible unless $a\in \{0,1\}$ so the answer is $<2$ – lulu May 13 '20 at 13:51
  • @Henry That's my point. I think the question asked here has a lower expected value, because retaining the lower rolls forces the non-decreasing sequences to be short. – lulu May 13 '20 at 13:52
  • @lulu: You're right. I've reopened the question. – joriki May 13 '20 at 14:02
  • @fraulifang: Sorry about the mistake. I'll make up for it with an answer if no one else does :-) – joriki May 13 '20 at 14:02
  • @lulu: On your conditional probability version I seem to get (unchecked) $\frac{0.82944}{0.41472} = 2$ which is suspiciously round – Henry May 13 '20 at 14:45
  • @Henry I also get $2$ and (like you) I am surprised. That's why I haven't posted my calculation yet. In practice, the answer seems closer to $1.7$...I seldom get a trial of length greater than $3$, the vast majority of trials are $1$ or $2$. – lulu May 13 '20 at 14:47

2 Answers


To be clear: I am assuming that a trial consists of throwing a fair die until a $6$ is thrown, then discarding the sequence if at some point it decreased. We then want the average length of the surviving sequences.

Let $A_n$ be the number of non-decreasing sequences of length $n$ made from $\{1,2,3,4,5\}$.

Since any such sequence is of the form $1^{a_1}2^{a_2}\cdots 5^{a_5}$ with $a_i$ non-negative integers and $\sum a_i=n$, Stars and Bars tells us that $$A_n=\binom {5+n-1}{n}.$$
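The Stars and Bars count can be checked by brute force for small $n$. The following is only a sketch: the function name `count_non_decreasing` is made up for this illustration, and the check enumerates all $5^n$ sequences, so it is only practical for short lengths.

```python
from itertools import product
from math import comb

def count_non_decreasing(length: int, faces: int = 5) -> int:
    """Count sequences over 1..faces of the given length that never decrease."""
    return sum(
        all(a <= b for a, b in zip(seq, seq[1:]))
        for seq in product(range(1, faces + 1), repeat=length)
    )

# Compare the brute-force count with the Stars and Bars formula for A_n.
for n in range(7):
    print(n, count_non_decreasing(n), comb(5 + n - 1, n))
```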

Thus the probability that you throw a non-decreasing sequence and then throw a $6$ for the first time on the $n^{th}$ roll is $$\frac {A_{n-1}}{6^n}=\binom {5+n-2}{n-1}\times \frac 1{6^n}$$

It follows that the answer is $$E=\frac {\sum_{n=1}^{\infty} n\times \binom {5+n-2}{n-1}\times \frac 1{6^n}}{\sum_{n=1}^{\infty}\binom {5+n-2}{n-1}\times \frac 1{6^n}}=2$$

(Final evaluation checked with WolframAlpha.)
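For readers who prefer to check the evaluation by hand, here is one possible route (a sketch, not necessarily how the value was originally obtained). It uses the negative binomial series $\sum_{m\ge 0}\binom{m+4}{4}x^m=(1-x)^{-5}$ for $|x|<1$. Setting $m=n-1$ and $x=\frac16$, the denominator is $$\sum_{n=1}^{\infty}\binom{n+3}{n-1}x^n=\frac{x}{(1-x)^5}=\frac{6^4}{5^5}=0.41472,$$ and applying $x\frac{d}{dx}$ to both sides gives the numerator $$\sum_{n=1}^{\infty}n\binom{n+3}{n-1}x^n=\frac{x(1+4x)}{(1-x)^6}=\frac{10\cdot 6^4}{5^6}=0.82944.$$ The ratio is therefore $$\frac{1+4x}{1-x}=\frac{10/6}{5/6}=2.$$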

Note: as remarked in the comments, the method works for an $N$ sided die and, for $N>1$, always yields $2$. That strongly suggests that there is an underlying principle involved here, but as yet I have not spotted it.
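The claim for an $N$-sided die can be explored numerically. The sketch below truncates both series after a fixed number of terms (the cutoff `terms=400` and the function name are assumptions made for this illustration) and uses the count $\binom{N+n-3}{n-1}$ of non-decreasing sequences of length $n-1$ on the faces $1,\dots,N-1$. It is a numerical check, not a proof.

```python
from fractions import Fraction
from math import comb

def conditional_mean_length(sides: int, terms: int = 400) -> float:
    """Approximate E for a fair die with `sides` >= 2 faces by truncating both sums."""
    num = Fraction(0)
    den = Fraction(0)
    for n in range(1, terms + 1):
        # Non-decreasing sequences of length n-1 on faces 1..sides-1,
        # each followed by the last face, each with probability sides**(-n).
        weight = Fraction(comb(sides + n - 3, n - 1), sides ** n)
        num += n * weight
        den += weight
    return float(num / den)

for sides in (2, 3, 6, 10):
    print(sides, conditional_mean_length(sides))
```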

lulu
  • My simulation suggests that $E=2$ is correct. – Daniel Mathias May 13 '20 at 15:13
  • Much the same as my approach though I had $a_n= {n+4 \choose 4}$ which makes the sums polynomial – Henry May 13 '20 at 15:15
  • @DanielMathias Does it? Mine doesn't. I get just under $1.7$ with essentially all the sequences of length $1$ or $2$. Of course, the bug could be in my simulation. – lulu May 13 '20 at 15:15
  • Yes. The error must be in your simulation. – Daniel Mathias May 13 '20 at 15:27
  • The expression you have for E seems to evaluate to $2$ for other sizes of die too. There must be an easier argument to prove this... – Jaap Scherphuis May 13 '20 at 15:40
  • @JaapScherphuis Oh, agreed. If, that is, the calculation is correct. – lulu May 13 '20 at 15:42
  • Presumably the general claim is $\sum\binom {n+k-3}{n-1}\times \frac 1{k^n} = \frac{k^{k-2}}{(k-1)^{k-1}}$ and $\sum n \times \binom {n+k-3}{n-1}\times \frac 1{k^n} = 2 \times\frac{k^{k-2}}{(k-1)^{k-1}}$ – Henry May 13 '20 at 15:59
  • @DanielMathias Yes, blunder in the simulator. Now matching the number...but it still doesn't sit right. I am looking for a simpler approach. – lulu May 13 '20 at 16:15
  • @lulu: I posted an answer which I think makes the simple result appear slightly less mysterious (and easier to generalize). By the way, you're using $a_n$ for two different things. – joriki May 14 '20 at 04:26
  • @joriki Thanks! I'll change one of the $a_i$ – lulu May 14 '20 at 09:44

Let’s see how often we expect to roll a particular number $k$, given that the results don’t decrease. Whatever other numbers we roll in non-decreasing order, ending in a $6$, there’s exactly one slot at which we could insert any number $m$ of $k$s and still have a non-decreasing sequence. This insertion multiplies the probability of the sequence by $p^m$, where $p=\frac1n$ is the probability of rolling $k$ and $n$ is the number of sides of the die. So the conditional expectation for the number $M$ of $k$s factorizes, and the probabilities for the other numbers cancel:

\begin{eqnarray} \mathsf E[M\mid\text{non-decreasing}] &=& \frac{\sum_{S\in\mathcal S}\sum_{m=0}^\infty\mathsf P(S)p^mm}{\sum_{S\in\mathcal S}\sum_{m=0}^\infty\mathsf P(S)p^m} \\ &=& \frac{\left(\sum_{S\in\mathcal S}\mathsf P(S)\right)\sum_{m=0}^\infty p^mm}{\left(\sum_{S\in\mathcal S}\mathsf P(S)\right)\sum_{m=0}^\infty p^m} \\ &=& \frac{\sum_{m=0}^\infty p^mm}{\sum_{m=0}^\infty p^m} \\ &=& \frac{p/(1-p)^2}{1/(1-p)} \\ &=& \frac p{1-p} \\ &=& \frac1{n-1} \;, \end{eqnarray}

where $\mathcal S$ is the set of admissible sequences of the other numbers.

Thus, each of the $n-1$ numbers other than $6$ is expected to appear $\frac1{n-1}$ times, so in total we expect one number to appear before the $6$. Together with the one $6$, that makes an expected total of $2$ rolls.
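Both the total of $2$ rolls and the per-face expectation of $\frac1{n-1}$ can be checked with a small Monte Carlo experiment. This is an illustrative sketch only (it is not the simulation code mentioned in the comments, which was never posted); the function name, trial count, and seed are arbitrary choices.

```python
import random

def simulate(sides=6, trials=200_000, seed=0):
    """Roll until the last face appears; keep only trials that never decrease.

    Returns (mean length of kept trials, mean count of each face 1..sides-1).
    """
    rng = random.Random(seed)
    kept = 0
    total_length = 0
    face_counts = [0] * (sides - 1)
    for _ in range(trials):
        rolls = []
        while True:
            roll = rng.randint(1, sides)
            if rolls and roll < rolls[-1]:
                # The sequence decreased. Every trial reaches the last face
                # with probability 1, so discarding now is equivalent to
                # rolling on and discarding at the end.
                rolls = None
                break
            rolls.append(roll)
            if roll == sides:
                break
        if rolls is None:
            continue
        kept += 1
        total_length += len(rolls)
        for roll in rolls[:-1]:  # every roll before the final one
            face_counts[roll - 1] += 1
    return total_length / kept, [c / kept for c in face_counts]

mean_length, mean_counts = simulate()
print(mean_length, mean_counts)
```

With a fair six-sided die, the first value should come out close to $2$ and each entry of the list close to $\frac15$, up to sampling noise.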

We can use this approach to solve the problem for a general die with probability $p_k$ for the $k$-th face to appear. The equiprobability of the faces wasn’t used until the last step, so in this case the expected number of rolls is

$$ 1+\sum_{k=1}^{n-1}\frac{p_k}{1-p_k}\;. $$
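As a small illustration of this formula, the sketch below evaluates it exactly with rational arithmetic and cross-checks the per-face term $\frac{p}{1-p}$ against the truncated series it came from. The function names, the example loaded die, and the cutoff of 200 terms are assumptions made for this example; the formula requires every $p_k<1$.

```python
from fractions import Fraction

def mean_count_by_series(p, terms=200):
    """Truncated version of sum(m * p**m) / sum(p**m) for one face."""
    num = sum(m * p ** m for m in range(terms))
    den = sum(p ** m for m in range(terms))
    return num / den

def non_decreasing_mean_length(probs):
    """Expected rolls given a non-decreasing run; the last face ends the trial."""
    *others, _last = probs  # the last face's probability does not enter
    return 1 + sum(p / (1 - p) for p in others)

fair = [Fraction(1, 6)] * 6
loaded = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # hypothetical 3-sided die
print(non_decreasing_mean_length(fair))    # 2
print(non_decreasing_mean_length(loaded))  # 7/3
print(float(mean_count_by_series(Fraction(1, 6))))
```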

We can also ask how many rolls we’d expect given that the results are strictly increasing. This is

\begin{eqnarray} \mathsf E[M\mid\text{increasing}] &=& \frac{\sum_{m=0}^1p^mm}{\sum_{m=0}^1p^m} \\ &=& \frac p{1+p}\;, \end{eqnarray}

so in this case the expected number of rolls is

$$ 1+\sum_{k=1}^{n-1}\frac{p_k}{1+p_k}\;. $$

For equiprobable sides, this is

$$ 1+\sum_{k=1}^{n-1}\frac{\frac1n}{1+\frac1n}=\frac{2n}{n+1}\;, $$

which is slightly less than $2$ but tends to $2$ as $n\to\infty$, when repeated values become unlikely.
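In the strictly increasing case the admissible sequences are finite in number: each one is a subset of the faces $1,\dots,n-1$ in increasing order, followed by the last face. That makes an exact check by enumeration possible. The sketch below (function name and the loaded example die are assumptions for illustration) computes the conditional expectation directly and compares it with $\frac{2n}{n+1}$.

```python
from fractions import Fraction
from itertools import combinations

def increasing_mean_length(probs):
    """Exact E[length | strictly increasing]; the last face ends the trial."""
    *others, last = probs
    num = den = Fraction(0)
    for size in range(len(others) + 1):
        # Each subset of the other faces gives exactly one increasing sequence.
        for subset in combinations(others, size):
            weight = last
            for p in subset:
                weight *= p
            num += (size + 1) * weight
            den += weight
    return num / den

for n in range(2, 9):
    fair = [Fraction(1, n)] * n
    print(n, increasing_mean_length(fair), Fraction(2 * n, n + 1))
```

For a loaded die such as `[Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]`, the same function can be compared with $1+\sum_k \frac{p_k}{1+p_k}$.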

As might be expected, the result for non-decreasing sequences is minimized for equiprobable faces and goes to infinity as one of the probabilities $p_k$ approaches $1$, whereas the result for increasing sequences is maximized for equiprobable faces.

joriki
  • I think this is clearly the better argument, though given the (to me) unintuitive nature of the result, I find it comforting to have multiple approaches. – lulu May 14 '20 at 09:54