
In answering another question on this site, I got to wondering. Perhaps the collective wisdom here can satisfy my idle curiosity.

From time to time the question is asked: can you use, say, a $6$-sided die to simulate a $20$-sided die? The usual answer is to roll it twice, subtract $1$ from each result, read the two results as the digits of a base-$6$ integer, add $1$, and start all over again if the result is too high (i.e., $> 20$).
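As a rough sketch of that usual rejection approach in Python (the name `rejection_d20` and the `roll` callable, which is assumed to return a uniform value in 1..6, are illustrative choices only):

```python
import random

def rejection_d20(roll):
    """Simulate a D20 from D6 rolls by rejection; returns (value, rolls_used)."""
    used = 0
    while True:
        # Two rolls, each shifted to 0..5, read as a two-digit base-6 integer.
        hi, lo = roll() - 1, roll() - 1
        used += 2
        value = 6 * hi + lo + 1  # uniform on 1..36
        if value <= 20:
            return value, used
        # Otherwise discard both rolls and start over.

if __name__ == "__main__":
    print(rejection_d20(lambda: random.randint(1, 6)))
```

Each attempt costs two rolls and succeeds with probability $\frac{20}{36}$, so this averages $2\cdot\frac{36}{20}=3.6$ rolls.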

I think this is wasteful. My preferred solution is to consider each roll (again subtracting $1$) as the next digit in a base-$6$ fraction. You stop as soon as the digits so far determine the result; for example, once you know your fraction will be between $\frac{16}{20}$ and $\frac{17}{20}$, you declare the answer to be $17$.
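One way this might be sketched in Python, using exact integer arithmetic so no rounding is involved (the name `fraction_method` and the `roll` callable, assumed to return a uniform value in 1..m, are hypothetical):

```python
import random

def fraction_method(m, n, roll):
    """Read Dm rolls as base-m digits of a fraction until the Dn result is fixed.

    Returns (value, rolls_used).
    """
    a, d = 0, 1  # the fraction is known to lie in [a/d, (a+1)/d)
    used = 0
    while True:
        a, d = a * m + (roll() - 1), d * m  # append the next base-m digit
        used += 1
        j = a * n // d  # largest j with j/n <= a/d
        # Stop once the whole interval fits inside [j/n, (j+1)/n).
        if (a + 1) * n <= (j + 1) * d:
            return j + 1, used

if __name__ == "__main__":
    print(fraction_method(6, 20, lambda: random.randint(1, 6)))
```

For instance, rolling $5$ then $6$ gives the interval $[\frac{29}{36},\frac{30}{36})$, which lies between $\frac{16}{20}$ and $\frac{17}{20}$, so the sketch returns $17$ after two rolls.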

My questions: when using a D$m$ to simulate a D$n$ in this way, what is the expected number of rolls necessary to reach a result? Is my technique the most efficient possible, in the sense that it minimizes the expected number of rolls?

Thanks for satisfying my curiosity. I hope you find this question interesting.

Robert Shore

1 Answer


The key insight is to look at the probability that our process has terminated after $k$ rolls, i.e. the cumulative distribution function for the number of rolls. The expected number of rolls is the sum over $k\ge 0$ of the probability that the process has not yet terminated after $k$ rolls, so increasing these termination probabilities decreases the expectation, and the minimum is attained if we can simultaneously maximize the termination probabilities for all $k$.

For the specific case of a $D6$ simulating a $D20$, your method is maximally efficient. There are $6^k$ possibilities for rolling $k$ dice, and we can show that, once $k\ge 2$, your method terminates with a definite answer for all but $16$ of them. After rolling $k$ dice, each sequence rolled represents an interval of length $6^{-k}$, which fails to return an answer if and only if some $\frac{j}{20}$ with $1\le j\le 19$ is in its interior. This fraction's base-$6$ expansion terminates with at most two digits for $j=5,10,15$, and continues infinitely for the other $16$ values of $j$. Therefore, for $k\ge 2$, the process terminates for $6^k-16$ of the possible sequences.

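Since $6^k\equiv 16\pmod{20}$ for all $k\ge 2$, this is maximal; no method can return a definite answer for more than $20\cdot\left\lfloor\frac{6^k}{20}\right\rfloor$ of the possible roll sequences, because each of the $20$ outcomes can be assigned at most $\left\lfloor\frac{6^k}{20}\right\rfloor$ of them before its probability would exceed $\frac{1}{20}$.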
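To spell out the congruence, a short induction suffices:

$$6^2 = 36 = 20 + 16,\qquad 6^{k+1} = 6\cdot 6^k \equiv 6\cdot 16 = 96 = 4\cdot 20 + 16 \equiv 16 \pmod{20}.$$

So $6^k\equiv 16 \pmod{20}$ for every $k\ge 2$, and hence $20\cdot\left\lfloor\frac{6^k}{20}\right\rfloor = 6^k-16$.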

It's not the only maximally efficient method. Let me propose an alternative:

First, roll one die repeatedly until we get something other than a $6$. Record the result.
Second, roll a die. If it's $1,2,3,4$, record the result. If it's $5$ or $6$, roll again. If the roll is $5$ and then odd, record $1$. If $5$ and even, record $2$. If $6$ and odd, record $3$. If $6$ and even, record $4$.
Combine the two recorded results: $4$ times (the first minus $1$) plus the second, which gives a value from $1$ to $20$.
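The steps above might be sketched as follows. This assumes the fourth case in the second step is "$6$ and even" and that the combination is $4\cdot(\text{first}-1)+\text{second}$, so the result lands in $1..20$; the name `alternative_d20` and the `roll` callable (a uniform value in 1..6) are illustrative only.

```python
import random

def alternative_d20(roll):
    """Two-step D6 -> D20 procedure; returns (value, rolls_used)."""
    used = 0
    # Step 1: reroll 6s; the first non-6 is uniform on 1..5.
    while True:
        first = roll()
        used += 1
        if first != 6:
            break
    # Step 2: 1..4 are kept; a 5 or 6 needs one more roll for its parity.
    second = roll()
    used += 1
    if second >= 5:
        extra = roll()
        used += 1
        # 5 maps to {1, 2}, 6 maps to {3, 4}; odd picks the lower value.
        second = 2 * (second - 5) + (1 if extra % 2 == 1 else 2)
    # Assumed combination: 4 * (first - 1) + second, ranging over 1..20.
    return 4 * (first - 1) + second, used

if __name__ == "__main__":
    print(alternative_d20(lambda: random.randint(1, 6)))
```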

We can show that this has the exact same expected number of die rolls $\frac{38}{15}\approx 2.53$ as your method. What's the advantage? Less to track. At any given stage, we only have to remember at most two previous dice, rather than the whole list. It's viable with mental math, whereas your method would practically require us to write things down.
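For reference, both expectations can be checked by hand. Write $T$ for the number of rolls and use $E[T]=\sum_{k\ge 0}P(T>k)$. The base-$6$ fraction method never stops before the second roll and is still undecided with probability $\frac{16}{6^k}$ after $k\ge 2$ rolls, so

$$E[T] = 1 + 1 + \sum_{k\ge 2}\frac{16}{6^k} = 2 + \frac{16}{36}\cdot\frac{1}{1-\frac{1}{6}} = 2 + \frac{8}{15} = \frac{38}{15}.$$

For the two-step alternative, the first step is a geometric wait with success probability $\frac{5}{6}$, and the second step takes one roll plus an extra roll with probability $\frac{1}{3}$:

$$E[T] = \frac{6}{5} + \left(1+\frac{1}{3}\right) = \frac{18}{15}+\frac{20}{15} = \frac{38}{15}.$$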

Now, what about the general problem of using a $Dm$ to simulate a $Dn$? This is where your method falls down a bit. The theoretical limit is a decision in $n\cdot\left\lfloor\frac{m^k}{n}\right\rfloor$ of the length-$k$ sequences, while your method typically makes a decision in $m^k-(n-1)$ of them, one fewer for each interior boundary $\frac{j}{n}$. If $m$ and $n$ have common factors, some of the boundaries will have terminating base-$m$ expansions and the number of undecided sequences drops a bit, but that can't match the possibility of $m^k$ being small mod $n$. Can we design an alternative method that systematically reaches the theoretical maximum?
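To see the gap concretely, here is a small brute-force count in Python. It is only a sketch: the function names are made up for illustration, and the loop is only practical for small $m^k$.

```python
def decided_by_fraction_method(m, n, k):
    """Count length-k roll sequences the base-m fraction method has decided.

    Sequence number a corresponds to the interval [a/m^k, (a+1)/m^k); it is
    decided exactly when no boundary j/n lies strictly inside that interval.
    """
    d = m ** k
    decided = 0
    for a in range(d):
        j = a * n // d  # largest j with j/n <= a/d
        if (a + 1) * n <= (j + 1) * d:
            decided += 1
    return decided

def theoretical_limit(m, n, k):
    """Upper bound n * floor(m^k / n) on sequences any method can decide."""
    return n * (m ** k // n)

if __name__ == "__main__":
    for m, n in [(6, 20), (6, 7)]:
        for k in range(1, 4):
            print(m, n, k, decided_by_fraction_method(m, n, k),
                  theoretical_limit(m, n, k))
```

For a $D6$ simulating a $D7$ with $k=2$, the count is $30$ against a limit of $35$, while for a $D6$ simulating a $D20$ both are $20$.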

Yes, we can. This iterative process will do it:

Initialize x=0, y=1, done=false
While (done is false)
    Multiply x and y by m
    Roll a Dm, add result -1 to x
    If x < n*floor(y/n) {set done = true}
    Reduce x and y mod n
Return x+1

The final value returned is, of course, the $Dn$ roll. It's not too hard to run the algorithm by hand; there are only two pieces of data $x$ and $y$ we need to track, and they're both bounded above since we keep reducing mod $n$. If we've got extra dice lying around, we could even track them on those extra dice rather than writing anything down.
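A minimal Python sketch of the iterative process, for anyone who would rather let a computer roll. The name `simulate_dn` and the `roll` callable (assumed to return a uniform value in 1..m) are illustrative. The acceptance test is written with a strict inequality: before the test $x$ ranges over $0,\dots,y-1$, so accepting $x = n\cdot\lfloor y/n\rfloor$ would give one outcome an extra chance.

```python
import random

def simulate_dn(m, n, roll):
    """Simulate a Dn from Dm rolls; returns (value, rolls_used)."""
    x, y = 0, 1  # x is uniform on 0..y-1 among the still-undecided cases
    used = 0
    while True:
        x = x * m + (roll() - 1)
        y *= m
        used += 1
        # The first n*floor(y/n) values split evenly among the n outcomes.
        if x < n * (y // n):
            return x % n + 1, used
        # Otherwise keep the leftover: x becomes uniform on 0..(y mod n)-1.
        x %= n
        y %= n

if __name__ == "__main__":
    print(simulate_dn(6, 20, lambda: random.randint(1, 6)))
```

Running every two-roll sequence through this sketch with $m=6$, $n=20$ decides $20$ of the $36$ sequences, one for each value from $1$ to $20$.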

jmerry