Mass Function for a Geometric distribution with two non-fail outcomes.

Question

Let's say we roll a fair $d$-sided die. If the die rolls $x$ or higher, we add 1 to the success count (call it $h$) and roll again; if it rolls $y$ or lower (with $y<x$), we subtract 1 from $h$ and roll again; and if it rolls strictly between $y$ and $x$, we stop rolling.

So, we have three potential outcomes of any given roll:

$P$, which has a probability of $\frac{d-x+1}{d}$, adds $1$ to $h$ and has us roll again.
$N$, which has a probability of $\frac{y}{d}$, subtracts $1$ from $h$ and has us roll again.
$S$, which has a probability of $\frac{x-y-1}{d}$, halts the game.
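As a quick sanity check on the three probabilities above, here is a minimal sketch that computes them exactly for given $d$, $x$, $y$. The function name `outcome_probabilities` and the range check are illustrative assumptions, not part of the original question.

```python
from fractions import Fraction


def outcome_probabilities(d, x, y):
    """Return exact (P, N, S) probabilities for one roll of a fair d-sided die.

    P: roll >= x (add 1), N: roll <= y (subtract 1), S: y < roll < x (stop).
    Assumes 0 <= y < x <= d + 1 so that the three events partition the faces.
    """
    if not (0 <= y < x <= d + 1):
        raise ValueError("need 0 <= y < x <= d + 1")
    p = Fraction(d - x + 1, d)   # faces x, x+1, ..., d
    n = Fraction(y, d)           # faces 1, ..., y
    s = Fraction(x - y - 1, d)   # faces y+1, ..., x-1
    return p, n, s


if __name__ == "__main__":
    p, n, s = outcome_probabilities(10, 8, 2)
    print(p, n, s, p + n + s)
```

For a ten-sided die with $x=8$ and $y=2$, this gives $3/10$, $1/5$ and $1/2$, which sum to 1.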

I have two questions I want help on: How can we calculate the probability of an arbitrary score in this game? And how do we generalise this to $m$ dice (where we start with $m$ dice, rerolling each die until that die hits the stop outcome, with $h$ being the accumulated score of all dice)?

I can build a probability generating function for this problem, and indeed it's not particularly hard:

$$F(z)=\frac{x-y-1}{d}+\frac{(d-x+1)zF(z)}{d}+\frac{yz^{-1}F(z)}{d}$$

We can then unroll the recursion into:

$$F(z)=\frac{x-y-1}{d-(d-x+1)z-yz^{-1}}$$

The problem is converting this into a mass function. The distribution seems to have a geometric component with two possible continuing outcomes, so even for a single die there are infinitely many roll sequences that produce any given $h$: every sequence that ends with $h$ more $P$ outcomes than $N$ outcomes needs to be counted. I'm not clear on the best approach to doing this, let alone what the appropriate strategy would be for $m$ dice.
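Before looking for an exact formula, the game can be simulated to get empirical probabilities to compare against. The following is a minimal Monte Carlo sketch; the names `play` and `estimate_pmf`, the default trial count and the seed are all illustrative assumptions.

```python
import random
from collections import Counter


def play(d, x, y, m=1, rng=random):
    """Play one game with m dice and return the accumulated score h."""
    if x - y - 1 <= 0:
        raise ValueError("no stopping faces: the game would never end")
    h = 0
    for _ in range(m):
        # Keep rerolling this die until it lands on a stopping face.
        while True:
            roll = rng.randint(1, d)
            if roll >= x:
                h += 1
            elif roll <= y:
                h -= 1
            else:
                break
    return h


def estimate_pmf(d, x, y, m=1, trials=100_000, seed=0):
    """Estimate Pr[h = k] for each observed k by repeated simulation."""
    rng = random.Random(seed)
    counts = Counter(play(d, x, y, m, rng) for _ in range(trials))
    return {score: c / trials for score, c in sorted(counts.items())}


if __name__ == "__main__":
    for score, prob in estimate_pmf(10, 8, 2).items():
        print(score, round(prob, 4))
```

The estimates are only as good as the number of trials, so they are useful for spotting errors in a candidate formula rather than for proving one.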

To get an arbitrary score of $~K,~$ you need $~L~$ successes, $~L-K~$ failures, before you roll the stopping throw. Letting $~f(L)~$ denote the probability of this happening as a function of $~L,~$ then the desired computation is $$\lim_{M \to \infty} \sum_{L=K}^M f(L).$$ Note that if you roll the die $~(2L-K)~$ times, before the first stopping throw, you will need the factor of $~\displaystyle \binom{2L-K}{L}~$ to represent the number of ways of selecting the $~L~$ successes from the $~2L-K~$ throws of the die. — user2661923, Jan 15 '24 at 04:45
See also Binomial Distribution, specifically $\displaystyle \binom{n}{k}p^kq^{(n-k)}.$ — user2661923, Jan 15 '24 at 04:51

heropup · Accepted Answer · 2024-01-15T07:04:36.573

The idea is to first construct a vector-valued random variable that counts the number of successes and failures observed until the stopping criterion is met. Let $(P,N)$ be an ordered pair of nonnegative integers that counts the random number of successes and failures obtained. Additionally, for convenience, let $\theta$ represent the probability of success for a single trial, and $\phi$ be the probability of failure, such that $\theta + \phi < 1$. Then $1 - \theta - \phi$ is the probability of stopping. Consequently,

$$\Pr[(P = p) \cap (N = n)] = \binom{p+n}{p} \theta^p \phi^n (1-\theta-\phi), \quad p, n \in \{0, 1, 2, \ldots \}.$$

We reason that a particular sequence of $p$ successes and $n$ failures, followed by a stopping roll, has probability of $\theta^p \phi^n (1-\theta-\phi)$ of occurring; however, there are $\binom{p+n}{p}$ such sequences, each equally probable. As an exercise for the reader, one may confirm that

$$\sum_{p=0}^\infty \sum_{n=0}^\infty \Pr[(P = p) \cap (N = n)] = 1.$$

The next step is to compute the net score, which is $X = P - N$. This corresponds to

$$\begin{align} \Pr[X = x] &= \Pr[P - N = x] \\ &= \sum_{n=\max(0,-x)}^\infty \Pr[(P = x + n) \cap (N = n)] \\ &= \sum_{n=\max(0,-x)}^\infty \binom{x+2n}{n} \theta^{x+n} \phi^n (1-\theta-\phi) \\ &= \frac{1-\theta-\phi}{\sqrt{1 - 4 \theta \phi}} \begin{cases} \left(\frac{2\theta}{1 + \sqrt{1 - 4 \theta \phi}}\right)^x, & x \ge 0 \\ \left(\frac{1 + \sqrt{1 - 4 \theta \phi}}{2\phi} \right)^x, & x < 0 \end{cases} \\ &= \frac{1-\theta-\phi}{\sqrt{1 - 4 \theta \phi}} \left( \frac{2(\theta \mathbb 1 (x \ge 0) + \phi \mathbb 1 (x < 0))}{1 + \sqrt{1 - 4 \theta \phi}} \right)^{|x|}. \end{align}$$
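One way to see where the closed form of the infinite sum comes from, offered as a sketch and not necessarily the route the answerer took: it follows from a standard generating-function identity for the coefficients $\binom{x+2n}{n}$. The symbols $t$ and $C(t)$ are notation introduced here. For $x \ge 0$ and $|t| < \tfrac14$,

$$\sum_{n=0}^{\infty}\binom{x+2n}{n}t^n=\frac{C(t)^x}{\sqrt{1-4t}},\qquad C(t)=\frac{1-\sqrt{1-4t}}{2t}=\frac{2}{1+\sqrt{1-4t}}.$$

Since $\theta+\phi<1$ implies $\theta\phi<\tfrac14$, we may set $t=\theta\phi$. Pulling $\theta^x(1-\theta-\phi)$ out of the sum then gives, for $x\ge0$,

$$\Pr[X=x]=(1-\theta-\phi)\,\theta^x\cdot\frac{1}{\sqrt{1-4\theta\phi}}\left(\frac{2}{1+\sqrt{1-4\theta\phi}}\right)^x.$$

For $x<0$, substituting $n=k+|x|$ turns the summand into $\binom{|x|+2k}{k}\phi^{|x|+k}\theta^{k}$, which is the same sum with $\theta$ and $\phi$ swapped and $|x|$ in place of $x$.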

What this tells us is that $X$ is a kind of generalized double geometric distribution. [Figure omitted: plots of $\Pr[X = x]$ for various values of $\theta, \phi$.]
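The closed form can be checked numerically against a truncated version of the infinite sum. This is a minimal sketch: the function names and the truncation at 200 terms are assumptions chosen for illustration, and the truncation is only adequate when $4\theta\phi$ is not too close to 1.

```python
from math import comb, sqrt


def pmf_single_series(x, theta, phi, terms=200):
    """Truncated infinite sum for Pr[X = x] with a single die."""
    start = max(0, -x)
    total = sum(
        comb(x + 2 * n, n) * theta ** (x + n) * phi ** n
        for n in range(start, start + terms)
    )
    return (1 - theta - phi) * total


def pmf_single_closed(x, theta, phi):
    """Closed form: a two-sided geometric shape with different ratios per side."""
    root = sqrt(1 - 4 * theta * phi)
    base = (2 * theta if x >= 0 else 2 * phi) / (1 + root)
    return (1 - theta - phi) / root * base ** abs(x)


if __name__ == "__main__":
    theta, phi = 0.3, 0.2
    for x in range(-3, 4):
        print(x, pmf_single_series(x, theta, phi), pmf_single_closed(x, theta, phi))
```

The two columns should agree to floating-point precision, and summing the closed form over a wide range of $x$ should give a value very close to 1.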

Great answer! Can you expand a little on how you solved the infinite sum? It's not obvious to me how I'd get that answer! — Lee Davis-Thalbourne, Jan 15 '24 at 07:38
After some extra research, I'm pretty confident that I now understand where the infinite sum came from. The last question is "How do we generalise this to $m$ dice?" I have a theory that we'd add a multiset coefficient of $\binom{m}{2n+x}$ to our infinite sum. My Excel calculations show that this does seem to converge; the question becomes what it converges to. — Lee Davis-Thalbourne, Jan 17 '24 at 01:20

Lee Davis-Thalbourne · Answer 2 · 2025-04-29T23:59:59.853

This question originally had a second part: How do we generalise this for $m$ dice?

heropup's answer points out that this is, in effect, an infinite sum of binomial outcomes. If we have $m$ dice, it's clear that we must then allocate this infinite sum of outcomes to those $m$ dice, which means that we need to add a multiset coefficient to the infinite sum:

$$P(X=x)=\sum_{n=\max(0,-x)}^{\infty}\binom{m+x+2n-1}{x+2n}\binom{x+2n}{n}\theta^{x+n}\phi^n(1-\theta-\phi)^m$$
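Since the $m$-dice score is the sum of $m$ independent single-die scores, the sum above can be cross-checked against an $m$-fold convolution of the single-die mass function. The sketch below does this numerically; the function names, the 200-term truncation and the convolution window `radius` are illustrative assumptions.

```python
from math import comb, sqrt


def pmf_single(x, theta, phi):
    """Single-die closed form from the accepted answer."""
    root = sqrt(1 - 4 * theta * phi)
    base = (2 * theta if x >= 0 else 2 * phi) / (1 + root)
    return (1 - theta - phi) / root * base ** abs(x)


def pmf_multi_series(x, m, theta, phi, terms=200):
    """Truncated infinite sum for Pr[X = x] with m dice."""
    start = max(0, -x)
    total = 0.0
    for n in range(start, start + terms):
        ways = comb(m + x + 2 * n - 1, x + 2 * n) * comb(x + 2 * n, n)
        total += ways * theta ** (x + n) * phi ** n
    return (1 - theta - phi) ** m * total


def pmf_multi_convolution(x, m, theta, phi, radius=80):
    """Convolve the single-die pmf (truncated to [-radius, radius]) m times."""
    single = [pmf_single(k, theta, phi) for k in range(-radius, radius + 1)]
    dist = [1.0]  # point mass at score 0 before any die is rolled
    for _ in range(m):
        new = [0.0] * (len(dist) + len(single) - 1)
        for i, a in enumerate(dist):
            for j, b in enumerate(single):
                new[i + j] += a * b
        dist = new
    # After m convolutions, index 0 corresponds to score -m * radius.
    return dist[x + m * radius]


if __name__ == "__main__":
    for x in range(-3, 4):
        print(x, pmf_multi_series(x, 3, 0.3, 0.2), pmf_multi_convolution(x, 3, 0.3, 0.2))
```

Agreement between the two columns supports the multiset-coefficient form, though a numerical match at a few parameter values is of course not a proof.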

We can use some creative telescoping similar to this answer to figure out what, exactly, we're looking at:

$$\begin{align*} F(n)=&\binom{m+x+2n-1}{x+2n}\binom{x+2n}{n}\\ =&\frac{(m+x+2n-1)!\,(x+2n)!}{n!\,(x+n)!\,(x+2n)!\,(m-1)!}\\ =&\frac{(m+x+2n-1)!}{n!\,(x+n)!\,(m-1)!}\\ \frac{F(n+1)}{F(n)}=&\frac{(m+x+2(n+1)-1)!}{(n+1)!\,(x+n+1)!\,(m-1)!}\cdot\frac{n!\,(x+n)!\,(m-1)!}{(m+x+2n-1)!}\\ =&\frac{(m+x+2n+1)!\,n!\,(x+n)!}{(m+x+2n-1)!\,(n+1)!\,(x+n+1)!}\\ =&\frac{(m+x+2n)(m+x+2n+1)}{(x+n+1)(n+1)}\\ =&4\,\frac{\left(\frac{m+x}{2}+n\right)\left(\frac{m+x+1}{2}+n\right)}{(x+1+n)(n+1)} \end{align*}$$

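This is a hypergeometric sum! A series whose consecutive terms have the ratio $\frac{(a+n)(b+n)}{(c+n)(n+1)}z$ is the Gauss hypergeometric series $\,_2F_1(a,b;c;z)$ times its first term, so for $x\geq0$ the infinite sum can be represented as:

$$\sum_{n=0}^{\infty}\binom{m+x+2n-1}{x+2n}\binom{x+2n}{n}\theta^{n}\phi^n=\binom{m+x-1}{x}\,_2F_1\left(\tfrac{m+x}{2},\tfrac{m+x+1}{2};x+1;4\theta\phi\right)$$

For $x<0$, reindexing with $n=k+|x|$ gives the same series with $|x|$ in place of $x$, multiplied by $(\theta\phi)^{|x|}$.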

Therefore:

$$P(X=x)=\binom{m+|x|-1}{|x|}(1-\theta-\phi)^m\begin{cases} \,_2F_1\left(\tfrac{m+x}{2},\tfrac{m+x+1}{2};x+1;4\theta\phi\right)\theta^x & \text{ if } x\geq0 \\ \,_2F_1\left(\tfrac{m-x}{2},\tfrac{m-x+1}{2};-x+1;4\theta\phi\right)\phi^{-x} & \text{ if } x<0 \end{cases}$$
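The hypergeometric form can be evaluated directly by summing the Gauss series term by term, which also gives a numerical check against the original infinite sum. This is a minimal sketch with a hand-rolled series (the names `hyp2f1_series`, `pmf_multi_hyp` and `pmf_multi_series` and the fixed term counts are assumptions); a library routine such as `scipy.special.hyp2f1` could be swapped in where SciPy is available.

```python
from math import comb


def hyp2f1_series(a, b, c, z, terms=500):
    """Partial sum of the Gauss series for 2F1(a, b; c; z), for |z| < 1."""
    term = 1.0
    total = 1.0
    for n in range(terms):
        # Ratio of consecutive terms: (a+n)(b+n) / ((c+n)(n+1)) * z
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
    return total


def pmf_multi_hyp(x, m, theta, phi):
    """Pr[X = x] for m dice via the hypergeometric representation."""
    k = abs(x)
    lead = theta ** k if x >= 0 else phi ** k
    f = hyp2f1_series((m + k) / 2, (m + k + 1) / 2, k + 1, 4 * theta * phi)
    return comb(m + k - 1, k) * (1 - theta - phi) ** m * lead * f


def pmf_multi_series(x, m, theta, phi, terms=200):
    """Truncated version of the original infinite sum, for comparison."""
    start = max(0, -x)
    total = sum(
        comb(m + x + 2 * n - 1, x + 2 * n) * comb(x + 2 * n, n)
        * theta ** (x + n) * phi ** n
        for n in range(start, start + terms)
    )
    return (1 - theta - phi) ** m * total


if __name__ == "__main__":
    for x in range(-3, 4):
        print(x, pmf_multi_hyp(x, 3, 0.3, 0.2), pmf_multi_series(x, 3, 0.3, 0.2))
```

The series converges slowly when $4\theta\phi$ is close to 1, so the fixed term counts here are only suitable for moderate parameter values.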

The final step is converting the hypergeometric function to something we can actually calculate to get a result. That particular question has been answered here, and taking that answer we get:

$$ \begin{align*} P(X=x)=&\binom{m+|x|-1}{|x|}(1-\theta-\phi)^m\theta^{\max(0,x)}\phi^{-\min(0,x)}2^{|x|}|x|!(4\theta\phi )^{-\frac{|x|}{2}}(1-4\theta\phi )^{-\frac{m}{2}}\\ &\cdot\left( \frac{1-\sqrt{1-4\theta\phi }}{1+\sqrt{1-4\theta\phi }}\right)^{\frac{|x|}{2}}\sum_{j=0}^{m-1}\frac{(m+j-1)!}{j!(m-j-1)!(|x|+j)!}\left (\frac{1-\sqrt{1-4\theta\phi }}{2\sqrt{1-4\theta\phi }} \right )^j\\ =&\binom{m+|x|-1}{|x|}(1-\theta-\phi)^m\theta^{\max(0,x)}\phi^{-\min(0,x)}2^{|x|}|x|!\left (\frac{1}{\sqrt{1-4\theta\phi}} \right )^{m}\\ &\cdot\left( \frac{1-\sqrt{1-4\theta\phi}}{4\theta\phi}\right)^{|x|}\sum_{j=0}^{m-1}\frac{(m+j-1)!}{j!(m-j-1)!(|x|+j)!}\left (\frac{1-\sqrt{1-4\theta\phi }}{2\sqrt{1-4\theta\phi }} \right )^j\\ =&\sum_{j=0}^{m-1}\frac{(m+j-1)!}{(m-j-1)!}\cdot\frac{|x|!}{j!\,(|x|+j)!}\binom{m+|x|-1}{|x|}\frac{(1-\theta-\phi)^m\left (1-\sqrt{1-4\theta\phi } \right )^j}{2^{j}\left (\sqrt{1-4\theta\phi} \right )^{m+j}}\begin{cases} \left( \frac{2\theta\left (1-\sqrt{1-4\theta\phi} \right )}{4\theta\phi}\right)^{|x|} & \text{ if } x\geq0 \\ \left( \frac{2\phi\left (1-\sqrt{1-4\theta\phi} \right )}{4\theta\phi}\right)^{|x|} & \text{ if } x<0 \end{cases} \\ \end{align*}$$
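The finite-sum form is straightforward to compute. The sketch below implements the expression as it stands before the final regrouping (the line with the factor $\left(\frac{1}{\sqrt{1-4\theta\phi}}\right)^m$); the function name `pmf_multi_finite` is an assumption, and the code assumes $\theta>0$ and $\phi>0$ so that the division by $4\theta\phi$ is defined.

```python
from math import comb, factorial, sqrt


def pmf_multi_finite(x, m, theta, phi):
    """Pr[X = x] for m dice using the finite sum over j = 0, ..., m - 1.

    Assumes theta > 0, phi > 0 and theta + phi < 1.
    """
    k = abs(x)
    z = 4 * theta * phi
    root = sqrt(1 - z)
    lead = theta ** k if x >= 0 else phi ** k
    ratio = (1 - root) / (2 * root)
    inner = sum(
        factorial(m + j - 1)
        / (factorial(j) * factorial(m - j - 1) * factorial(k + j))
        * ratio ** j
        for j in range(m)
    )
    return (
        comb(m + k - 1, k)
        * (1 - theta - phi) ** m
        * lead
        * factorial(k)
        * root ** (-m)
        * (2 * (1 - root) / z) ** k
        * inner
    )


if __name__ == "__main__":
    for x in range(-3, 4):
        print(x, pmf_multi_finite(x, 3, 0.3, 0.2))
```

Unlike the infinite sum, this needs only $m$ terms per value of $x$, so its cost does not depend on how close $4\theta\phi$ is to 1.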

Note that we recover heropup's answer by setting $m=1$. In that case the sum has only one term (since $m-1=0$, only $j=0$ contributes), and the combinatorial coefficients all equal 1, so they drop out as well.
