
Some time back, I asked a question regarding a mass function for a dice roll with the following rules:

  1. We roll $n$ fair, $d$-sided dice (numbered $1$ to $d$ as standard), choosing target numbers $x$ and $y$ such that $d \geq x > y \geq 1$ and $n,d,x,y \in \mathbb{N}^+$.
  2. For every die that rolls at least $x$, we add 1 to our score, then roll that die again.
  3. For every die that rolls at most $y$, we subtract 1 from our score, then roll that die again.
  4. If a die rolls strictly between $y$ and $x$, that die stops and is not rolled again.
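The rules above can be sketched as a short simulation, which is handy for sanity-checking any candidate mass function or CDF. This is only an illustrative sketch: the function name `roll_score` and its `rng` parameter are made up here, and it assumes each die is resolved independently of the others and that at least one face lies strictly between $y$ and $x$ (otherwise a die would never stop).

```python
import random

def roll_score(n, d, x, y, rng=random):
    """Simulate one play with n d-sided dice and return the final score."""
    # Assumption: we need at least one stopping face, i.e. x - y - 1 >= 1.
    if not (d >= x > y >= 1) or x - y - 1 <= 0:
        raise ValueError("need d >= x > y + 1 >= 2 so that every die can stop")
    score = 0
    for _ in range(n):                  # resolve each die on its own
        while True:
            face = rng.randint(1, d)    # uniform on 1..d, both ends inclusive
            if face >= x:
                score += 1              # rule 2: add 1, roll again
            elif face <= y:
                score -= 1              # rule 3: subtract 1, roll again
            else:
                break                   # rule 4: this die stops
    return score
```

Calling `roll_score(3, 6, 5, 2, random.Random(1))` many times and tabulating the results gives an empirical distribution to compare against the formula.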

After a good deal of additional work, I determined that the mass function for the probability of score $h$ on $n$ dice was:

$$\begin{align*}P(H=h)=\binom{n+|h|-1}{|h|}\left(\frac{x-y-1}{d}\right)^n\left(\frac{d-x+1}{d}\right)^{\max[0,h]}\\\cdot\left(\frac{y}{d}\right)^{-\min[0,h]}{}_2F_1\left(\tfrac{n+|h|}{2}, \tfrac{n+|h|+1}{2}, |h|+1, \tfrac{4y(d-x+1)}{d^2}\right)\end{align*}$$

where:

$$\begin{align*}{}_2F_1\left(\tfrac{n+|h|}{2}, \tfrac{n+|h|+1}{2}, |h|+1, \tfrac{4y(d-x+1)}{d^2}\right)=&\left(\frac{2d}{d+\sqrt{d^2-4y(d-x+1)}}\right)^{|h|}\left(\frac{d}{\sqrt{d^2-4y(d-x+1)}}\right)^{n}\\ &\cdot\sum_{j=0}^{n-1}\binom{n-1}{j}\frac{|h|!(n+j-1)!}{(n-1)!(|h|+j)!}\left(\frac{d-\sqrt{d^2-4y(d-x+1)}}{2\sqrt{d^2-4y(d-x+1)}}\right)^j \end{align*}$$
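As a numerical cross-check, the mass function can be evaluated in two ways: with the finite sum given above for the hypergeometric factor, and by directly summing the underlying series over the number of cancelling up/down pairs. The sketch below does both. The names `pmf_closed`, `pmf_series`, and the truncation parameter `terms` are illustrative choices rather than anything standard, and the series version assumes the truncation point is large enough for the parameters used.

```python
from math import comb, factorial, sqrt

def pmf_closed(h, n, d, x, y):
    """P(H = h) using the finite-sum form of the 2F1 factor."""
    a = abs(h)
    up, down, stop = (d - x + 1) / d, y / d, (x - y - 1) / d
    s = sqrt(d * d - 4 * y * (d - x + 1))    # the recurring square root
    t = (d - s) / (2 * s)
    total = sum(
        comb(n - 1, j)
        * factorial(a) * factorial(n + j - 1)
        / (factorial(n - 1) * factorial(a + j))
        * t ** j
        for j in range(n)
    )
    hyp = (2 * d / (d + s)) ** a * (d / s) ** n * total
    return (comb(n + a - 1, a) * stop ** n
            * up ** max(0, h) * down ** (-min(0, h)) * hyp)

def pmf_series(h, n, d, x, y, terms=200):
    """P(H = h) by a truncated sum over v extra up/down pairs (assumed cutoff)."""
    a = abs(h)
    up, down, stop = (d - x + 1) / d, y / d, (x - y - 1) / d
    total = 0.0
    for v in range(terms):
        ups, downs = (a + v, v) if h >= 0 else (v, a + v)
        # negative multinomial coefficient (n+ups+downs-1)! / ((n-1)! ups! downs!)
        ways = comb(n + ups + downs - 1, ups + downs) * comb(ups + downs, ups)
        total += ways * up ** ups * down ** downs
    return stop ** n * total
```

For small parameters such as $n=3$, $d=6$, $x=5$, $y=2$, the two functions should agree to floating-point precision, and summing `pmf_closed` over a wide range of $h$ should come out very close to $1$.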

I'm currently trying to work out where I'd even start in building a CDF for this distribution. Functionally, we're looking at a discrete Negative Multinomial Distribution where $h$ can be any integer: $-\infty < h < \infty$. Unlike a more standard Negative Binomial distribution, I can't just sum the mass function from $0$ to $h$, as one normally would. I've done a lot of reading, but I can't figure out how one would generally build a CDF for this kind of discrete distribution. If this were a continuous variable, this wouldn't be a problem, since I know there are ways to integrate from $-\infty$ to $h$, but I'm pretty certain that transforming this into a continuous function won't actually help.

So, I ask, what approach can we use to calculate the cumulative distribution function for this result? To be specific, I'd like to convert the infinite sum in this CDF into either a closed form, or some finite sum that can be exactly calculated.

  • Suspect there's a typo. For integers $h \in [-10,10]$, with $n = 4$, $d = 6$, $x = 2$, $y = 5$, the hypergeometric function is giving complex (frequently imaginary) values. This isn't so surprising, the argument to that ${}_2{}F_1$ is $25/9 \not\in [-1,1]$, so the value is likely not real. – Eric Towers May 26 '25 at 06:12
  • Also, why wouldn't the CMF be $\sum_{j = -\infty}^h P(H = j)$ just as for any other CMF on $(-\infty, \infty)$? – Eric Towers May 26 '25 at 06:13
  • ... or I swapped the role of $x$ and $y$ (because you haven't specified any constraints among the various parameters, like which are integers, which are positive, which are bounded, ...) – Eric Towers May 26 '25 at 06:16
  • Ya know, I probably should have bound those, yeah. That's an assumption that shouldn't be left an assumption. I'll go fix that. – Lee Davis-Thalbourne May 26 '25 at 08:16
  • Also, Yes, I'm sure that is what the CDF would be in general. But how would one calculate that? It's rather tricky to manually perform an infinite sum, and I definitely don't know how one would do so on a negative multinomial distribution! – Lee Davis-Thalbourne May 26 '25 at 08:26

1 Answer


I still haven't figured out a full solution, but here's a solution for a simpler problem, and hopefully we can use this as a stepping stone to a full solution.

Let's assume that $d-x+1=y$, or in other words, that our negative and positive results have the same probability, which we'll call $\frac{\alpha}{d}$, with $0 < \alpha < \frac{d}{2}$. In that case the distribution is symmetric about zero, so $P(H>0) = P(H<0)$. Since $P(H<0)$, $P(H=0)$ and $P(H>0)$ must sum to $1$, we get:

$$P(H>0) = P(H<0) = \frac{1-P(H=0)}{2}$$

And fortunately, $P(H=0)$ can be very easily reduced in this circumstance:

$$\begin{align*} P(H=0)&=\binom{n+0-1}{0}\left(\frac{d-2\alpha}{d}\right)^n\left(\frac{\alpha}{d}\right)^0\left(\frac{\alpha}{d}\right)^0 {}_2F_1\left(\frac{n}{2},\frac{n+1}{2},1,\frac{4\alpha^2}{d^2}\right)\\ &=\left(\frac{d-2\alpha}{\sqrt{d^2-4\alpha^2}}\right)^n\sum_{j=0}^{n-1}\binom{n-1}{j}\frac{(n)_j}{(1)_j}\left(\frac{d-\sqrt{d^2-4\alpha^2}}{2\sqrt{d^2-4\alpha^2}}\right)^j\\ &=\left(\frac{d-2\alpha}{\sqrt{d^2-4\alpha^2}}\right)^n\sum_{j=0}^{n-1}\binom{n-1}{j}\binom{n+j-1}{j}\left(\frac{d-\sqrt{d^2-4\alpha^2}}{2\sqrt{d^2-4\alpha^2}}\right)^j \end{align*}$$

Once we have $P(H=0)$, we can use it to get the total probability of either the positive or the negative side. From there, the CDF at $h$ only needs a standard finite sum of the mass function between $0$ and $h$ (in either direction), added to or subtracted from that total.
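The symmetric-case procedure can be sketched in code as follows. This is a rough sketch under the assumption $d-x+1=y=\alpha$ only. The names `pmf_sym` and `cdf_sym` are invented for illustration, and the factorial ratio from the finite sum is rewritten as a ratio of binomial coefficients, which is the same quantity.

```python
from math import comb, sqrt

def pmf_sym(h, n, d, alpha):
    """P(H = h) when up and down rolls each have probability alpha / d."""
    a = abs(h)
    s = sqrt(d * d - 4 * alpha * alpha)
    t = (d - s) / (2 * s)
    # |h|! (n+j-1)! / ((n-1)! (|h|+j)!) = C(n+j-1, j) / C(|h|+j, j)
    total = sum(
        comb(n - 1, j) * comb(n + j - 1, j) / comb(a + j, j) * t ** j
        for j in range(n)
    )
    hyp = (2 * d / (d + s)) ** a * (d / s) ** n * total
    return (comb(n + a - 1, a) * ((d - 2 * alpha) / d) ** n
            * (alpha / d) ** a * hyp)

def cdf_sym(h, n, d, alpha):
    """P(H <= h), using P(H < 0) = (1 - P(H = 0)) / 2 instead of an infinite sum."""
    below_zero = (1 - pmf_sym(0, n, d, alpha)) / 2
    if h >= 0:
        # P(H <= h) = P(H < 0) + P(0 <= H <= h)
        return below_zero + sum(pmf_sym(k, n, d, alpha) for k in range(0, h + 1))
    # P(H <= h) = P(H < 0) - P(h < H < 0)
    return below_zero - sum(pmf_sym(k, n, d, alpha) for k in range(h + 1, 0))
```

For example, `cdf_sym(2, 3, 6, 2)` would give $P(H \le 2)$ for three six-sided dice with $x=5$, $y=2$. Only finitely many mass-function evaluations are needed for any $h$.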

This won't work if the negative and positive outcomes have different probabilities, though, as the distribution would no longer be symmetric: one side would be more probable than the other, so halving $1-P(H=0)$ won't give the correct result. In that case, you would have to find a clever way to calculate $P(-\infty<H<0)$, which I've not yet been clever enough to figure out.