0

The problem I'm addressing is to find the probability mass function of the sum of $n$ i.i.d. Random Variables, each of them having a categorical distribution with outcomes $-1$, $0$ and $1$ with probabilities $p_1$, $p_2$ and $p_3$ respectively.

This can be seen as a generalization of a binomial distribution (which can be defined as the distribution of the sum of $n$ i.i.d. Bernoulli Random Variables) when the outcome $-1$ is also allowed (and therefore the sum could decrease).

I encountered this problem in the context of random walks, and I was able to find the solution inspired by this answer and also by the answers to this question I posted earlier, so I wanted to share it as an answer to my own question. I also would really appreciate if some else is able to come up with a better solution or a better explanation or mathematical formulation.

J. R. C.
  • 311

1 Answers1

1

Let $S$ be the random variable defined as:

$$S = \sum_{i=1}^{n} X_i $$

Where the Random Variables $X_i$ are i.i.d. categorical distributed with outcomes $-1$, $0$ and $1$ with probabilities $p_1$, $p_2$ and $p_3$ respectively.

Then the probability of obtaining $S=s$ can be found as the coefficient of $x^s$ in the expression $(p_1x^{-1}+p_2+p_3x)^n$. If we apply the multinomial theorem we obtain

$$ (p_1x^{-1}+p_2+p_3x)^n = \sum_{k_1+k_2+k_3=n} \binom{n}{k_1,k_2,k_3} (p_1^{k_1}p_2^{k_2}p_3^{k_3})x^{k_3-k_1}$$

where the sum goes over all combinations of non negative integers $k_1$, $k_2$ and $k_3$ such that their sum is $n$. Because we are only interested in the term with $x^s$, the indexes must also satisfy the constrain $k_3-k_1=s$, or equivalently: $$k_3 = k_1 + s$$ This, together with the condition that $k_3\ge0$, also imposes that $k_1+s\ge0$, which implies that the minimum value of $k_1$, which we can call $\check{k}_1$, can be written as

$$\check{k}_1=max(0,-s)$$

Also, replacing $k_3$ as a function of $k_1$, we have:

$$2k_1 + k_2 = n-s$$

By inspection of this equation (writing down a few examples for small values of $n-s$) is easy to see that the maximum value of $k_1$, that we can call $\hat{k}_1$, is

$$\hat{k}_1 = \left \lfloor{\frac{n-s}{2}}\right \rfloor$$

and $k_2$ is just

$$k_2 = n-s-2k_1$$

Putting everything together, we arrive to

$$ P(S=s;n,p_1,p_2,p_3) = \sum_{max(0,-s)}^{\left \lfloor{\frac{n-s}{2}}\right \rfloor} \binom{n}{k_1,\,n-s-2k_1,\,k_1+s} \,p_1^{k_1}p_2^{n-s-2k_1}p_3^{k_1+s}$$

J. R. C.
  • 311