
I was asked to solve a probability problem. Although my solution appears to be correct, I think it can be simplified, but I struggle to see how.

There are two types of people: those in category $A$ pass a test with probability $p_a$, and those in category $B$ pass with probability $p_b$. There are $a$ people in category $A$ and $b$ people in category $B$. Let us also assume that each person's result is independent of everyone else's. What is the probability that exactly $c$ people pass the test, given $a$ and $b$?

Put another way: given the number of students in each category and their probabilities of passing, what is the probability that exactly $c$ students pass? It can be solved by enumerating the ways the $c$ passing students can be split between the categories: $0$ from $A$ and $c$ from $B$, $1$ from $A$ and $c - 1$ from $B$, and so on.

I derived the following formula which seems correct:

$p(c | a, b) = \sum_{x=0}^a {a \choose x} (p_a)^x (1 - p_a)^{a - x} {b \choose c - x} (p_b)^{c - x} (1 - p_b)^{b - c + x}$

This does just what is described above: it sums over every possible split of the passing students between the two categories. For simplicity, I assumed that $a < b$ always holds. However, when I plot this probability function, I observe what looks like a binomial distribution, which makes me wonder whether there is an easier way to represent it, namely as a standard binomial distribution such as

$p(c | a, b) = {\alpha \choose \beta} p ^ {\beta} (1 - p)^{\alpha - \beta}$
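For reference, the sum I derived can be evaluated directly. The following is only a minimal Python sketch of that formula; the function names `binom_pmf` and `pass_count_pmf` are illustrative and not taken from any library, and terms where $c - x$ falls outside $0, \dots, b$ are treated as zero.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); zero when k is outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def pass_count_pmf(c, a, b, p_a, p_b):
    """P(exactly c passes): sum over x passes from A and c - x from B."""
    return sum(binom_pmf(x, a, p_a) * binom_pmf(c - x, b, p_b)
               for x in range(a + 1))
```

Summing `pass_count_pmf(c, a, b, p_a, p_b)` over `c` from `0` to `a + b` should give 1 up to rounding, which is a quick way to check the formula.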

Roger May

1 Answer


In general, the answer is no: this is not binomially distributed.

I will consider the case $0 < p_a \neq p_b < 1$, so that the probabilities are not equal and neither distribution is degenerate (a degenerate distribution returns a non-random number). I comment on the other cases at the end.

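Further, I will write $A$ and $B$ for the number of people in each category, and $a$ and $b$ for the (random) number of passes in each. I will simplify to the case $A = B = 1$, though the argument holds for more general choices of $A,B$.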

In this case $p(c) > 0$ if and only if $0 \leq c \leq 2$, and therefore if $c$ were to have a binomial distribution it would have to be of the form

$$p(c) = \binom{2}{c}q^c(1-q)^{2-c}.$$

Let us assume that $c$ does have this distribution. Then its expectation would be $\mathbf E[c] = 2q$, and since $c = a+b$, by linearity of expectation we have $$ \mathbf E[c] = \mathbf E[a+b] = \mathbf E[a] + \mathbf E[b] = p_a + p_b,$$ which gives $$ q= \frac{p_a +p_b}{2}.$$

We can similarly assess the variance of $c$, which under the binomial assumption is $\text{Var}(c) = 2 q(1-q)$. Since variance is additive for independent variables $a,b$, we also have

$$\text{Var}(c) = \text{Var}(a) + \text{Var}(b) = p_a(1-p_a) + p_b(1-p_b).$$

Substituting the formula for $q$ derived above into $2q(1-q)$, this means

$$(p_a+ p_b)\left(1 - \frac{p_a + p_b}{2}\right) = p_a(1-p_a) + p_b(1-p_b),$$ which on rearranging gives

$$\frac12(p_a + p_b)^2 = p_a^2 + p_b^2.$$

However, the above is equivalent to $(p_a - p_b)^2 = 0$, so it can only be true if $p_a = p_b$, which we ruled out in our first assumption. Therefore $c$ is not binomially distributed.
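The mismatch can also be seen numerically. Below is a small Python sketch for the case $A = B = 1$; the values `p_a = 0.2` and `p_b = 0.6` are chosen purely for illustration. It compares the exact distribution of $c$ with the only binomial candidate, the one with $q = (p_a + p_b)/2$.

```python
p_a, p_b = 0.2, 0.6  # illustrative values, not from the question

# Exact distribution of c = a + b when A = B = 1 (two independent Bernoullis).
true_pmf = [
    (1 - p_a) * (1 - p_b),               # nobody passes
    p_a * (1 - p_b) + (1 - p_a) * p_b,   # exactly one passes
    p_a * p_b,                           # both pass
]

# The only binomial candidate: Binomial(2, q) with the matching mean.
q = (p_a + p_b) / 2
binom_pmf = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]

# The means agree by construction, but the variances do not.
true_var = p_a * (1 - p_a) + p_b * (1 - p_b)
binom_var = 2 * q * (1 - q)
gap = binom_var - true_var  # algebraically equal to (p_a - p_b)**2 / 2
```

The two lists differ in every entry, and `gap` vanishes only when `p_a == p_b`, in line with the argument above.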

Additional Comments

  1. In the case that $p_a = p_b$ then $c$ will be binomially distributed; a discussion of this is given here.
  2. In the case that $p_a = 0$ (or, symmetrically, $p_b = 0$), the sum will be binomially distributed, since $c$ will have the same distribution as $b$.
  3. In the case that $p_a = 1$, $c$ will have a shifted binomial distribution: it will be equal to $A + b$, where $A$ is a constant and $b$ is binomial, i.e. $$p(c) = \binom{B}{c-A} p_b^{c-A} (1-p_b)^{A+B -c}.$$
  4. Whilst we have shown that $c$ will not be binomially distributed, it is an example of a Poisson Binomial distribution.
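To illustrate the last point, the probability mass function of a Poisson binomial distribution can be built up one trial at a time. The following is a minimal Python sketch using that standard recursion; the function name `poisson_binomial_pmf` and the example values are assumptions made for illustration, not something from the question.

```python
def poisson_binomial_pmf(probs):
    """PMF of the number of successes among independent trials
    with success probabilities `probs` (not necessarily equal)."""
    pmf = [1.0]  # with zero trials, zero successes is certain
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)   # this trial fails
            nxt[k + 1] += mass * p     # this trial succeeds
        pmf = nxt
    return pmf

# Example: 3 people in category A and 5 in category B (illustrative values).
A, B, p_a, p_b = 3, 5, 0.2, 0.6
pmf = poisson_binomial_pmf([p_a] * A + [p_b] * B)
```

Here `pmf[c]` is the probability that exactly `c` people pass, and the same function handles any mix of per-person probabilities, not just two categories.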
owen88
  • Thank you, @owen88, your explanation was spot on. I originally followed a similar procedure to determine if it could be rearranged into a traditional binomial distribution, but I was unsuccessful thinking I made mistakes in the process. I was not aware of the existence of Poisson Binomials and I will look more into it. Thank you! – Roger May Nov 05 '18 at 09:18
  • Glad to have been of assistance; if you are happy with the answer, can I ask you to upvote / accept (click the tick) the post. – owen88 Nov 05 '18 at 10:50