Probability of every ball occurring in multiple independent random samples

Question

An urn contains 5 distinct numbered balls. You choose 2 without replacement. You then reset the urn and choose another 2 without replacement. Do this one more time. Now you have three random samples of size 2. What is the probability that all of the numbered balls appear at least once in your 3 random samples of size 2?

My idea is to approach this through complements: find the probability that one ball is missing, that two balls are missing, and that three balls are missing, then subtract all of these scenarios from one. Is this a solid approach?

Here is the work:

P(all numbered balls appear at least once) = 1 - P(at least one ball is missing)

P(at least one ball is missing) = P(one ball missing) + P(two balls are missing) + P(three balls are missing)

I found the probabilities of one ball, two balls, and three balls missing to be the following:

P(one ball missing) = $$\left(1*{4 \choose 2}/{5 \choose2}\right)^2$$

P(two balls missing) = $$\left(1*{3 \choose 2}/{5 \choose2}\right)^2$$

P(three balls missing) = $$\left(1*{2 \choose 2}/{5 \choose2}\right)^2$$

This is because the first time you can choose any two balls. In the one-ball-missing case, each later draw can only choose two balls from four of the five possible balls: either the original two or any other pair, as long as one ball is left out. This is true for both random samples following the first. Any thoughts?

I would approach this via a probability state-transition model, aka Markov chains. It isn't clear from the little you've written how comfortable I might expect you to be with matrix multiplication, etc. needed to follow my approach. You will help your Readers to form a cogent response if you supply more context, including your own thoughts about how to approach this. — hardmath, Feb 01 '16 at 02:28
So counting by inclusion/exclusion, very good. This is simpler than what I was thinking, although with bigger numbers inclusion/exclusion tends to grow fairly tedious. Be careful to note that cases where two balls are missing have already been counted among the cases where one ball is missing, and likewise the cases where three balls are missing have been counted among those where two are missing. — hardmath, Feb 01 '16 at 02:40
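
For reference, here is a sketch of how the inclusion/exclusion count mentioned in the comment could be carried out. It assumes the following setup, which is not spelled out in the original post: a fixed set of $k$ balls is avoided by a single draw with probability $\binom{5-k}{2}/\binom{5}{2}$, hence by all three independent draws with that probability cubed, and there are $\binom{5}{k}$ such sets. Terms with $k \ge 4$ vanish because a draw of two balls cannot avoid four or more balls.

$$ P(\text{all five appear}) = \sum_{k=0}^{3} (-1)^k \binom{5}{k} \left( \frac{\binom{5-k}{2}}{\binom{5}{2}} \right)^{3} = 1 - 5(0.6)^3 + 10(0.3)^3 - 10(0.1)^3 = 1 - 1.08 + 0.27 - 0.01 = 0.18 $$

The alternating signs are what correct for the double counting the comment warns about.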

hardmath · Accepted Answer · 2016-02-01T13:00:05.270

I encourage you to work through the details of your own approach. Here is my alternative.

After the first drawing (a pair of balls), we have exactly two balls (out of five possible) that have been drawn (and we can never get fewer than this in the end).

The second drawing can affect which balls have been seen. There are three possible outcomes, labelled below by how many new balls (not seen in the first drawing) the second drawing contributes.

0. We might get the same two balls as we did the first time.

1. We might get one of the original two balls and a third ball not seen in the first drawing.

2. We might get neither of the original two balls, so two new balls from this second drawing (for a total of four seen after both drawings).

Given that there are $\binom{5}{2}=10$ ways the second drawing can occur, counting the number of $2$-subsets of the five (distinct) balls, it should be pretty easy to work out the probability for each of these three outcomes. Only one of the ten ways corresponds to case 0. Case 2. corresponds to $\binom{3}{2}=3$ of them. Thus case 1. has probability $6$ out of ten, or $0.6$.
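
The three counts can be checked by listing all ten pairs. This is only a minimal sketch: it assumes the balls are labelled 1 to 5 and that the first drawing was `{1, 2}` (any other pair gives the same tallies by symmetry), and the variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

balls = range(1, 6)   # five distinct balls; the labels 1..5 are an assumption
first = {1, 2}        # assumed first drawing (without loss of generality)

# Tally the ten possible second drawings by how many new balls they add.
new_ball_counts = Counter(
    len(set(second) - first) for second in combinations(balls, 2)
)

for new, ways in sorted(new_ball_counts.items()):
    print(f"case {new}: {ways} of 10 pairs")
# case 0: 1 of 10 pairs
# case 1: 6 of 10 pairs
# case 2: 3 of 10 pairs
```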

If case 0. occurred in the second drawing, only two balls have been seen so far, and the third and final drawing can add at most two more. In that case we cannot end up with all five balls.

If case 1. occurred in the second drawing, then we have a "fighting chance". We have seen three of the balls, and what is needed is to draw exactly the two balls in the third drawing that had not been seen before. As just pointed out, the chance of doing that in the third drawing is the same as case 0. in the second drawing: one chance in ten.

Finally, if case 2. occurred in the second drawing, it will be a lot more likely to succeed in getting the one missing ball (we have two "chances" to get it). A little calculation shows that the probability of going from case 2. in the second drawing to getting all five balls in the end is four chances in ten. (It might be easier to count how many ways there are to fail to get the last ball, namely six ways out of ten.)

Now put the possible paths to success together in a non-overlapping way. We could get all five balls in three drawings either by:

A. Getting case 1. in the second drawing and filling out the sample in the third drawing. Probability is $0.6 \times 0.1 = 0.06$.

B. Getting case 2. in the second drawing and sampling the missing fifth ball in the third drawing. Probability is $0.3 \times 0.4 = 0.12$.

Add the probabilities of these disjoint outcomes and you have the combined probability that all five balls will appear at least once in the three samples.
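
As a cross-check on the case analysis above, the whole experiment is small enough to enumerate. The sketch below assumes balls labelled 1 to 5 and treats each of the $10^3$ ordered triples of drawings as equally likely, which follows from the drawings being independent and uniform.

```python
from fractions import Fraction
from itertools import combinations, product

pairs = list(combinations(range(1, 6), 2))   # the 10 possible drawings of 2 balls

# Count ordered triples of drawings that together show all five balls.
favourable = sum(
    1
    for draws in product(pairs, repeat=3)
    if len(set().union(*draws)) == 5
)

probability = Fraction(favourable, len(pairs) ** 3)
print(favourable, probability)   # 180 9/50
```

The result, $9/50 = 0.18$, agrees with $0.06 + 0.12$ from paths A and B.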

Let me outline how we can make this computation scale up to similar but more complicated situations. In the Markov chain approach, which we've sketched in words above, one identifies "states" that occur and the probabilities (when a drawing is made) of transitioning from one state to another. We then format all these probabilities into a state-transition matrix.

Here we have considered states based on how many balls have been seen. Initially no balls have been seen, and up to five balls may have been seen after three draws. So we'll label the six states $B_0,B_1,\ldots,B_5$ according to how many different balls have been seen.

The Reader will note that when a draw begins in state $B_0$, there is a probability of $1$ that a transition to state $B_2$ will occur. Similarly the transition probabilities from state $B_2$ are calculated above as follows:

$$ \begin{align*} B_2 &\to B_0 &: \; 0.0 \\ &\to B_1 &: \; 0.0 \\ &\to B_2 &: \; 0.1 \\ &\to B_3 &: \; 0.6 \\ &\to B_4 &: \; 0.3 \\ &\to B_5 &: \; 0.0 \end{align*} $$

If we compile all the transition probabilities into a matrix, where $M_{ij}$ is the chance of going from $B_i$ to $B_j$, then:

$$ M = \begin{bmatrix} 0 & 0 & 1.0 & 0 & 0 & 0 \\ 0 & 0 & 0.4 & 0.6 & 0 & 0 \\ 0 & 0 & 0.1 & 0.6 & 0.3 & 0 \\ 0 & 0 & 0 & 0.3 & 0.6 & 0.1 \\ 0 & 0 & 0 & 0 & 0.6 & 0.4 \\ 0 & 0 & 0 & 0 & 0 & 1.0 \end{bmatrix} $$
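
Each entry of $M$ is a hypergeometric probability: to move from $i$ seen balls to $j$, the draw must contain $j-i$ of the $5-i$ unseen balls and $2-(j-i)$ of the $i$ seen ones. The sketch below builds the matrix from that count, which is one way the computation might be scaled to other urn and sample sizes. The function name `transition_matrix` and its parameters `n_balls` and `draw_size` are hypothetical, introduced here for illustration.

```python
from fractions import Fraction
from math import comb

def transition_matrix(n_balls=5, draw_size=2):
    """Hypothetical helper: M[i][j] = P(j balls seen after a draw | i seen before)."""
    total = comb(n_balls, draw_size)
    M = [[Fraction(0)] * (n_balls + 1) for _ in range(n_balls + 1)]
    for seen in range(n_balls + 1):
        for new in range(draw_size + 1):
            if seen + new > n_balls:
                continue
            # Choose `new` unseen balls and `draw_size - new` already-seen balls.
            ways = comb(n_balls - seen, new) * comb(seen, draw_size - new)
            M[seen][seen + new] = Fraction(ways, total)
    return M

M = transition_matrix()
for row in M:
    print([float(x) for x in row])
```

With the default arguments this reproduces the six rows of the matrix above, and every row sums to 1.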

Let $p_0,p_1,\ldots,p_5$ be the probabilities of the states before a draw and $q_0,q_1,\ldots,q_5$ the probabilities after a draw. Then:

$$ \begin{bmatrix} p_0 & p_1 & p_2 & p_3 & p_4 & p_5 \end{bmatrix} M = \begin{bmatrix} q_0 & q_1 & q_2 & q_3 & q_4 & q_5 \end{bmatrix} $$

Given that the initial "probability" distribution of states (before any drawing) is $p_0 = 1$ and the rest zeros, it is not terribly hard to show that the distribution after three drawings is:

$$ \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} M^3 = \begin{bmatrix} 0.00 & 0.00 & 0.01 & 0.24 & 0.57 & 0.18 \end{bmatrix} $$

The final entry of this last probability distribution of states is our answer.
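
The product with $M^3$ can be verified by applying the row-vector update three times. This is a minimal sketch in plain Python using exact fractions; the helper name `step` is an assumption made for illustration, and the matrix entries are copied from $M$ above in units of tenths.

```python
from fractions import Fraction

# Transition matrix M from the answer, entered in tenths to keep arithmetic exact.
tenths = [
    [0, 0, 10, 0, 0, 0],
    [0, 0, 4, 6, 0, 0],
    [0, 0, 1, 6, 3, 0],
    [0, 0, 0, 3, 6, 1],
    [0, 0, 0, 0, 6, 4],
    [0, 0, 0, 0, 0, 10],
]
M = [[Fraction(x, 10) for x in row] for row in tenths]

def step(dist, M):
    """One drawing: multiply the row vector `dist` by the matrix M."""
    return [sum(dist[i] * M[i][j] for i in range(len(dist)))
            for j in range(len(M[0]))]

dist = [Fraction(1)] + [Fraction(0)] * 5   # start in state B_0
for _ in range(3):                         # three drawings
    dist = step(dist, M)

print([float(p) for p in dist])   # [0.0, 0.0, 0.01, 0.24, 0.57, 0.18]
```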

André Nicolas · Answer 2 · 2016-02-01T03:01:48.490

The following approach is feasible only because the numbers are very small.

Without loss of generality we may assume that on the first draw we got two specific balls, say #4 and #5. Then our desired event can happen in two ways: (i) We get two new balls in Draw 2, and the missing one in Draw 3, or (ii) We get one of balls #4 or #5 and a new one in Draw 2, and the remaining ones needed in Draw 3.

Case (i): The probability of two new balls in Draw 2 is $\frac{\binom{3}{2}}{\binom{5}{2}}$. Given that we got two new in Draw 2, the probability we pick up the last one needed in Draw 3 is $\frac{4}{\binom{5}{2}}$.

Case (ii): The probability of one new and one old in Draw 2 is $\frac{(3)(2)}{\binom{5}{2}}$. Given this happened, the probability of two new in Draw 3 is $\frac{1}{\binom{5}{2}}$.

The rest is a short calculation.
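
For completeness, that calculation could be written out as follows, adding the probabilities of the two disjoint cases (i) and (ii) with $\binom{5}{2} = 10$:

$$ \frac{3}{10}\cdot\frac{4}{10} + \frac{6}{10}\cdot\frac{1}{10} = \frac{12 + 6}{100} = \frac{9}{50} = 0.18 $$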

edited Feb 01 '16 at 03:01

answered Feb 01 '16 at 02:51

André Nicolas

Nice solution. My approach looks at forcing the bad case instead, but follows similar reasoning. – Paul Loach Feb 01 '16 at 03:11

The "cases" approach works fine here. But, as I vaguely mentioned in the first sentence, it does not generalize nicely. If we had larger numbers, and more draws, things would get out of hand. – André Nicolas Feb 01 '16 at 03:16
