
I am randomly drawing numbers from 1 to 10 (each number with equal probability and with repetition) and I do this until I have drawn each number at least once. As I have found out in the meantime, thanks to this website, this is called the coupon collector's problem.

Specifically, I am looking for the formula for the probability distribution of how many of the numbers have appeared after 10 draws. The extremes of 1 (a single number came up 10 times) and 10 (each number came up exactly once) are extremely unlikely, but how do I calculate the probabilities for all 10 outcomes? So n=10 and I am looking for the probability of k=1, k=2, ..., k=10.

Gecko
  • 123

1 Answer


Let us make it so that your nth attempt is your last attempt. Clearly, $n \geq 10$. If n is 10, the probability of it being your last attempt is $\frac{10!}{10^{10}}$.
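To give a sense of scale, the n = 10 case can be evaluated directly (plain arithmetic, nothing assumed beyond the expression above):

$$\frac{10!}{10^{10}} = \frac{3628800}{10000000000} = 0.00036288$$

So finishing in exactly 10 attempts happens roughly once in 2756 tries.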

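If n > 10, then after the first 10 attempts you are in one of the following 9 cases: 1 distinct, 2 distinct, 3 distinct, ..., 9 distinct numbers. In general, the probability of having exactly k distinct numbers after those 10 attempts is $\binom{10}{k} \cdot k! \cdot S(10,k) \cdot \frac{1}{10^{10}}$, where $S(10,k)$ is the Stirling number of the second kind: choose which k numbers appear, then count the $k! \, S(10,k)$ ways to assign the 10 attempts to them so that each of the k appears at least once.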
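As a numerical check on these case probabilities, here is a minimal sketch in Python. It does not rely on a closed form. Instead it steps through the draws one at a time, tracking the probability of holding exactly k distinct numbers. The function name `distinct_count_distribution` and its parameters are my own choices for illustration, and exact fractions are used so nothing is lost to rounding.

```python
from fractions import Fraction

def distinct_count_distribution(n_values=10, n_draws=10):
    """Return {k: P(exactly k distinct values seen after n_draws draws)}."""
    # dist[k] = probability of having seen exactly k distinct values so far
    dist = [Fraction(0)] * (n_values + 1)
    dist[0] = Fraction(1)  # before any draw, nothing has been seen
    for _ in range(n_draws):
        new = [Fraction(0)] * (n_values + 1)
        for k, p in enumerate(dist):
            if p == 0:
                continue
            # The draw repeats one of the k values already seen.
            new[k] += p * Fraction(k, n_values)
            # The draw is one of the n_values - k values not yet seen.
            if k < n_values:
                new[k + 1] += p * Fraction(n_values - k, n_values)
        dist = new
    return {k: dist[k] for k in range(1, n_values + 1)}

if __name__ == "__main__":
    for k, p in distinct_count_distribution().items():
        print(k, float(p))
```

The ten printed values sum to 1, and the k = 10 entry agrees with $\frac{10!}{10^{10}}$.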

From k distinct numbers, the probability of making progress, that is, of choosing something outside of the k already chosen, is $$\frac{10-k}{10}$$ After n - 1 attempts, you must have exactly 9 distinct numbers. So you have made progress 10 - k - 1 times in the n - 11 attempts that follow the first 10. The nth attempt must then be the (10 - k)th and final progress step, so that you are done after n attempts.

Let’s define: $$g(h) = \frac{10-h}{10}$$ Take the product of g(h) as h goes from k to 9; this is the probability of the 10 - k progress steps. This product, times the probabilities of the steps where you make no progress, gives the final probability. If you made progress 10 - k times in n - 10 steps, then you made no progress in n - 20 + k steps. Notice that each of those steps may have a different probability: a no-progress step taken while you hold h distinct numbers has probability 1 - g(h) = h/10, where h can be anything from k to 9. So there are 9 - k + 1 = 10 - k levels at which a no-progress step can occur.

Because the number of distinct values never decreases, an outcome is fixed by how many no-progress steps $m_h$ fall at each level h, with $m_k + \dots + m_9 = n - 20 + k$. There are $\binom{n-11}{9-k}$ such outcomes, and each one contributes its own factor $\prod_{h=k}^{9} (1 - g(h))^{m_h}$ to the overall probability. I believe the rest can be calculated once you have a value for n.
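For a concrete n, the bookkeeping above can be avoided by letting a computer track the number of distinct values draw by draw. The following is a minimal sketch in Python under that approach, not a closed-form solution. The name `last_draw_probability` is hypothetical, and the progress probability used at each level h is the g(h) defined above.

```python
from fractions import Fraction

def last_draw_probability(n, n_values=10):
    """P(the n-th draw is the first one at which all n_values values have been seen)."""
    if n < n_values:
        return Fraction(0)
    # state[h] = P(exactly h distinct values seen and not yet finished)
    state = [Fraction(0)] * (n_values + 1)
    state[0] = Fraction(1)
    for _ in range(n - 1):
        nxt = [Fraction(0)] * (n_values + 1)
        # Only unfinished levels 0..n_values-1 are propagated, so paths that
        # finish early are dropped instead of being counted again later.
        for h in range(n_values):
            p = state[h]
            if p:
                nxt[h] += p * Fraction(h, n_values)                 # no progress: 1 - g(h)
                nxt[h + 1] += p * Fraction(n_values - h, n_values)  # progress: g(h)
        state = nxt
    # After n - 1 draws exactly n_values - 1 values are seen; draw n supplies the last one.
    return state[n_values - 1] * Fraction(1, n_values)

if __name__ == "__main__":
    for n in range(10, 16):
        print(n, float(last_draw_probability(n)))
```

For n = 10 this reproduces $\frac{10!}{10^{10}}$, and summing the values over all n ≥ 10 approaches 1.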