
Say there are N types of coupons. Each time we obtain a new coupon, it has an equal probability of being any one of these types. I am interested in finding the probability that it takes n coupons to receive all types.

My approach: We require the first n-1 coupons to have exactly N-1 types, with the Nth type received on the nth coupon. To find the probability of such an event, I believe we have N possible "missing" coupon types from the first n-1 coupons. We can then multiply by the probability $$P(C_1 C_2 \dots C_{N-1})$$ where $C_i$ represents the event that there is at least one coupon of type i in the first n-1 coupons. I wanted to calculate this probability using the inclusion-exclusion principle:

$$1 - P(C_1' \cup C_2' \cup \dots \cup C_N')$$

$$ = 1 - \left[ \binom{N-1}{1} \left( \frac{N-2}{N} \right)^{n-1} - \binom{N-1}{2} \left( \frac{N-3}{N} \right)^{n-1} + \dots + (-1)^{N-1} \left( \frac{1}{N} \right)^{n-1} \binom{N-1}{N-2} \right] $$

$$= 1 - \sum_{i=1}^{N-2} \binom{N-1}{i} \left( \frac{N-i-1}{N} \right)^{n-1} (-1)^{i+1}$$

The expression starts with $$\left( \frac{N-2}{N} \right)^{n-1}$$, because we cannot choose the last coupon type as well.

Finally, I compute $$(1 - P(C_1' \cup C_2' \cup \dots \cup C_N')) * N * 1/N$$, where the N again represents the possible missing coupons, and 1/N is the probability that the last coupon is the missing one.

I am getting an incorrect numerical answer with this approach. Could someone help by pointing out exactly what is missing or wrong in my answer? Thanks!

RobPratt
Ari1029
  • When you see my answer, you will notice that I did not attempt to directly diagnose your work. Instead, I simply provided long-winded step-by-step analysis for your review. Contrast my style with yours. It seems to me that if you want a reviewer to directly diagnose your work, then you should adopt a style that is easier to digest. – user2661923 Nov 03 '24 at 22:51

1 Answer


I will follow in the OP's (i.e. original poster's) footsteps, because I regard their approach as both valid and viable. So, the OP's question will be answered by having the OP compare their work with my step-by-step work.

Index the coupons Coupon-1, Coupon-2, ..., Coupon-N.

Since any of the coupons can be the last coupon taken, reserve the factor of $~\displaystyle \binom{N}{1},~$ and then assume that, in the first $~(n-1)~$ coupons, all coupons except Coupon-N have been acquired.

So, now with the factor of $~\displaystyle \binom{N}{1}~$ reserved, you are computing the probability that all of the following events have occurred.

  • None of the first $~n-1~$ coupons obtained are Coupon-N.
    The probability of this occurring is $~\displaystyle \left[ ~\frac{N-1}{N} ~\right]^{n-1}.~$

  • The $~n$-th coupon obtained is Coupon-N.
    The probability of this occurring is $~\dfrac{1}{N}.~$

  • Given that none of the first $~n-1~$ coupons obtained are Coupon-N, each of Coupon-1, Coupon-2, ..., Coupon-(N-1) was obtained among the first $~n-1~$ coupons. I will express this conditional probability as $~\dfrac{A}{B},~$ which will be explained later.

Therefore, the final computation will be

$$\binom{N}{1} \times \left[ ~\frac{N-1}{N} ~\right]^{n-1} \times \frac{1}{N} \times \frac{A}{B}. \tag1 $$

So, the problem reduces to computing the fraction $~\dfrac{A}{B}.$


Inclusion-Exclusion is certainly feasible here. See this article for an introduction to Inclusion-Exclusion. Then, see this answer for an explanation of and justification for the Inclusion-Exclusion formula.

Following the syntax in the second link, directly above:

  • Let $~S~$ denote the set of all possible (equally likely) ways of obtaining $~(n-1)~$ coupons, none of which is Coupon-N.

  • For $~k \in \{ ~1,2,\cdots,N-1\},~$
    let $~S_k~$ denote the subset of $~S~$ that represents all possible ways of obtaining $~(n-1)~$ coupons, where Coupon-k is never obtained.

Then, the computation of $~\dfrac{A}{B}~$ may be represented by:

$$A = |~S~| - | ~S_1 \cup S_2 \cup \cdots \cup S_{N-1} ~|, ~~~~B = (N-1)^{n-1}. \tag2 $$

So, the entire problem reduces to computing $~A.~$


Let $~T_0~$ denote $~\displaystyle | ~S ~| \implies T_0 = (N-1)^{n-1}.$

Let $~T_1~$ denote $\displaystyle \sum_{1 \leq i_1 \leq N-1} | ~S_{i_1} ~|.~$
That is, $~T_1~$ represents the sum of $~\displaystyle \binom{N-1}{1}~$ terms.
By considerations of symmetry, each term equals $~([N-1]-1)^{n-1}.~$
Therefore, $~\displaystyle T_1 = \binom{N-1}{1} \times ([N-1]-1)^{n-1}.$

For $~r \in \{2,3,\cdots,N-1\},~$
let $~T_r~$ denote $\displaystyle \sum_{1 \leq i_1 < i_2 < \cdots < i_r \leq N-1} | ~S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_r} ~|.~$
That is, $~T_r~$ represents the sum of $~\displaystyle \binom{N-1}{r}~$ terms.
By considerations of symmetry, each term equals $~([N-1]-r)^{n-1}.~$
Therefore, $~\displaystyle T_r = \binom{N-1}{r} \times ([N-1]-r)^{n-1}.$
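As a small worked instance of these counts, take $~N = 4~$ and $~n = 5~$ (values chosen purely for illustration; they do not come from the OP's question). Then $~N-1 = 3~$ and $~n-1 = 4,~$ so

$$T_0 = 3^4 = 81, ~~~~T_1 = \binom{3}{1} \times 2^4 = 48, ~~~~T_2 = \binom{3}{2} \times 1^4 = 3, ~~~~T_3 = \binom{3}{3} \times 0^4 = 0.$$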

Then, by Inclusion-Exclusion theory, the expression for $~A~$ shown in (2) above is equivalent to

$$\sum_{r=0}^{N-1} (-1)^r T_r = \sum_{r=0}^{N-1} (-1)^r \left[ ~\binom{N-1}{r} \times ([N-1]-r)^{n-1} ~\right]. \tag3 $$
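One way to sanity-check (3) is to compare it against direct enumeration for small inputs. The following is only a minimal Python sketch: the function names `count_A` and `count_A_brute` are my own hypothetical labels, and the brute-force version is feasible only when $~(N-1)^{n-1}~$ is small.

```python
from itertools import product
from math import comb

def count_A(N, n):
    """Alternating sum from (3): number of sequences of n-1 coupons,
    drawn from the N-1 types other than Coupon-N, in which every one
    of those N-1 types appears at least once."""
    m = N - 1
    return sum((-1) ** r * comb(m, r) * (m - r) ** (n - 1) for r in range(m + 1))

def count_A_brute(N, n):
    """Same count by enumerating all (N-1)^(n-1) sequences directly."""
    m = N - 1
    return sum(1 for seq in product(range(m), repeat=n - 1) if len(set(seq)) == m)

# Compare the two counts on a few small cases (illustrative values only).
for N, n in [(3, 4), (4, 5), (4, 7)]:
    print(N, n, count_A(N, n), count_A_brute(N, n))
```

For example, with $~N = 4, ~n = 5,~$ both functions return $~81 - 48 + 3 - 0 = 36.$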


Putting (1), (2), and (3) above together, the final computation is

$$\binom{N}{1} \times \left[ ~\frac{N-1}{N} ~\right]^{n-1} \times \frac{1}{N} \times \frac{1}{(N-1)^{n-1}}$$

$$\times \sum_{r=0}^{N-1} (-1)^r \left[ ~\binom{N-1}{r} \times ([N-1]-r)^{n-1} ~\right]$$

$$= \left[ ~\frac{1}{N} ~\right]^{n-1} \times \sum_{r=0}^{N-1} (-1)^r \left[ ~\binom{N-1}{r} \times ([N-1]-r)^{n-1} ~\right].$$
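For the OP's numerical comparison, the closed form above can be evaluated exactly and checked against brute-force enumeration of all $~N^n~$ equally likely coupon sequences. This is a hedged sketch rather than a definitive implementation: the names `prob_exactly_n` and `prob_exactly_n_brute` are hypothetical, and the enumeration is practical only for small $~N~$ and $~n.$

```python
from fractions import Fraction
from itertools import product
from math import comb

def prob_exactly_n(N, n):
    """Closed form: (1/N)^(n-1) * sum_r (-1)^r C(N-1, r) (N-1-r)^(n-1)."""
    m = N - 1
    A = sum((-1) ** r * comb(m, r) * (m - r) ** (n - 1) for r in range(m + 1))
    return Fraction(A, N ** (n - 1))

def prob_exactly_n_brute(N, n):
    """Enumerate all N^n sequences; count those in which all N types are
    present after n coupons but only N-1 types after the first n-1."""
    hits = 0
    for seq in product(range(N), repeat=n):
        if len(set(seq)) == N and len(set(seq[:-1])) == N - 1:
            hits += 1
    return Fraction(hits, N ** n)

# Compare the two on small cases (illustrative values only).
for N, n in [(2, 2), (3, 3), (3, 4), (4, 6)]:
    print(N, n, prob_exactly_n(N, n), prob_exactly_n_brute(N, n))
```

For instance, `prob_exactly_n(3, 4)` evaluates to `Fraction(2, 9)`, which matches the count of $~18~$ favorable sequences out of $~3^4 = 81.$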

user2661923