
Given 4 different items, each with a different chance of being selected, select an item, replace it, and select another.

What is the expected number of selections required to select at least one of each item?

Through some research, this appears to be a variation of the coupon collector's problem where each of my "coupons" doesn't have the same probability.

From the final formula in this post, I deduced that the average number of draws for $n$ equally likely items is given by:

$$ n\sum\limits_{j=1}^n {1\over j} $$
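As a quick check of this formula for the equal-probability case, here is a minimal Python sketch. The function name `expected_draws_equal` is purely illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def expected_draws_equal(n: int) -> Fraction:
    """Expected draws to see all n equally likely items: n * (1 + 1/2 + ... + 1/n)."""
    return n * sum(Fraction(1, j) for j in range(1, n + 1))

print(expected_draws_equal(4))         # 25/3
print(float(expected_draws_equal(4)))  # about 8.33
```

For $n = 4$ this gives $25/3 \approx 8.33$, the reference value used for testing further down.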

How can I amend this for items of unequal probability of selection please?

From the partial answer from @John:

$p_1 = 1$
$p_2 = p_a (p_b + p_c + p_d) + p_b (p_c + p_d + p_a) + p_c (p_d + p_a + p_b) + p_d (p_a + p_b + p_c)$

Am I then right in thinking $p_3$ would be $ p_ap_b(p_c + p_d) + p_ap_c(p_b + p_d) + p_ap_d(p_b + p_c) + p_bp_c(p_a + p_d) + p_bp_d(p_a + p_c) + p_cp_d(p_a + p_b) $

and $p_4$ would be $ p_ap_bp_c(p_d) + p_ap_bp_d(p_c) + p_ap_cp_d(p_b) + p_bp_cp_d(p_a) $, which simplifies to $4(p_ap_bp_cp_d)$?


I put these values into a spreadsheet, but there's something I've done wrong. For the purpose of testing, I gave my coupons equal probability where I know the result should be 8.33. However, I'm getting the result 71.67 and can't see what is wrong. Any suggestions please?
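The spreadsheet figure can be reproduced in a few lines of Python. This is only a sketch: it assumes the $p_k$ expressions exactly as written above (for every set of $k-1$ items, the product of their probabilities times the total probability of the remaining items), and `stage_probability` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def stage_probability(p, k):
    """p_k as written above: sum over every set of k-1 items of
    (product of their probabilities) * (total probability of the other items)."""
    n = len(p)
    total = Fraction(0)
    for held in combinations(range(n), k - 1):
        rest = sum(p[i] for i in range(n) if i not in held)
        total += prod(p[i] for i in held) * rest
    return total

p = [Fraction(1, 4)] * 4                      # equal probabilities for testing
stages = [stage_probability(p, k) for k in range(1, 5)]
total = sum(1 / q for q in stages)            # 1/p_1 + 1/p_2 + 1/p_3 + 1/p_4

print([str(q) for q in stages])               # ['1', '3/4', '3/16', '1/64']
print(float(total))                           # 215/3, about 71.67
```

So the arithmetic itself matches the 71.67 result; the gap to 8.33 comes from the expressions. With equal probabilities, the chance of drawing a new item while holding two distinct items is $2/4$, and while holding three it is $1/4$, whereas the expressions above evaluate to $3/16$ and $1/64$.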


RobPratt
James Webster

3 Answers


$\def\m{\boldsymbol\mu}\def\p{\mathbf p}$

The expected value in question is: $$ \mathbb E=\sum_i\frac1{p_i}-\sum_{i<j}\frac1{p_i+p_j}+\sum_{i<j<k}\frac1{p_i+p_j+p_k}-\cdots-\frac{(-1)^n}{\underbrace{p_1+p_2+\cdots+p_n}_{=1}}.\tag1 $$

It can be written in compact notation as: $$ \mathbb E=\sum_{\m\ne0}\frac{(-1)^{|\m|-1}}{\m\cdot\p},\tag2 $$ where $\m$ ranges over the $2^n$ binary vectors of length $n$, $|\m|=\sum_{i=1}^n\mu_i$, and $\p=(p_1,p_2,\dots, p_n)$. The summation in (2) runs over all $\m$ except $(0,0,\dots,0)$.
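Formula (2) is straightforward to evaluate numerically. Below is a minimal Python sketch, not a definitive implementation: the function name `expected_draws` is illustrative, each non-empty subset of items plays the role of one non-zero binary vector, and the probabilities are assumed to sum to 1.

```python
from fractions import Fraction
from itertools import combinations

def expected_draws(p):
    """Inclusion-exclusion sum: over every non-empty subset S of the items,
    add (-1)^(|S|-1) / (sum of the probabilities in S)."""
    total = 0
    for k in range(1, len(p) + 1):
        sign = 1 if k % 2 == 1 else -1        # (-1)^(k-1)
        for subset in combinations(p, k):
            total += sign / sum(subset)
    return total

# Equal probabilities recover the classic coupon-collector value.
print(expected_draws([Fraction(1, 4)] * 4))   # 25/3

# Unequal probabilities (arbitrary illustrative values summing to 1).
print(float(expected_draws([0.4, 0.3, 0.2, 0.1])))
```

The loop visits $2^n - 1$ subsets, which is trivial for four items but grows quickly with $n$.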

user

Notation:

$E_{ab|abcd}$ is the expected time needed to collect items $a$ and $b$ from the set of items $a, b, c, d$, assuming that you have already collected items $c$ and $d$.

$$E_{abcd|abcd} = 1 + p_aE_{bcd|abcd} + p_bE_{acd|abcd} + p_cE_{abd|abcd} + p_dE_{abc|abcd}$$

$$\begin{aligned}E_{abc|abcd} &= 1+p_aE_{bc|abcd} + p_bE_{ac|abcd} + p_cE_{ab|abcd} + p_dE_{abc|abcd} \implies \\ E_{abc|abcd} &= \frac{1}{1-p_d}\left(1+p_aE_{bc|abcd} + p_bE_{ac|abcd} + p_cE_{ab|abcd}\right)\end{aligned}$$

Here the $E_{abc|abcd}$ on the right-hand side of the first line covers the case where you pick item $d$, which you already have: the draw adds 1 to the expected number of moves and leaves you in the same state.

$$\begin{aligned}E_{ab|abcd} &= 1+ p_aE_{b|abcd} + p_bE_{a|abcd} + (p_d+p_c)E_{ab|abcd} \implies \\ E_{ab|abcd} &= \frac{1}{1-p_d - p_c}\left(1 + p_aE_{b|abcd} + p_bE_{a|abcd}\right)\end{aligned}$$

$$E_{b|abcd} = 1/p_b$$

You get the idea; I don't know how to simplify this expression.
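Even without a closed form, the recursion can be evaluated directly. Here is a minimal Python sketch under a few assumptions: the set of items already collected is stored as a bitmask, the probabilities sum to 1 (so $1 - \sum_{\text{collected}} p_i$ equals the total probability of the missing items), and the names `expected_draws_recursive` and `e` are illustrative.

```python
from fractions import Fraction
from functools import lru_cache

def expected_draws_recursive(p):
    """Expected draws to collect every item, via the state recursion above."""
    n = len(p)
    full = (1 << n) - 1                        # bitmask with every item collected

    @lru_cache(maxsize=None)
    def e(collected):
        if collected == full:
            return Fraction(0)                 # nothing left to collect
        missing = [i for i in range(n) if not collected & (1 << i)]
        p_missing = sum(p[i] for i in missing) # equals 1 - P(drawing a duplicate)
        onward = sum(p[i] * e(collected | (1 << i)) for i in missing)
        return (1 + onward) / p_missing

    return e(0)

print(expected_draws_recursive([Fraction(1, 4)] * 4))   # 25/3
```

With three items already held, the recursion reduces to $1/p_b$ for the one missing item, matching $E_{b|abcd}$ above.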

quester

Take it back to the basics as stated in the Wikipedia article.

You only have four things to pick from, so you can calculate each step explicitly. This equation follows the notation in the article: $T$ is the time to collect all of the items, and $t_i$ is the time to collect the $i$-th item after $i - 1$ items have been collected.

$$E(T) = E(t_1) + E(t_2) + E(t_3) + E(t_4) = {p_1}^{-1} + {p_2}^{-1} + {p_3}^{-1} + {p_4}^{-1}.$$

The probability $p_1$ of picking a new one if you have picked none yet is $1$.

The probability $p_2$ of picking a new one if you have picked one depends on which one you picked first. Let's call the items $A$ through $D$ and the probabilities of picking each item if they're all in the box $p_a$ through $p_d$.

Then $p_2$ would be $$p_a (p_b + p_c + p_d) + p_b (p_c + p_d + p_a) + p_c (p_d + p_a + p_b) + p_d (p_a + p_b + p_c).$$

The above expression is the probability that $A$ was picked first and a new item is picked next, plus the probability that $B$ was picked first and a new item is picked next, and so on.

The expression for $p_3$ will have six terms in the sum; one of these will be $p_a p_b (p_c + p_d)$. The expression for $p_4$ will have four terms in the sum (actually, it's the same term four times!).

John