3

Given an urn with $M$ unique balls, how many times do I need to draw with replacement before the probability that I have seen each ball at least once is greater than $\epsilon$?

nispio

3 Answers

1

Any sequence of $n$ drawings results in a word $w$ of length $n$ over the alphabet $[M]$. There are $M^n$ such words.

How many of these words $w$ are admissible, meaning that $w$ contains each letter $\ell\in [M]$ at least once? Any admissible word can be fabricated in the following way: Choose a partition of $[n]$ (the set of positions for the $n$ letters to be written) into $M$ nonempty blocks and assign to each block one of the letters $1$, $2$, $\ldots$, $M$. There are $$\left\{\matrix{n\cr M\cr}\right\}\tag{1}$$ ways to choose this partition, where $(1)$ denotes a so-called Stirling number of the second kind, and there are $M!$ ways to assign a letter $\ell$ to each of the $M$ blocks. It follows that there are $M!\left\{\matrix{n\cr M\cr}\right\}$ admissible words. Therefore the probability $p_n$ that a random word of length $n$ is admissible is given by $$p_n=\left\{\matrix{n\cr M\cr}\right\}{M!\over M^n}\ .$$ In order to determine the minimal $n$ for which $p_n>\epsilon$ one would need estimates for the Stirling numbers $\left\{\matrix{n\cr M\cr}\right\}$.
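For small $M$ the minimal $n$ can also be found numerically instead of through estimates. The following is a minimal sketch, not part of the original argument: it builds the Stirling numbers with the standard recurrence $\left\{\matrix{n+1\cr k\cr}\right\}=k\left\{\matrix{n\cr k\cr}\right\}+\left\{\matrix{n\cr k-1\cr}\right\}$ and returns the first $n$ with $p_n>\epsilon$. The function name `min_draws` is made up for illustration, and exact rational arithmetic is a design choice to avoid rounding issues.

```python
from fractions import Fraction
from math import factorial


def min_draws(M, eps):
    """Smallest n with p_n = S(n, M) * M! / M**n > eps (hypothetical helper)."""
    if M < 1 or not 0 <= eps < 1:
        raise ValueError("need M >= 1 and 0 <= eps < 1")
    # row[k] holds the Stirling number S(n, k) for the current n; start at n = 0.
    row = [1] + [0] * M
    n = 0
    while True:
        p_n = Fraction(row[M] * factorial(M), M ** n)
        if p_n > eps:
            return n, p_n
        # Recurrence: S(n+1, k) = k * S(n, k) + S(n, k-1), and S(n+1, 0) = 0.
        row = [0] + [k * row[k] + row[k - 1] for k in range(1, M + 1)]
        n += 1


print(min_draws(3, 0.5))  # (5, Fraction(50, 81))
```

Since $p_n$ increases with $n$ and tends to $1$, the loop terminates for every $\epsilon<1$, although the integers involved grow quickly for large $M$.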

  • Can I assume that the two occurrences of $r$ in the solution should actually be $M$? – nispio Aug 20 '13 at 19:49
  • Thank you. This is the answer that I have been looking for. I had done enough exploring to decide that there was a recursive relationship that also had an alternating sign $(-1)^n$. Looking back at my notes, I had independently discovered Stirling Numbers of the Second Kind, without knowing that such a thing existed. The only reason that I had not posted my solution here yet was that it was ugly, and I had not sufficiently proven the result to myself. – nispio Aug 20 '13 at 20:20
  • @nispio: Of course it's $M$; the $r$ came from an earlier version of my solution. – Christian Blatter Aug 21 '13 at 05:18
0

You can see this as a combinatorial problem.

There are $M^k$ possible runs of $k$ draws, each with the same probability (the draws are independent).

A run has seen each ball at least once if we can choose $M$ indices at which the balls are all different. There are ${k\choose M}$ possible sets of such indices, and for each one there are $M!$ ways to arrange the balls at those indices.

Hence the number of runs of length $k$ that have seen all the balls at least once is $${k\choose M}M!$$ It follows that the probability of such a run is $0$ for $k<M$, and for $k\geq M$: $$P(X=k)=\frac{{k\choose M}M!}{M^k}=\frac{\frac{k!}{M!(k-M)!}M!}{M^k}=\frac{k!}{M^k(k-M)!}$$ So if you want $P(X=k)\geq \epsilon$ you have to solve $$\frac{k!}{M^k(k-M)!}\geq\epsilon$$ It should be doable :)

I hope it helped.

wece
  • It seems like "k choose M" will give me the number of ways to choose the positions of the M unique balls within a string of k draws, and that M! gives all of the permutations within those M positions, but what of the other balls? For example, given M=3, k=5, is not the string "1 2 3 1 1" different than "1 2 3 2 2"? My three unique balls stay in the same positions, but the other positions are filled differently, meaning that it represents a different outcome that is not accounted for. – nispio Mar 21 '13 at 17:48
  • You are right, but if we count "1 2 3 1 1" as different from "1 2 3 2 2" we will count too much, because "1 2 3 1 1" will also be counted for "_ 2 3 1 " and " 2 3 _ 1". I'm not really sure how to justify that (it may be false, but I think it's right). The way I see it, it is kind of like in probability: if your property is $a\vee b$, once you have $a$ you don't care about the probability of $b$. I'll try to figure out an actual proof of this. Did you try this result on some example? (I did on very simple ones and it seemed right.) – wece Mar 22 '13 at 14:22
  • And I messed up the $k^M$; it is of course $M^k$. Edited. – wece Mar 22 '13 at 14:23
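The example raised in these comments ($M=3$, $k=5$) is small enough to check by brute force. The sketch below is an illustration added for that purpose, with a made-up function name `count_complete_runs`: it enumerates all $M^k$ runs, counts those in which every ball appears, and prints that count next to the closed form ${k\choose M}M!$ and the total $M^k$.

```python
from itertools import product
from math import comb, factorial


def count_complete_runs(M, k):
    """Count length-k sequences over balls 1..M in which every ball appears."""
    balls = range(1, M + 1)
    return sum(1 for run in product(balls, repeat=k) if len(set(run)) == M)


M, k = 3, 5
brute = count_complete_runs(M, k)        # direct enumeration
closed_form = comb(k, M) * factorial(M)  # the count proposed in the answer
print(brute, closed_form, M ** k)  # 150 60 243
```

For this case the enumeration gives 150 runs out of 243, while the closed form gives 60, so strings such as "1 2 3 1 1" and "1 2 3 2 2" do need to be counted separately.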
0

Order the $M$ balls from $1$ to $M$ and suppose you have drawn $n$ balls with replacement. Among the $n$ balls, let $x_1$ be the number of times ball $1$ appears, $x_2$ the number of times ball $2$ appears, $\ldots$, and $x_M$ the number of times ball $M$ appears. Then $$ x_1+x_2+\cdots +x_M=n $$ You want to see each ball at least once, which is the restriction $$ x_i\ge 1\quad\text{for } 1\le i\le M $$ We have ${n-M \choose M-1}$ different choices, so I claim: $$ P(\text{the desired event})={n-M \choose M-1}\cdot \frac{1}{M^n} $$

Coiacy
  • Can you share your justification of "n-M choose M-1"? I would guess that you are positing that if each of the M balls has been represented exactly once, there are still n-M positions to fill. However, this would not take ordering into account, whereas M^n does take order into account. Can you clarify? – nispio Mar 21 '13 at 18:01
  • @nispio Sorry, you are quite right that I made a big mistake here. You need to refer to combinatorics, and I'm sure this can be solved using theorems in that area. – Coiacy Mar 22 '13 at 04:04