Mean number of unique choices given n people choosing randomly from a set of N elements

Question

My question is similar to the birthday problem, but I can't seem to find a simple solution. In general form: given $n$ people who each choose one element at random from a set of $N$ elements (so each element is chosen with probability $p = 1/N$), what is the expected number (mean $\mu_x = E(x)$) of unique destinations that will be chosen, provided an infinite number of trials?

For a specific example, imagine a skyscraper with $N$ floors and an elevator on floor 0 filled to its capacity of $n$ people. The question, then, would be how many different stops, on average, the elevator will have to make on its trip to the $N^{th}$ floor, if it stops at every floor where at least one person needs to get out.

From solutions of the birthday problem, I know that on average $n(1-\frac{1}{N})^{n-1}$ people have a destination that nobody else chose, and $n(1-(1-\frac{1}{N})^{n-1})$ people share a destination with someone else. From this, shouldn't the expected number of unique destinations be $n(1-\frac{1}{N})^{n-1} + n(1-(1-\frac{1}{N})^{n-1})\cdot(\text{mean number of people who have chosen the same destination, given that at least 2 people have chosen it})$? The problem (for me at least) is determining this last term.

Thanks!

I don't know what a "unique" destination is, but I think what you want to know is the mean number of *different* choices that will be made by $n$ people choosing randomly from a set of $N$ elements. Did I guess right? — bof, Jan 20 '16 at 08:43
And what is that stuff about "provided an infinite number of trials"?? In an infinite number of trials, *all $N$* of the destinations will be chosen. — bof, Jan 20 '16 at 08:45
Anyway, if the question means what I think it means, then the probability that a particular destination gets chosen is $1-(1-\frac1N)^n$ so the expected number of chosen destinations is $N[1-(1-\frac1N)^n],$ I think. — bof, Jan 20 '16 at 08:49
Regarding your first comment, the phrase "unique" means the total number of different destinations, as multiple people can choose the same destination (i.e. with replacement). The "infinite trials" doesn't refer to an infinite $n$ people choosing, but rather infinite instances of a group of $n$ people making their random choices. While the distribution of choices varies from sample to sample, I'm looking for the theoretical value here. I think you've interpreted the problem correctly, but could you elaborate more on how you got your answer? I would think that the mean would be $np$, not $Np$? — Ash Johnson, Jan 22 '16 at 04:39
In short, you are using the word "unique" to mean "different". This is unnecessary, because the word "different" exists, and inadvisable, because in mathematics the word "unique" is often used to mean "unique". For example, "the equation has a unique solution", meaning that it has just one solution. It would make no sense to ask, "how many unique solutions does the equation have"; if it has more than one, it's not unique. — bof, Jan 22 '16 at 06:14

score 2 · Accepted Answer · answered Jan 22 '16 at 06:38

Let's use the method of indicator variables.

Let $X$ be the number of different destinations that are chosen; $X$ is a random variable whose range of values is $\{0,1,\dots,N\}.$

For each $i\in\{1,\dots,N\},$ let $X_i$ be the random variable which takes the value $1$ if at least one of the $n$ people picks the $i^{\text{th}}$ destination, and $0$ if nobody picks it. Thus $X=X_1+\cdots+X_N.$

The probability that a given person, call him Joe, picks the $i^{\text{th}}$ destination is $\frac1N$; the probability that Joe does not pick the $i^{\text{th}}$ destination is $1-\frac1N$; the probability that nobody picks the $i^{\text{th}}$ destination is $(1-\frac1N)^n$; the probability that at least one person picks the $i^{\text{th}}$ destination, also known as $P(X_i=1),$ is $1-(1-\frac1N)^n.$ It follows that $$E(X_i)=0\cdot P(X_i=0)+1\cdot P(X_i=1)=P(X_i=1)=1-(1-\frac1N)^n$$ and $$E(X)=E(X_1+\cdots+X_N)=E(X_1)+\cdots+E(X_N)=N(1-(1-\frac1N)^n).$$
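
The formula $N(1-(1-\frac1N)^n)$ can be sanity-checked with a quick Monte Carlo simulation. The sketch below is not part of the original answer: the function names `expected_distinct` and `simulate_distinct`, the trial count, and the seed are illustrative assumptions, and it assumes each person chooses independently and uniformly.

```python
import random


def expected_distinct(n, N):
    """Closed form from the answer: N * (1 - (1 - 1/N)**n)."""
    return N * (1 - (1 - 1 / N) ** n)


def simulate_distinct(n, N, trials=100_000, seed=0):
    """Average number of distinct destinations over many simulated groups.

    Assumption: each of the n people picks one of the N destinations
    independently and uniformly at random.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # The set keeps one copy of each destination chosen by this group.
        chosen = {rng.randrange(N) for _ in range(n)}
        total += len(chosen)
    return total / trials


if __name__ == "__main__":
    n, N = 10, 20  # example values, chosen only for illustration
    print("closed form:", expected_distinct(n, N))
    print("simulation: ", simulate_distinct(n, N))
```

With enough trials the two printed values should agree to a couple of decimal places; the remaining gap is sampling noise, not a flaw in the formula.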

bof

This is exactly what I was looking for, and it checks out with simulated random choices as well. Thank you! – Ash Johnson Jan 25 '16 at 05:37
Since everyone has to choose some destination, the probability that all $X_i = 0$ simultaneously is zero. Your model seems to assign non-zero probability to this... or does it? – alexis Mar 19 '16 at 18:24
@alexis $X=0$ has probability zero, except when $n=0,$ and then it has probability one. I guess I used the word "range" incorrectly? – bof Mar 19 '16 at 23:25
I was just wondering if failing to incorporate this constraint in the model gives the wrong probabilities. I vaguely remember that indicator variables miraculously get around some such problems, but I'm not sure here so I thought I'd ask. – alexis Mar 19 '16 at 23:31
@alexis If you have a question, I'll try to answer it. Which of the probabilities stated in my answer do you have doubts about? – bof Mar 20 '16 at 01:10
I can't quite put my finger on it... but since the outcomes are not independent of one another, is $E(X)$ really equal to $\sum_i E(X_i)$? – alexis Mar 20 '16 at 09:47
@alexis Are you asking, is $E(X+Y)=E(X)+E(Y)$, even when $X$ and $Y$ are not independent variables? If that's your question, the answer is yes. – bof Mar 20 '16 at 09:58
Right, I think that bypasses the problem that was nagging at me. Thanks. – alexis Mar 20 '16 at 11:32
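
The linearity point raised in the comments above can be checked exactly on a small case by listing every outcome. This is only an illustrative sketch, not something from the thread: the helper name `exact_check` is hypothetical, and it assumes all $N^n$ assignments of people to destinations are equally likely. It computes $E(X)$ directly, each $E(X_i)$, and the probability that destinations 0 and 1 are both chosen, so dependence and additivity can be compared side by side.

```python
from fractions import Fraction
from itertools import product


def exact_check(n, N):
    """Enumerate all N**n equally likely outcomes (requires N >= 2).

    Returns (E(X), [E(X_0), ..., E(X_{N-1})], P(X_0 = 1 and X_1 = 1))
    as exact fractions. Destinations are labelled 0..N-1 here.
    """
    outcomes = list(product(range(N), repeat=n))
    total = len(outcomes)

    # E(X): mean number of different destinations over all outcomes.
    e_x = Fraction(sum(len(set(o)) for o in outcomes), total)

    # E(X_i) = P(destination i is chosen by at least one person).
    e_xi = [
        Fraction(sum(1 for o in outcomes if i in o), total)
        for i in range(N)
    ]

    # Joint probability that destinations 0 and 1 are both chosen.
    p_both = Fraction(sum(1 for o in outcomes if 0 in o and 1 in o), total)
    return e_x, e_xi, p_both


if __name__ == "__main__":
    e_x, e_xi, p_both = exact_check(n=2, N=3)
    print("E(X)            =", e_x)
    print("sum of E(X_i)   =", sum(e_xi))
    print("P(X_0=1, X_1=1) =", p_both)
    print("P(X_0=1)P(X_1=1)=", e_xi[0] * e_xi[1])
```

For $n=2$, $N=3$ the joint probability is $\frac29$ while the product of the marginals is $\frac{25}{81}$, so the indicators are not independent, yet $E(X)$ and $\sum_i E(X_i)$ both equal $\frac53$.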
