
As stated in the title, there are $z$ things to pick from, and you get $y$ picks, with replacement. What's the probability of picking such that you get at least one of each of $x$ things? Assume $x \leq y$ and $x \leq z$, and order doesn't matter. It seems like it should just be an extension of stars and bars or balls in boxes, but I'm having trouble getting it right.

Stars and bars seems like it would be choosing types for the remaining $y-x$ things, since $x$ are set, which means distributing $z-1$ bars among those $y-x$ things, giving $\binom{y-x+z-1}{z-1}$, for an overall probability of that over $z^y$, which I reduced to $\frac{(y-x+z-1)!}{(z-1)! (y-x)! z^y}$. However, as $y$ goes to infinity that seems to go to $\frac{y^z}{z^y}$, when intuitively it should go to 1.

For boxes I'm not sure of how to represent "at least 1 of each," and the inverse is not really that simple either.

Thanks!

  • Does it matter which $x$ items? That is, are there $x$ items specified in advance that you want to get, or do you just want to get any $x$ different items among your picks? – paw88789 May 11 '15 at 10:21
  • The $x$ items are specified in advance (would getting from there to any $x$ different items just be a factor of $\binom{z}{x}$?) – colblitz May 11 '15 at 13:25
  • See my comment following the answer of Valentin. – user2661923 Sep 12 '23 at 02:14
  • You're asking about a probability without specifying a distribution. Usually, when people don't specify a distribution, they intend to imply that something is distributed uniformly; but here it's not quite clear what's uniformly distributed. The first sentence sounds as if each pick is equally likely to pick any of the $z$ things; but then it's unclear what you mean by "order doesn't matter". The answer by Valentin seems to interpret this to mean that it's in fact the (unordered) multisets of items you picked that are equiprobable. Please clarify the question. – joriki Jan 13 '24 at 21:05
  • @joriki Interesting point. My (subjective) bias is that when the problem composer omits the probability that a specific item was selected, and there are $~z~$ items, then at each of the $~y~$ selections, there is a $~\dfrac{1}{z}~$ probability that a specific item was selected. In fact, this is the (implicit) assumption that I made in my answer. – user2661923 May 19 '24 at 07:00
  • @joriki I also assumed (perhaps wrongly) that the original poster's phrase "order doesn't matter" intends that picking thing $~T_1~$ first, and then picking thing $~T_2~$ should be construed as the same as picking thing $~T_2~$ first, and then picking thing $~T_1.$ I agree that it is unclear, especially since my interpretation makes the "order doesn't matter" intent moot, this being a probability problem rather than an enumeration problem. I chalk this up (again perhaps wrongly) to the original poster's inexperience with these types of problems. – user2661923 May 19 '24 at 07:07
  • To the Original Poster Assuming that you agree with my interpretation of the problem in the last two comments, note that in a $~\displaystyle \frac{N}{D}~$ combinatorics::probability problem, the problem solver is free to make whichever choice (order of selection is either relevant or irrelevant) that is the most convenient, as long as this choice is consistently applied when computing both $~N~$ and $~D.~$ Typically, in a problem like this, the most convenient assumption, which greatly simplifies the computation of $~D~$ is that order of selection does matter. ...see next comment – user2661923 May 19 '24 at 07:22
  • To the Original Poster This is the strategy that I used in my answer, which allowed the very simple expression of $~D=z^y.$ – user2661923 May 19 '24 at 07:31

2 Answers


Think about it in terms of balls and bins. You have to distribute $y$ identical balls among $z$ bins. $x$ predefined bins should contain at least one ball each (WLOG, the first $x$ bins). You can first put one ball into each of the first $x$ bins, and then distribute the remaining $y-x$ balls among all $z$ bins. As you know from stars and bars, the number of ways of doing this equals $\binom{z + y - x - 1}{z - 1}$. The total number of ways of distributing your balls is $\binom{z + y - 1}{z - 1}$. Assuming each distribution is equally likely, the answer is therefore $\binom{z + y - x - 1}{z - 1} / \binom{z + y - 1}{z - 1}$.

Valentin
  • I could be mistaken. It seems that you are assuming that each Stars and Bars solution is equally likely. I suspect that that assumption is inaccurate. This is why Stars and Bars, while a good tool for Combinatorics problems, is not typically appropriate for Probability problems. Again, I could be mistaken. – user2661923 Sep 12 '23 at 02:13
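The concern raised in this comment can be checked on a tiny case. The following is a minimal sketch, assuming the model in which each of the $y$ picks is independent and uniform over the $z$ things (an assumption discussed in the comments on the question, not something the question states outright). The function name `multiset_probabilities` is made up for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def multiset_probabilities(z: int, y: int) -> dict:
    """Probability of each unordered outcome (multiset) when y picks are
    made independently and uniformly from z things (assumed model)."""
    # Enumerate all z**y ordered pick sequences and group them by multiset.
    counts = Counter(tuple(sorted(seq)) for seq in product(range(z), repeat=y))
    total = z ** y
    return {multiset: Fraction(c, total) for multiset, c in counts.items()}


if __name__ == "__main__":
    for multiset, p in sorted(multiset_probabilities(2, 2).items()):
        print(multiset, p)
```

With $z = 2$ and $y = 2$ the three multisets have probabilities $1/4$, $1/2$ and $1/4$, so under this model the stars-and-bars outcomes are not equally likely.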

I will use Inclusion-Exclusion here. See this article for an introduction to Inclusion-Exclusion. Then, see this answer for an explanation of and justification for the Inclusion-Exclusion formula.

The answer will be

$$\frac{N}{D} ~: ~D = z^y. \tag1 $$

In (1) above, $~D = z^y~$ represents that you have $~y~$ decisions to make, and $~z~$ choices for each decision.

Index the things $~T_1, T_2, \cdots, T_x, T_{x+1}, \cdots, T_z.~$

So, the entire problem reduces to having $~N~$ denote the number of equally likely ways that each of $~T_1, \cdots, T_x~$ is chosen at least once.

Let $~S~$ denote the set of all sequences of $~y~$ picks, where each pick is one of the $~z~$ things. So $~|S| = z^y.$

For $~k \in \{1,2,\cdots,x\},~$ let $~S_k~$ denote the subset of $~S~$ where thing $~T_k~$ is not selected. For example, each element in $~S_1~$ represents $~y~$ selections of the $~z~$ things, where thing $~T_1~$ is not selected, and any of $~T_2, \cdots, T_x~$ may or may not have been selected.

Then the desired enumeration is

$$N = |S| - |S_1 \cup S_2 \cup \cdots \cup S_x|. \tag2 $$


Considerations of Symmetry are going to greatly simplify the application of Inclusion Exclusion here.

  • Let $~T_0 = |S| = z^y.$

  • Let $~T_1 = \sum_{1 \leq i_1 \leq x} |S_{i_1}|.$
    In other words, $~T_1~$ represents the sum of $~\displaystyle \binom{x}{1}~$ terms.
    By considerations of symmetry
    $~\displaystyle T_1 = \binom{x}{1} \times |S_1| = \binom{x}{1} \times (z-1)^y.~$
    That is, when computing $~|S_1|,~$ there are $~(z-1)~$ equally likely choices for each of the $~y~$ decisions.

  • For $~r \in \{2,3,\cdots,x\},~$
    let $~T_r = \sum_{1 \leq i_1 < i_2 < \cdots < i_r \leq x} |S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_r}|.$
    In other words, $~T_r~$ represents the sum of $~\displaystyle \binom{x}{r}~$ terms.
    By considerations of symmetry similar to the considerations in the computation of $~T_1,$
    $~\displaystyle T_r = \binom{x}{r} \times (z-r)^y.~$
    That is, there are $~\displaystyle \binom{x}{r}~$ terms in the computation, and each term represents that $~r~$ specific items from $~T_1, \cdots, T_x~$ were excluded. So, for each term, there are $~(z-r)~$ equally likely choices for each of the $~y~$ decisions.
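As a small illustration of these symmetry formulas, take $~x = 2,~ z = 3,~ y = 2~$ (values chosen here only as an example; they do not come from the question). Then

$$T_0 = 3^2 = 9, \qquad T_1 = \binom{2}{1} \times 2^2 = 8, \qquad T_2 = \binom{2}{2} \times 1^2 = 1.$$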

By Inclusion-Exclusion theory, the expression in (2) above is equivalent to

$$N = \sum_{r=0}^x (-1)^r T_r$$

$$= \sum_{r=0}^x (-1)^r \binom{x}{r} (z-r)^y.$$
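The formula for $~N~$ can be sanity-checked numerically. The following is a minimal sketch, assuming the uniform model behind $~D = z^y~$; the function names are made up for this illustration, and things are indexed from $0$ rather than $1$. It computes $~N/D~$ from the sum above and compares it with a brute-force count over all $~z^y~$ ordered pick sequences.

```python
from fractions import Fraction
from itertools import product
from math import comb


def prob_inclusion_exclusion(x: int, y: int, z: int) -> Fraction:
    """N / D with N = sum_{r=0}^{x} (-1)^r * C(x, r) * (z - r)^y and D = z^y."""
    n = sum((-1) ** r * comb(x, r) * (z - r) ** y for r in range(x + 1))
    return Fraction(n, z ** y)


def prob_brute_force(x: int, y: int, z: int) -> Fraction:
    """Same probability by enumerating all z**y ordered pick sequences."""
    targets = set(range(x))  # things 0..x-1 play the role of T_1..T_x
    hits = sum(1 for seq in product(range(z), repeat=y) if targets <= set(seq))
    return Fraction(hits, z ** y)


if __name__ == "__main__":
    print(prob_inclusion_exclusion(2, 2, 3))
    print(prob_brute_force(2, 2, 3))
```

For $~x = 2,~ y = 2,~ z = 3~$ both functions give $~2/9,~$ matching $~N = 9 - 8 + 1 = 2.~$ Because $~\left(\frac{z-r}{z}\right)^y \to 0~$ for every $~r \geq 1,~$ the ratio $~N/D~$ tends to $1$ as $~y \to \infty,~$ in line with the intuition stated in the question.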


$\underline{\text{Addendum}}$

See also the comments that I left, in response to the comment of joriki, following the posted question. These comments discuss the implicit assumptions that I made, in this answer.

user2661923