
Consider the standard balls into bins problem, with $n$ balls uniformly randomly thrown into $k$ bins.

What is the probability that exactly $b$ bins have at least $m$ balls?

This is a generalization of the standard problem where we want to know the probability that exactly $b$ bins have at least $1$ ball (are non-empty). I know that you can use Stirling numbers of the second kind to derive a closed form solution for that case. Is it possible to generalize this to the event of having at least $m$ balls?

ploosu2
  • 12,367

2 Answers


Hint/Approach: One way to extend the definition of Stirling numbers is the "associated Stirling numbers of the second kind", denoted by ${ n\brace k}_{\geq m},$ which count the partitions of $[n]$ into $k$ blocks, each one having at least $m$ elements. So, if you know that the formula involves Stirling numbers, you can most likely replace them with these associated numbers. You can easily check that they satisfy the following recursion $${n+1\brace k}_{\geq m} = \sum _{j = m-1}^n\binom{n}{j}{n-j\brace k-1}_{\geq m},$$ by considering the elements that are in the same block as the element $n+1$: if there are $j \geq m-1$ of them, the remaining $n-j$ elements form the other $k-1$ blocks.
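
The recursion can be turned into a short memoized function. This is only a sketch in plain Python, assuming $m \geq 1$; the name `assoc_stirling2` is made up for illustration, and the recursion is applied with $n$ shifted down by one so that the function takes the set size directly.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def assoc_stirling2(n, k, m):
    """Partitions of an n-set into k blocks, each of size >= m (assumes m >= 1)."""
    if k == 0:
        return 1 if n == 0 else 0   # only the empty set splits into zero blocks
    if n < k * m:
        return 0                    # too few elements to fill k blocks
    # The last element shares its block with j others (j >= m-1), chosen among
    # the remaining n-1 elements; the other n-1-j elements form k-1 blocks.
    return sum(comb(n - 1, j) * assoc_stirling2(n - 1 - j, k - 1, m)
               for j in range(m - 1, n))

print(assoc_stirling2(4, 2, 1))   # ordinary Stirling number S(4,2) = 7
print(assoc_stirling2(10, 4, 2))  # 9450
```

For $m=1$ this reduces to the ordinary Stirling numbers of the second kind.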

Phicar
  • 14,827
  • Do you know if associated Stirling numbers of the second kind have a closed form representation or if they can be represented in terms of normal Stirling numbers? – Kiran Gopinathan Dec 18 '19 at 02:41
  • Not really. You can express the associated numbers for $m$ in terms of those for $m+1$, so the dependency goes upward. I don't think there is a nice closed formula. By nice I mean not like adding all possible sizes with binomials. – Phicar Dec 18 '19 at 02:55

UPDATE 5.1.2025: Now I understand how to use the associated Stirling numbers. The answer is

$$ \frac{1}{k^n} \sum_{i=b}^k (-1)^{i-b} \binom{i}{b} \binom{k}{i} i! \sum_{l=im}^n \binom{n}{l} {l\brace i}_{\geq m} (k-i)^{n-l} $$
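
For what it's worth, here is a minimal plain-Python sketch of this formula with exact rational arithmetic, assuming $m \geq 1$ and $0 \leq b \leq k$. The function names `assoc_stirling2` and `prob_exactly` are made up for illustration, and the associated Stirling numbers are computed with the recursion from the other answer.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def assoc_stirling2(n, k, m):
    """Partitions of an n-set into k blocks, each of size >= m (assumes m >= 1)."""
    if k == 0:
        return 1 if n == 0 else 0
    if n < k * m:
        return 0
    return sum(comb(n - 1, j) * assoc_stirling2(n - 1 - j, k - 1, m)
               for j in range(m - 1, n))

def prob_exactly(n, k, b, m):
    """Probability that exactly b of the k bins receive at least m of the n balls."""
    total = 0
    for i in range(b, k + 1):
        # l balls go to i chosen bins (each getting >= m), the rest to the other k-i bins
        inner = sum(comb(n, l) * assoc_stirling2(l, i, m) * (k - i) ** (n - l)
                    for l in range(i * m, n + 1))
        total += (-1) ** (i - b) * comb(i, b) * comb(k, i) * factorial(i) * inner
    return Fraction(total, k ** n)

print(prob_exactly(10, 5, 4, 2))  # 21168/78125
```

The printed value agrees with the one reported by the Sage code at the end of this answer. Note that the sketch relies on Python evaluating `0 ** 0` as `1`, which is what the term $(k-i)^{n-l}$ needs when $i=k$ and $l=n$.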


Old stuff

Let $G(u,v)$ be the (exponential in $u$, ordinary in $v$) generating function of the number of favourable outcomes (we get probability by dividing by $k^n$), $u$ recording the number of balls and $v$ the number of bins that have at least $m$ balls.

Denote

$$ h = \sum_{j=0}^{m-1} \frac{u^j}{j!}. $$

We have

$$ G(u,v) = \left(h + (\exp(u)-h)v\right)^k $$

because this is a $k$-sequence of bins, and for each bin we decide whether it holds fewer than $m$ balls (counted by $h$) or at least $m$ balls (counted by $(\exp(u)-h)v$). We can expand by the binomial theorem

$$ G = \sum_{j=0}^{k} \binom{k}{j} h^{k-j} (e^u-h)^{j}v^{j} $$

Now, again by the binomial theorem, $$ \begin{aligned} [v^b]G &= \binom{k}{b} h^{k-b} (e^u-h)^{b} \\ &= \binom{k}{b} h^{k-b} \sum_{j=0}^b \binom{b}{j} (-1)^j e^{u(b-j)}h^j \end{aligned} $$

and so by expanding $e^{u(b-j)} = \sum_{t=0}^\infty \frac{(b-j)^t u^t}{t!}$ we get that

$$ n![u^n][v^b]G = n! \binom{k}{b} \sum_{j=0}^b \binom{b}{j} (-1)^j \sum_{t=0}^n \frac{(b-j)^t}{t!} [u^{n-t}] h^{k-b+j} $$

The coefficient extraction of a power of $h$ still has to be carried out. For powers $2,3,4,5$ the coefficients are found in OEIS A24884 (the link is for power $2$; the next entries cover $3$-$5$).

I don't see how this simplifies to a formula using the associated Stirling numbers of the second kind. The case $m=1$ is simple because $h=1$.
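
The coefficient extraction can also be done numerically. The following is a rough plain-Python sketch, assuming $0 \leq b \leq k$, that works directly from $[v^b]G = \binom{k}{b} h^{k-b} (e^u-h)^{b}$ with truncated power series and exact fractions. The helper names `series_mul`, `series_pow` and `count_exactly` are made up for illustration.

```python
from fractions import Fraction
from math import comb, factorial

def series_mul(a, b, n):
    """Product of two power series given as coefficient lists, truncated at degree n."""
    out = [Fraction(0)] * (n + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > n:
                break
            out[i + j] += ai * bj
    return out

def series_pow(a, e, n):
    """e-th power of a power series (e >= 0), truncated at degree n."""
    result = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(e):
        result = series_mul(result, a, n)
    return result

def count_exactly(n, k, b, m):
    """n! [u^n][v^b] G: outcomes with exactly b bins holding at least m balls."""
    # h = sum_{j<m} u^j/j!  and  tail = e^u - h, both truncated at degree n
    h = [Fraction(1, factorial(j)) if j < m else Fraction(0) for j in range(n + 1)]
    tail = [Fraction(0) if j < m else Fraction(1, factorial(j)) for j in range(n + 1)]
    coeff = series_mul(series_pow(h, k - b, n), series_pow(tail, b, n), n)[n]
    return comb(k, b) * factorial(n) * coeff

print(count_exactly(10, 5, 4, 2))  # 2646000
```

Dividing the count by $k^n$ gives the probability; for $n,k,b,m = 10,5,4,2$ this is $2646000/5^{10} = 21168/78125$.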


UPDATE

Inclusion-Exclusion

The above used the symbolic method, but we can also use Inclusion-Exclusion as follows. Let's calculate the probability that there are fewer than $b$ bins containing at least $m$ balls.

Denote the events

$$ A_B = \{ \text{every bin in } B \text{ has fewer than } m \text{ balls} \} \quad \text{for } B \in \mathcal B = \{ B \subset \{1,2,\dots,k\} : |B|=k-b+1 \} $$

The union of the events $A_B$ is exactly the event that fewer than $b$ bins contain at least $m$ balls. So the probability we want is, by I-E,

$$ P := \mathbb P \left( \bigcup_{B \in \mathcal B } A_B \right) = \sum_{\emptyset \neq J \subset \mathcal B} (-1)^{|J|+1} \mathbb P \left( \bigcap_{B \in J } A_B \right). $$

There are now two tasks: to partition the huge sum into more manageable chunks, and to calculate the probability of a particular intersection (the probability is the same for all $J$'s in a chunk; this is what we'll be calling $p_u$). The same thing is done in this answer (the sets are a little different but the idea is the same).

So the idea is to partition by $a = |J|$ and $u = |\bigcup_{B \in J} B|$.

Denote by $p_u = \mathbb P (\text{bins } 1,2,\dots,u \text{ each have fewer than } m \text{ balls})$.

So we have (see the linked answer for how to arrive at this, in particular the last update where the formula was simplified into a single sum)

$$ P = \sum_{u=k-b+1}^k (-1)^{u-k+b+1} \binom{u-1}{u-1-k+b} \binom{k}{u} p_u $$

Then for the calculation of $p_u$. Here I think we have to use the generating function approach as we're bounding the sizes of the bins (see the OEIS link and its link to Marko Riedel's answer to that exact question). So, as previously (but use the variable $z$ since we took $u$ as summing index), let $h(z) = \sum_{j=0}^{m-1} \frac{z^j}{j!}$.

Let's condition on the event that $l$ balls in total go into the bins $1,2,\dots, u$ (and the rest $n-l$ balls go to the bins $u+1,\dots, k$), with $l=0,1,\dots, n$ to get

$$ \begin{aligned} p_u &= \sum_{l=0}^n [z^l](h^u) \frac{l!}{u^l} \cdot \binom{n}{l} \left(\frac{u}{k}\right)^{l}\left(\frac{k-u}{k}\right)^{n-l} \\ &= \frac{1}{k^n}\sum_{l=0}^n l! [z^l] (h^u) \binom{n}{l} (k-u)^{n-l} \end{aligned} $$
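
As a quick sanity check of this formula (a special case worked out here, not part of the linked derivation), take $m=1$. Then $h = 1$, so $[z^l](h^u)$ is $1$ for $l=0$ and $0$ otherwise, and only the $l=0$ term survives:

$$ p_u = \frac{1}{k^n} \cdot 0! \cdot 1 \cdot \binom{n}{0} (k-u)^{n} = \left(\frac{k-u}{k}\right)^{n}, $$

which is the probability that bins $1,2,\dots,u$ are all empty, as it should be.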

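Here's Sage code to do the calculation. The function `P(n,k,b,m)` returns the probability that fewer than $b$ bins contain at least $m$ balls, so the probability of exactly $b$ such bins is the difference of two calls: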

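def P(n,k,b,m):
    # Probability that fewer than b of the k bins hold at least m of the n balls.
    if b>k: return 1  # there are only k bins, so "fewer than b" always holds
    R.<z> = QQ[]
    # h(z) = sum_{j<m} z^j/j!, the truncated exponential series
    h = sum(z^j/factorial(j) for j in range(m))
    good = 0
    for u in range(k-b+1, k+1):
        hToU = h^u
        # PBu = k^n * p_u, where p_u = P(bins 1..u each hold fewer than m balls)
        PBu = sum( factorial(l)*hToU[l] * binomial(n,l)*(k-u)^(n-l) for l in range(n+1))
        good += (-1)^(u-k+b+1)*binomial(u-1,u-1-k+b)*binomial(k,u)*PBu
    return good/k^n

# P(exactly b bins) = P(fewer than b+1 bins) - P(fewer than b bins)
n,k,b,m = 10,5,4,2
print(P(n,k,b+1,m) - P(n,k,b,m)) #21168/78125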
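
For small parameters the result can be cross-checked by enumerating all $k^n$ outcomes. Below is a brute-force sketch in plain Python (not Sage), assuming $m \geq 1$; the function name `prob_exactly_bruteforce` is made up for illustration, and the enumeration is only practical for small $n$ and $k$.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def prob_exactly_bruteforce(n, k, b, m):
    """Probability that exactly b of k bins get at least m of n balls (assumes m >= 1)."""
    hits = 0
    # Each outcome assigns every ball to one of the k bins.
    for assignment in product(range(k), repeat=n):
        loads = Counter(assignment)  # bin -> number of balls (empty bins are absent)
        if sum(1 for c in loads.values() if c >= m) == b:
            hits += 1
    return Fraction(hits, k ** n)

print(prob_exactly_bruteforce(4, 2, 2, 2))  # 3/8
print(prob_exactly_bruteforce(6, 3, 3, 2))  # 10/81
```

The first value is easy to confirm by hand: with $4$ balls and $2$ bins, both bins hold at least $2$ balls only when the split is $2+2$, which happens in $\binom{4}{2}=6$ of the $16$ outcomes.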

ploosu2
  • 12,367