1

Suppose I have $k$ balls, $n$ urns, and the urns are grouped into $2r+1$ sets where each set contains $\frac{n}{2r+1}$ urns. Assume that $n$ is a multiple of $2r+1$. (The choice of $2r+1$ rather than $r$ is just to emphasise that the number of sets it odd).

I want to calculate the probability that I randomly place all $k$ balls in the urns (each urn can contain at most one ball), such that there are no balls in at least $r+1$ of the sets of urns.

Additionally I am interested in calculating the following partition function: $$ Z = \sum_{k=0}^n \left[ e^{-bk} \times (\text{number of instances of none of the } k \text{ balls are in } r+1 \text{ sets of urns}) \right] $$ for some $O(1)$ constant $b$.

  • I'm a bit confused. If each urn can contain at most one ball, then the probability that you place all $k$ balls in one of the urns is zero unless $k\le1$. I guess you just want to place them in urns - such that $r+1$ sets are empty? – Matija Apr 10 '24 at 21:32
  • Poor phrasing: I mean I want $k$ balls to be placed in an urn (not the same urn). Sorry for the confusion! – Hans Schmuber Apr 12 '24 at 14:53

1 Answers1

1

Denote $R=2r+1$ and $m = \frac{n}{R}$ and $B={R \choose r+1}$.

Let's do inclusion-exclusion with the sets

$$ B_A = \{\text{all urns in groups in A are empty}\}, \text{ for } A \subset [R] \text{ with } |A|=r+1. $$

We want to find

$$ p := \mathbb{P}\left(\bigcup_A B_A \right) = \sum_{\emptyset \neq J \subset \{B_A\}} (-1)^{|J|+1}\mathbb{P} \left(\bigcap_J B_A \right) $$

but this sum is huge: there are $2^{R\choose r+1}$ terms. We must gather them somehow. This is achieved by considering how many sets are in the intersection, say $a$, and the size of their union $u$:

$$ p = \sum_{a=1}^{B} \sum_{u=0}^R (-1)^{a+1} g(a,u) \frac{n-um \choose k}{n\choose k} $$

where $g(a,u)$ is the number those $J$'s. That can be calculated with another inclusion-exclusion. Or, first note that it is disjoint union of all the possible what set of size $u$ the union is. So

$$ g(a,u) = {R \choose u} \left|\{ S \subset \{V\subset [u], |V|=r+1\}, |S|=a \} \right| $$

To calculate the size of this one representative (where we fix the union to be $[u]$), include-exclude on the sets missing a particular element of $[u]$, to get

$$ g(a,u) = {R \choose u} \sum_{c=0}^u (-1)^c {u\choose c} { {u-c \choose r+1} \choose a}. $$

So we have

$$ p =\frac{1}{n\choose k} \sum_{a=1}^{B} \sum_{u=0}^R \sum_{c=0}^u (-1)^{a+1} (-1)^c {u\choose c} { {u-c \choose r+1} \choose a} {n-um \choose k} $$

Luckily the big sum over $a$ is just an alternating partial sum of binomial coefficients and can be simplified. (NOTE: the sum starts at $a=1$, but actually any constant we subtract from the result will get cancelled in the sum over $c$.) We get

$$ p = \frac{1}{n \choose k} (-1)^{B+1} \sum_{u=r+1}^R \sum_{c=0}^u (-1)^c {u \choose c} {n-um \choose k} {R \choose u} { {u-c \choose r+1} - 1 \choose B} $$

Here's a Desmos-visualization and a Sage-code, also a dynamic programming way included, but for big values it's slower than the formula. For really big values, we should use approximation for the binomial coefficients. And I wonder if there's some general approximation for how the curve goes from $1$ to $0$ between $r$ and $rm+1$.

UPDATE (1.6.2024) This simplifies further

The binomial coefficient ${{u-c \choose r+1} - 1} \choose B$ is non-zero only when ${u-c \choose r+1} = 0$ i.e. $u-c < r+1$ and then it is $(-1)^B$. So we get (using the alternating partial sum formula again, now for the sum over $c$)

$$ p = \frac{1}{n \choose k} \sum_{u=r+1}^R \sum_{c=u-r}^u (-1)^{c+1} {u \choose c} {n-um \choose k} {R \choose u} \\ = \frac{1}{n \choose k} \sum_{u=r+1}^{R}\left(-1\right)^{u-r+1} \frac{u-r}{u}{u \choose u-r}{n-um \choose k} {R \choose u} \\ = \frac{\left(n-k\right)!R!}{n!r!}\sum_{u=r+1}^{R}\left(-1\right)^{r-u+1} \frac{\left(n-um\right)!}{u\left(u-r-1\right)!\left(n-um-k\right)!\left(R-u\right)!} $$

ploosu2
  • 12,367
  • This is incredibly useful! Thank you so much! – Hans Schmuber Apr 18 '24 at 15:49
  • 1
    @HansSchmuber You're welcome. This was a nice problem. Btw, we could still try to simplify the sum, for example using the Egorychev method: https://en.wikipedia.org/wiki/Egorychev_method , but that "double binomial coefficient" complicates it. – ploosu2 Apr 19 '24 at 09:47
  • 1
    Getting back to this because I faced the same "sub inclusion exclusion" (the formula for $g$) in another problem. Now looking it with new eyes, I started to wonder how can that binomial coefficient have the downstairs always bigger than upstairs?!? Isn't it then always $0$? Well, turns out the cases when the top is $-1$ gives the contribution $(-1)^B$. The formula then simplifies further. – ploosu2 Jun 01 '24 at 12:52