I am considering the following probabilistic balls-into-bins model. There are $n$ bins and two types of balls. For each type, there are $\rho$ balls. Each ball independently lands in bin $i$ with probability $p_i$, where $\sum_{i=1}^{n} p_i = 1$.
Define the following random variables:
- Let $X_i$ be the number of type-1 balls in bin $i$.
- Let $Y_i$ be the number of type-2 balls in bin $i$.
- A collision in bin $i$ occurs if there is at least one ball of each type in bin $i$, i.e., if $X_i \geq 1$ and $Y_i \geq 1$.
- Let $C_i$ be the indicator random variable for a collision in bin $i$, so $C_i = \mathbf{1}_{X_i \geq 1, Y_i \geq 1}$.
I am interested in bounding the probability that there is no collision anywhere, i.e., $$ P\left( \sum_{i=1}^{n} C_i = 0 \right) = P(\forall i, C_i = 0), $$ with an upper bound $O(n^{-c})$ where $\rho = O(\sqrt{n \log n})$ (the constant $c$ is hidden inside $\rho$).
Challenges:
A natural approach is to use the independence of the bins. However, the random variables $C_i$ are not independent, since they depend on the shared total number of balls of each type. This prevents a straightforward product-bound.
A common technique for approximating such problems is the Poisson approximation (see e.g., Mitzenmacher & Upfal, Probability and Computing), which often gives a good upper bound up to a factor of 2. My goal is to use this approximation to obtain a tight upper bound, possibly followed by Jensen’s inequality or other methods.
Roadblock:
The usual Poisson approximation applies to a single balls-into-bins process, while here we have two dependent processes. One idea I had was to define an alternative single-process model:
- Consider $2\rho$ balls of one type.
- Instead of $n$ bins, use $2n$ bins, indexed by $\{1, \dots, n, n+1, \dots, 2n\}$.
- Place each ball in bin $i$ or bin $i+n$ with probability $0.5 p_i$, so that the probability of either bin $i$ or bin $i+n$ receiving a ball is still $p_i$.
- I define a collision in this model as bins $i$ and $i+n$ both containing a ball.
I thought that this new process may upper bound the original one, and I was hoping to use the Poisson approximation on it. However, I am unsure how to formally argue that this new model indeed provides the desired upper bound for the original process.
Question:
How can I rigorously bound $P(\forall i, C_i = 0)$ using the Poisson approximation or another technique? Specifically, is my proposed transformation into a single-process model valid for upper bounding the original probability? If not, is there another way to effectively apply the Poisson approximation in this setting?
so the expected number of bins with no collisions is $\sum\limits_i 2(1-p_i)^\rho-(1-p_i)^{2\rho}$,
so the probability all $n$ bins are empty must be between $1-n+\sum\limits_i 2(1-p_i)^\rho-(1-p_i)^{2\rho}$ and $\frac1n \sum\limits_i 2(1-p_i)^\rho-(1-p_i)^{2\rho}$. Unfortunately the first of these can easily be negative.
– Henry Apr 04 '25 at 16:54