2

Consider a group $G$ consisting of $n$ distinct elements. Suppose we draw random elements of $G$ (one by one, replacing each element every time) until we draw an element that was drawn before (we say there was a match or collision).

I came across the following statement:

If elements are drawn at random from $G$, then the expected number of draws before a collision occurs is $\sqrt{\pi n/2}$.

(I suppose it is implicit that all elements of $G$ have the same probability of being drawn.) I tried and failed to understand this statement in terms of the expectation of the random variable that counts the number of draws before the collision occurs. Any ideas?

Octania
  • 404
  • Calculate $\Bbb{P}[\text{draws before collision} = k] = \frac{n}{n}\frac{n-1}{n}\cdots\frac{n-k+1}{n}\cdot\frac{k}{n} = \frac{n!}{(n-k)!}\frac{k}{n^{k+1}}$; you obtain $\Bbb{E}[X] = \sum_{k=0}^{n}\frac{n!}{(n-k)!}\frac{k^2}{n^{k+1}}$. I don't know if this is equal to $\sqrt{\pi n/2}$. – Conrado Costa Jul 03 '15 at 17:06
  • That square root of pi is interesting. Maybe some connection to a Gauss distribution? – mvw Jul 03 '15 at 17:06
  • @ConradoCosta but say $n=2$. The collision happens when drawing the 2nd or the 3rd element ($k=1$ or 2). Then $\mathbb{P}$[draws before collision $=1$] $=1/2$ and $\mathbb{P}$[draws before collision $=2$] $=1/2$. Where does $\sqrt{\pi}$ come from? – Octania Jul 03 '15 at 17:37
  • You are right, it has no $\pi$. Do you have a reference for the result you are trying to prove? – Conrado Costa Jul 03 '15 at 17:43
  • I found it in Chapter 19 of the Handbook of Elliptic and Hyperelliptic Curve Cryptography, by Cohen, Frey et al. The result is not proved in the book and no citation is given for it either. – Octania Jul 03 '15 at 17:55

1 Answer

4

The probability that the first collision occurs at draw number $k+1$ is the probability that the first $k$ draws are distinct and the $(k+1)$-th equals one of the first $k$. Out of the $n^{k+1}$ equally likely sequences of $k+1$ draws, the number of outcomes in this event is $\dfrac{n!}{(n-k)!} k$, so its probability is $$P_k = \dfrac{n!\; k}{(n-k)!\; n^{k+1}}.$$

The expected number of draws before a collision occurs (before, so not counting the draw on which the collision occurs) is then $$ \sum_{k=0}^n \dfrac{n!\; k^2}{(n-k)!\; n^{k+1}} $$
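
As a quick sanity check of this sum (a worked instance, not a proof), take $n = 2$. The $k=0$ term vanishes, so $$ \sum_{k=0}^{2} \dfrac{2!\; k^2}{(2-k)!\; 2^{k+1}} = \dfrac{2 \cdot 1}{1! \cdot 4} + \dfrac{2 \cdot 4}{0! \cdot 8} = \dfrac{1}{2} + 1 = \dfrac{3}{2}, $$ which agrees with $P_1 = P_2 = 1/2$ giving $1 \cdot \tfrac12 + 2 \cdot \tfrac12 = \tfrac32$. For comparison, $\sqrt{\pi \cdot 2/2} = \sqrt{\pi} \approx 1.77$, so the two values are close but not equal.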

There is no nice closed form for this sum (unless you like $\mbox{$_3$F$_1$}$ hypergeometrics), but of course for any particular $n$ you can compute it exactly: it will be a rational number, so certainly not $\sqrt{\pi n/2}$. The claim is that it is asymptotic to $\sqrt{\pi n/2}$ as $n \to \infty$.
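
To see the asymptotic claim numerically, here is a minimal Python sketch that evaluates the sum above exactly with rational arithmetic and compares it with $\sqrt{\pi n/2}$. The function name `expected_draws_before_collision` and the sample values of $n$ are illustrative choices of mine, not part of the original statement.

```python
from fractions import Fraction
from math import pi, sqrt


def expected_draws_before_collision(n: int) -> Fraction:
    """Exact value of sum_{k=0}^{n} n! k^2 / ((n-k)! n^(k+1))."""
    total = Fraction(0)
    # distinct = P(first k draws are all distinct) = n! / ((n-k)! n^k), at k = 0
    distinct = Fraction(1)
    for k in range(n + 1):
        # P_k = P(first k draws distinct) * P(draw k+1 hits one of them) = distinct * k/n
        total += k * distinct * Fraction(k, n)
        # Extend "first k distinct" to "first k+1 distinct"
        distinct *= Fraction(n - k, n)
    return total


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        exact = float(expected_draws_before_collision(n))
        approx = sqrt(pi * n / 2)
        print(n, exact, approx, exact / approx)
```

The ratio in the last column should approach 1 as $n$ grows, while each exact value stays rational.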

EDIT: Let $X$ be the number of draws before the first collision. Then $E[X] = \sum_{k=1}^n P(X \ge k)$ where for $1 \le k \le n$, $$P(X \ge k) = \dfrac{n!}{(n-k)!\; n^k} = \prod_{j=0}^{k-1} \left(1 - \dfrac{j}{n}\right)$$ I won't bother with the details, but this can be approximated by $e^{-k^2/(2n)}$. Then $$ E[X] \approx \sum_{k=1}^n e^{-k^2/(2n)} \approx \int_0^n e^{-t^2/(2n)}\; dt = \sqrt{n}\int_0^{\sqrt{n}} e^{-s^2/2}\; ds \approx \sqrt{\pi n/2} $$ because $\int_0^\infty e^{-s^2/2}\; ds = \sqrt{\pi/2}$.
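
A sketch of the skipped approximation, under the assumption that $k$ is of order $\sqrt{n}$ (the range that contributes most of the sum): the quotient $\dfrac{n!}{(n-k)!\; n^k}$ is a product of the $k$ factors $1 - j/n$ for $j = 0, \ldots, k-1$, so taking logarithms and using $\ln(1-x) = -x + O(x^2)$ gives $$ \ln P(X \ge k) = \sum_{j=0}^{k-1} \ln\left(1 - \dfrac{j}{n}\right) = -\sum_{j=0}^{k-1} \dfrac{j}{n} + O\left(\dfrac{k^3}{n^2}\right) = -\dfrac{k(k-1)}{2n} + O\left(\dfrac{k^3}{n^2}\right). $$ For $k = O(\sqrt{n})$ the error term is $O(n^{-1/2})$, and $\dfrac{k(k-1)}{2n}$ differs from $\dfrac{k^2}{2n}$ by $\dfrac{k}{2n} = O(n^{-1/2})$, so $P(X \ge k) = e^{-k^2/(2n)}\left(1 + O(n^{-1/2})\right)$. For much larger $k$ both $P(X \ge k)$ and $e^{-k^2/(2n)}$ are very small, so those terms contribute little to the sum. This is a heuristic outline, not a rigorous error bound.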

Robert Israel
  • 470,583
  • The expression $\sum_{k=0}^n\dfrac{n!k^2}{(n-k)!n^{k+1}}$ tends to 0 as $n\rightarrow \infty$. It is true that as $k$ increases it fits a bit better the plot of $\sqrt{\pi n/2}$. Still, I would like to know why $\pi$ appears. https://www.desmos.com/calculator/2m6h1r7si0 – Octania Jul 03 '15 at 17:53
  • No, it goes to $\infty$ as $n \to \infty$. – Robert Israel Jul 03 '15 at 18:57
  • @Robert Israel would you mind going into the details of your approximation? – Conrado Costa Jul 04 '15 at 02:17
  • @RobertIsrael I see the mistake in my previous comment. The only thing left is to see how $P(X\geq k)$ is approximated by $e^{-k^2/(2n)}$... – Octania Jul 04 '15 at 11:25
  • Is it like in this question: http://mathoverflow.net/questions/81472/a-product-approximation-to-the-taylor-series-of-the-exponential ? – Octania Jul 04 '15 at 11:26