I stumbled upon the birthday paradox, and I get it.

However, all the explanations I have seen get the probability by subtracting from 1 the probability that all people have different birthdays. What I am interested in is the part that is left unsaid: what is the formula to "directly" compute the probability that at least two people share a birthday in a group of n people? I cannot find it, the ones I can think of are not correct, and I really do not know what I am missing.

Any help is really appreciated!

rusiano
  • That depends on what you mean by "the" direct way. There are several I can think of. The easiest calculation-wise, in my opinion, is to number the people and split into cases according to which person is the first to share a birthday with an earlier person: $\frac{1}{365}+\frac{364}{365}\cdot\frac{2}{365} + \frac{364}{365}\cdot\frac{363}{365}\cdot\frac{3}{365}+\dots$. Another option might be to calculate based on the types of groupings of people that can occur. That approach sounds horrendously tedious, since it means listing and working with all the cases, though each individual case... – JMoravitz Jan 07 '22 at 20:51
  • Why should there be a "direct" formula? – MasB Jan 07 '22 at 20:51
  • ... is easy to calculate as it simply follows the multinomial theorem. For instance with 10 people, the chance of a triple sharing a birthday, a pair sharing a birthday, and five individual birthdays would be $\dfrac{\binom{10}{3,2,1,1,1,1,1}365\cdot 364\cdot \binom{363}{5}}{365^{10}}$. Note how many cases there are here: it isn't as simple as "there is a pair with a birthday", as there might be multiple pairs, or a pair and a triple, or multiple triples, etc... Further, neither of these sums is particularly easy to simplify, so the standard approach is commonly used. – JMoravitz Jan 07 '22 at 20:53
  • What's "indirect" about subtracting from $1$? – fleablood Jan 07 '22 at 20:54
  • Seems very messy to me. Superficially, for $n \in \Bbb{Z_{\geq 2}}$ you want $$\sum_{k=2}^n f(k),$$ where $f(k) =$ the probability of exactly $k$ people having the same birthday. However, defining $f(k)$ seems very challenging. For one thing, you have to clearly establish disjoint possibilities. For another, if (for example) $n = 5$, how do you characterize the situation where person-1 and person-2 have the same birthday while person-3 and person-4 also have the same birthday, which is different from that of person-1 and person-2? – user2661923 Jan 07 '22 at 20:55
  • @user2661923 you fix that by the specific compositions as alluded to in my second comment above and using multinomial theorem. – JMoravitz Jan 07 '22 at 20:56
  • @JMoravitz Yes, but as you indicate in your comment, this can get very messy, especially when you are trying for a generic formula for $n$, rather than a computation for a specific value of $n$, such as $n = 10$. – user2661923 Jan 07 '22 at 20:59
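
The two sums from JMoravitz's comments can be checked numerically. The following is a minimal sketch, not anything posted in the thread: the function names are made up for illustration, and it assumes 365 equally likely birthdays. The first function adds up, for each person $k$, the probability that person $k$ is the first to repeat an earlier birthday; the last function evaluates the single multinomial case (a triple, a pair and five singletons among 10 people) quoted above.

```python
from fractions import Fraction
from math import comb, factorial


def p_shared_first_collision(n, d=365):
    """Direct sum: P(person k is the first to repeat an earlier birthday), k = 2..n."""
    total = Fraction(0)
    all_distinct = Fraction(1)  # P(the first k-1 people all have distinct birthdays)
    for k in range(2, n + 1):
        # Person k must hit one of the k-1 days already taken.
        total += all_distinct * Fraction(k - 1, d)
        # Update to P(the first k people all have distinct birthdays).
        all_distinct *= Fraction(d - (k - 1), d)
    return total


def p_shared_complement(n, d=365):
    """Conventional approach: 1 - P(all n birthdays are distinct)."""
    p_distinct = Fraction(1)
    for i in range(n):
        p_distinct *= Fraction(d - i, d)
    return 1 - p_distinct


def p_triple_pair_five_singles(d=365):
    """One multinomial case for 10 people: a triple, a pair, five singletons."""
    multinomial = factorial(10) // (factorial(3) * factorial(2))  # 10!/(3! 2! 1!^5)
    return Fraction(multinomial * d * (d - 1) * comb(d - 2, 5), d ** 10)


if __name__ == "__main__":
    for n in (2, 5, 23):
        # Both routes should agree exactly, since Fraction arithmetic is exact.
        assert p_shared_first_collision(n) == p_shared_complement(n)
        print(n, float(p_shared_first_collision(n)))
    print(float(p_triple_pair_five_singles()))
```

Exact fractions are used so that the agreement with the complement formula is an equality rather than a floating-point coincidence.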

1 Answer

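Probably the simplest approach is to consider how many days are somebody's birthday. Call this number $b$ and write $d=365$. There are $S_2(n,b)$ ways to split the $n$ people into $b$ non-empty groups that share a birthday, where $S_2(n,b)$ represents Stirling numbers of the second kind, and $\frac{d!}{(d-b)!}$ ways to assign distinct days to those groups. You want to add up the resulting probabilities for $1\le b \le n-1$, which is $$\sum_{b=1}^{n-1} \frac{d! \,S_2(n,b)}{(d-b)! \,d^n}. $$ This gives the same answer as the more conventional $1-\frac{d!}{(d-n)! \,d^n}$.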

For example, if $n=5$, then $S_2(5,1)=1$, $S_2(5,2)=15$, $S_2(5,3)=25$ and $S_2(5,4)=10$, so the sum is

$$\frac{365\times 1 }{365^5}+\frac{ 365\times 364\times15 }{365^5}+\frac{ 365\times 364\times363\times 25 }{365^5}+\frac{365\times 364\times363\times362\times 10 }{365^5}$$

which gives the same answer as $1 - \frac{365}{365}\times \frac{364}{365}\times \frac{363}{365}\times \frac{362}{365}\times \frac{361}{365}$, both being $\frac{175793709365}{6478348728125}\approx 0.027$
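
As a check on the formula and the $n=5$ example, here is a minimal sketch that evaluates the Stirling-number sum with exact fractions. The function names are hypothetical, and the Stirling numbers are computed with the standard recurrence $S_2(n,b) = b\,S_2(n-1,b) + S_2(n-1,b-1)$ rather than taken from any library.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, b):
    """Stirling number of the second kind: partitions of n items into b non-empty blocks."""
    if n == b:
        return 1  # covers S(0, 0) = 1 and S(n, n) = 1
    if b == 0 or b > n:
        return 0
    return b * stirling2(n - 1, b) + stirling2(n - 1, b - 1)


def falling_factorial(d, b):
    """d * (d-1) * ... * (d-b+1), i.e. d!/(d-b)! when b <= d."""
    out = 1
    for i in range(b):
        out *= d - i
    return out


def p_shared_stirling(n, d=365):
    """Sum over b = 1..n-1 of P(exactly b distinct days are somebody's birthday)."""
    return sum(
        Fraction(falling_factorial(d, b) * stirling2(n, b), d ** n)
        for b in range(1, n)
    )


def p_shared_complement(n, d=365):
    """Conventional formula 1 - d!/((d-n)! d^n), for comparison."""
    return 1 - Fraction(falling_factorial(d, n), d ** n)


if __name__ == "__main__":
    print([stirling2(5, b) for b in range(1, 5)])  # coefficients used in the n = 5 example
    print(p_shared_stirling(5), float(p_shared_stirling(5)))
    assert p_shared_stirling(23) == p_shared_complement(23)
```

The memoised recurrence keeps the sketch dependency-free; for large $n$ an iterative table would avoid deep recursion.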

Henry
  • Thank you @Henry. I am not completely able to follow you...

    First things first. So b is an index that scans all (the birthdays of) the participants so that we sum, for each person, the probability that his birthday is equal to someone else's, right?

    – rusiano Jan 10 '22 at 14:16
  • @rusiano No - $b$ here is the number of days on which at least one person has a birthday, not the number of people who share a birthday or the number of days which have shared birthdays (both of which are more complicated to deal with as probabilities, though not difficult as expectations). If $b<n$ then clearly some people must share a birthday. – Henry Jan 10 '22 at 17:07
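
To make the meaning of $b$ concrete, the following sketch enumerates every possible assignment of birthdays for a toy calendar and counts how many distinct days are used. The small values ($n=3$ people, $d=4$ days) and the function names are assumptions chosen only so that brute force is feasible; they do not come from the answer.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def distinct_day_counts(n, d):
    """Map b -> number of outcomes in which exactly b distinct days are used."""
    counts = Counter(len(set(outcome)) for outcome in product(range(d), repeat=n))
    return dict(sorted(counts.items()))


def p_shared_by_enumeration(n, d):
    """P(some birthday is shared) = P(b < n), by counting outcomes directly."""
    counts = distinct_day_counts(n, d)
    shared = sum(count for b, count in counts.items() if b < n)
    return Fraction(shared, d ** n)


if __name__ == "__main__":
    print(distinct_day_counts(3, 4))
    print(p_shared_by_enumeration(3, 4))
```

For $n=3$ and $d=4$ the counts for $b=1,2,3$ are $4$, $36$ and $24$, matching $\frac{d!}{(d-b)!}S_2(3,b)$ with $S_2(3,1)=1$, $S_2(3,2)=3$, $S_2(3,3)=1$, and the outcomes with $b<3$ give $\frac{40}{64}=\frac58$.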