A lot of questions about the birthday problem can be found here, but none seems to address my problem:
Background
I am thinking of a hash-type data structure design that tolerates a certain number of collisions. Collisions are detected and handled in a second data structure with a substantially lower collision probability. The number I'm interested in is the number of records that end up in the second data structure.
Question
I do not care whether there are 4 'children' sharing the same 'day' or 2 pairs of children where each pair shares a birthday. Both cases count as 4 children being involved in collisions. I also do not need exact results; approximations are fine.
My question is:
Given n persons and m possible days for birthdays, how can I calculate the probability that k = 2, 3, 4, 5, ... persons are involved in collisions?
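To make the quantity concrete, here is a minimal simulation sketch in Python. It is not a closed-form answer; it only estimates the distribution of k by repeated random draws, assuming uniformly distributed birthdays. The function names `persons_in_collisions` and `estimate_distribution` are made up for this illustration.

```python
import random
from collections import Counter


def persons_in_collisions(n, m, rng):
    """Draw n birthdays from m days and count persons whose day is shared."""
    days = Counter(rng.randrange(m) for _ in range(n))
    # A day with c >= 2 persons contributes all c of them to the count.
    return sum(c for c in days.values() if c >= 2)


def estimate_distribution(n, m, trials=10_000, seed=0):
    """Estimate P(k persons involved in collisions) by simulation."""
    rng = random.Random(seed)
    counts = Counter(persons_in_collisions(n, m, rng) for _ in range(trials))
    return {k: c / trials for k, c in sorted(counts.items())}


if __name__ == "__main__":
    # Small illustrative case: 30 persons, 365 days.
    for k, p in estimate_distribution(n=30, m=365).items():
        print(k, p)
```

Note that k = 1 can never occur, since a collision always involves at least two persons. For the large values of n and m I actually have in mind, simulating like this is slow, which is why an approximation formula would be preferable.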
Clarification
Apparently, the main problem is that I need to handle pretty big values. My dimensions are about:
"Let's say a year had 100,000 days (alternatively 1,000,000 days). Then think of a class of 50,000 kids. What's the probability that everyone has a unique birthday? What's the probability that 1-20, 20-50, or 50-100 kids do not have a unique birthday?"
As I said, the results need not be perfectly exact.
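For numbers of this size, a rough sketch of what can be computed directly, assuming uniformly distributed and independent birthdays: the probability that everyone is unique is m! / ((m - n)! m^n), which underflows as a float and is therefore evaluated in log space. The mean and standard deviation of the number of persons involved in collisions follow from the probability that one given person (or two given persons) shares a day with nobody. The function names are hypothetical, and this does not give the full distribution over k.

```python
import math


def log_prob_all_unique(n, m):
    """Natural log of P(no collision) = m! / ((m - n)! * m**n)."""
    if n > m:
        return float("-inf")  # pigeonhole: a collision is certain
    return math.lgamma(m + 1) - math.lgamma(m - n + 1) - n * math.log(m)


def involved_mean_std(n, m):
    """Mean and standard deviation of the number of persons in collisions."""
    # p1: a given person shares a day with none of the other n - 1 persons.
    p1 = (1 - 1 / m) ** (n - 1)
    # p11: two given persons both share a day with nobody at all.
    p11 = (1 - 1 / m) * (1 - 2 / m) ** (n - 2) if n >= 2 else 0.0
    # Variance of the number of unshared persons; the number of persons
    # in collisions is n minus that, so it has the same variance.
    var = n * p1 * (1 - p1) + n * (n - 1) * (p11 - p1 * p1)
    return n * (1 - p1), math.sqrt(max(var, 0.0))


if __name__ == "__main__":
    for m in (100_000, 1_000_000):
        mean, std = involved_mean_std(50_000, m)
        print(m, log_prob_all_unique(50_000, m), mean, std)
```

If this sketch is right, then for 50,000 kids and 100,000 days the mean should come out near 19,700 with a standard deviation on the order of 130, which would put ranges like 1-20 or 50-100 kids extremely far out in the tail.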