This question is a follow-up to a previous question I posted: Probability of Seeing "X" % of Balls in "Y" Turns?
Set up:
- Suppose we have the integers 1, 2, 3, ..., 99, 100
- Each integer has an equal probability of being selected

Game:
- In round = 1, we pick 5 numbers randomly without replacement and then put them back
- In round = 2, we again pick 5 numbers randomly without replacement and then put them back
- We do this until round = 100
I wrote an R program to simulate this situation:
| round | numbers_picked | cumulative_unique_numbers_seen | percent_of_new_numbers |
|---|---|---|---|
| 1 | 31, 79, 51, 14, 67 | 5 | 100 |
| 2 | 42, 50, 43, 14, 25 | 9 | 80 |
| 3 | 90, 91, 69, 99, 57 | 14 | 100 |
| 4 | 92, 9, 93, 72, 26 | 19 | 100 |
| 5 | 7, 42, 9, 83, 36 | 22 | 60 |
| 6 | 78, 81, 43, 76, 15 | 26 | 80 |
| 7 | 32, 7, 9, 41, 74 | 29 | 60 |
| 8 | 23, 27, 60, 53, 7 | 33 | 80 |
| 9 | 53, 27, 96, 38, 89 | 36 | 60 |
| 10 | 34, 93, 69, 72, 76 | 37 | 20 |
| 11 | 63, 13, 82, 97, 91 | 41 | 80 |
| 12 | 25, 38, 21, 79, 41 | 42 | 20 |
| 13 | 47, 90, 60, 95, 16 | 45 | 60 |
| 14 | 94, 6, 72, 86, 97 | 48 | 60 |
| 15 | 39, 31, 81, 50, 34 | 49 | 20 |
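
A minimal sketch of such a simulation in R might look like the following. It is an illustration only, not the exact program that produced the output above: the function name `simulate_game`, its argument names, and the seed are assumptions made for the example. The column names match the output shown.

```r
# Sketch of the game: each round draws 5 of the integers 1..100 without
# replacement, then returns them to the pool before the next round.
set.seed(123)  # arbitrary seed (assumption), only for reproducibility

# Hypothetical helper, not the original program.
simulate_game <- function(n_total = 100, n_pick = 5, n_rounds = 100) {
  seen <- integer(0)                      # numbers seen in earlier rounds
  rows <- vector("list", n_rounds)
  for (r in seq_len(n_rounds)) {
    picked <- sample.int(n_total, n_pick)     # no replacement within a round
    n_new  <- length(setdiff(picked, seen))   # picks not seen before this round
    seen   <- union(seen, picked)
    rows[[r]] <- data.frame(
      round = r,
      numbers_picked = paste(picked, collapse = ", "),
      cumulative_unique_numbers_seen = length(seen),
      percent_of_new_numbers = 100 * n_new / n_pick
    )
  }
  do.call(rbind, rows)
}

result <- simulate_game()
head(result, 15)
```

Here `percent_of_new_numbers` is the share of that round's 5 picks that had not appeared in any earlier round, which is why it is always 100 in round 1.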
I am wondering if there is some probability distribution that can be used to answer the following question:
Suppose we are currently at round n and have seen m unique numbers so far. If we do NOT know that there are 100 numbers in total, what is the probability that we will have seen 99% of all numbers by round k (where k > n)?
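
As a point of comparison, the version of this question in which the total IS known can be computed exactly, and a sketch may help make the target quantity concrete. If m numbers have been seen, the count of new numbers in the next round is hypergeometric (5 draws from 100 - m unseen and m seen), so the count of unique numbers seen forms a Markov chain. The helper name `prob_seen_by_round` and its arguments are hypothetical, and the known total of 100 is exactly the assumption the question wants to drop, so this is only a baseline and not an answer to the unknown-total case.

```r
# Hypothetical helper: P(at least `target` unique numbers seen by round k,
# given m unique numbers seen at round n), ASSUMING n_total is known.
prob_seen_by_round <- function(m, n, k, n_total = 100, n_pick = 5,
                               target = 99) {  # 99 = 99% of the 100 numbers
  stopifnot(k >= n, m >= 0, m <= n_total)
  p <- numeric(n_total + 1)    # p[u + 1] = P(u unique numbers seen so far)
  p[m + 1] <- 1
  for (step in seq_len(k - n)) {
    p_next <- numeric(n_total + 1)
    for (u in which(p > 0) - 1) {
      j <- 0:min(n_pick, n_total - u)   # possible counts of new numbers
      # New numbers in one round are hypergeometric:
      # unseen numbers are the "white" balls, seen numbers the "black" ones.
      p_next[u + j + 1] <- p_next[u + j + 1] +
        p[u + 1] * dhyper(j, n_total - u, u, n_pick)
    }
    p <- p_next
  }
  sum(p[(target:n_total) + 1])
}

# Example: 49 unique numbers seen at round 15, looking ahead to round 100.
prob_seen_by_round(m = 49, n = 15, k = 100)
```

With an unknown total, one possible route would be to replace `n_total` with an estimate (or a distribution over plausible totals) obtained from the observed repeats, which is where estimators from the capture-recapture literature come in.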
I have recently started learning about mathematical biology models and their use in these kinds of problems. For example:
- https://en.wikipedia.org/wiki/Mark_and_recapture
- https://stephens999.github.io/fiveMinuteStats/wright_fisher_model.html
- https://palaeo-electronica.org/2011_1/238/estimate.htm (Chao estimator)
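
To illustrate how an estimator of this kind would use the data, here is a rough sketch of the classic Chao1 formula, S_obs + f1^2 / (2 * f2), applied to the 15 rounds shown in the simulation output. It makes a simplifying assumption: all 75 draws are treated as individual observations, even though each round is drawn without replacement, so an incidence-based variant may be more appropriate. The variable names are my own.

```r
# The 15 rounds of picks from the simulation output (one row per round).
picks <- matrix(c(
  31, 79, 51, 14, 67,
  42, 50, 43, 14, 25,
  90, 91, 69, 99, 57,
  92,  9, 93, 72, 26,
   7, 42,  9, 83, 36,
  78, 81, 43, 76, 15,
  32,  7,  9, 41, 74,
  23, 27, 60, 53,  7,
  53, 27, 96, 38, 89,
  34, 93, 69, 72, 76,
  63, 13, 82, 97, 91,
  25, 38, 21, 79, 41,
  47, 90, 60, 95, 16,
  94,  6, 72, 86, 97,
  39, 31, 81, 50, 34
), ncol = 5, byrow = TRUE)

counts <- table(picks)         # how often each observed number was drawn
s_obs  <- length(counts)       # unique numbers seen so far
f1     <- sum(counts == 1)     # numbers seen exactly once
f2     <- sum(counts == 2)     # numbers seen exactly twice

# Classic Chao1 lower-bound estimate of the total; requires f2 > 0.
chao1 <- s_obs + f1^2 / (2 * f2)
chao1
```

For these 15 rounds, s_obs = 49, f1 = 26 and f2 = 20, so the estimate is about 65.9, well below the true total of 100. Chao1 is usually described as a lower bound, and 15 rounds is a small sample, so an estimate like this would carry a lot of uncertainty into any "99% seen by round k" probability.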
Can someone please suggest mathematical biology models that could be applied to this problem?