I am trying to understand the concept of "Hitting Time Distributions" in Probability.
Example 1: In the first example, consider a coin that shows Heads with probability $p_1$. We need to find the probability distribution of the number of flips required to get 100 heads (i.e. not 100 consecutive heads, just 100 heads in total).
Suppose I call the time required to observe 100 heads for the first time the Hitting Time. To me, it seems this might be solvable using the Negative Binomial Distribution (https://en.wikipedia.org/wiki/Negative_binomial_distribution):
$$P(X = k) = \binom{k-1}{r-1} p_1^r (1 - p_1)^{k-r}$$ $$P(X \leq n) = \sum_{k=r}^n \binom{k-1}{r-1} p_1^r (1 - p_1)^{k-r}$$
The Negative Binomial Distribution describes the number of trials needed to obtain the $r$-th success. Logically, I can treat each head as a "success" and set $r = 100$, so these equations apply directly to my problem.
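As a sanity check, here is a small Python sketch that evaluates these formulas directly; the value $p_1 = 0.5$ and the evaluation points are hypothetical, just for illustration:

```python
from math import comb

# A sketch of Example 1, assuming (hypothetically) p1 = 0.5 and r = 100.
p1, r = 0.5, 100

def pmf(k):
    # P(X = k) = C(k-1, r-1) * p1^r * (1 - p1)^(k - r), valid for k >= r
    return comb(k - 1, r - 1) * p1**r * (1 - p1)**(k - r)

def cdf(n):
    # P(X <= n) = sum of the PMF from k = r up to n
    return sum(pmf(k) for k in range(r, n + 1))

print(pmf(200))  # probability that the 100th head arrives exactly on flip 200
print(cdf(220))  # probability of seeing 100 heads within 220 flips
```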
Example 2: In the second problem, suppose we have the same coin, but now we are interested in the probability distribution of the number of flips needed to get 5 consecutive heads. Here, I call the number of flips needed to get 5 consecutive heads for the first time the Hitting Time.
At first I thought that the Negative Binomial Distribution could also be used here, but I am unsure how to modify it to suit this problem.
Doing some reading, I found that these types of problems are quite popular and are often solved using methods such as First Step Analysis (e.g. First Step Analysis of a Markov Chain process). Let state $i$ (for $0 \le i \le 4$) mean that the current run of consecutive heads has length $i$, and let state $5$ (five consecutive heads) be absorbing. By writing a linear system of equations based on this Discrete Time Markov Chain, we can find the expected number of flips needed for 5 consecutive heads:
$$P = \begin{pmatrix} 1 - p_1 & p_1 & 0 & 0 & 0 & 0 \\ 1 - p_1 & 0 & p_1 & 0 & 0 & 0 \\ 1 - p_1 & 0 & 0 & p_1 & 0 & 0 \\ 1 - p_1 & 0 & 0 & 0 & p_1 & 0 \\ 1 - p_1 & 0 & 0 & 0 & 0 & p_1 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ \end{pmatrix}$$
$$ E_5 = 0 \quad (\text{ absorbing state}) $$ $$ E_0 = 1 + (1 - p_1) E_0 + p_1 E_1 $$ $$ E_1 = 1 + (1 - p_1) E_0 + p_1 E_2 $$ $$ E_2 = 1 + (1 - p_1) E_0 + p_1 E_3 $$ $$ E_3 = 1 + (1 - p_1) E_0 + p_1 E_4 $$ $$ E_4 = 1 + (1 - p_1) E_0 $$
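Rearranging the five equations for $E_0, \dots, E_4$ into a linear system $A E = b$, they can be solved numerically. A minimal sketch, assuming $p_1 = 0.5$ (for a fair coin the known answer is $E_0 = 2^6 - 2 = 62$):

```python
import numpy as np

# Solve the first-step equations for E_0..E_4, assuming p1 = 0.5.
p1 = 0.5
A = np.zeros((5, 5))
b = np.ones(5)
for i in range(5):
    A[i, i] += 1.0           # E_i on the left-hand side
    A[i, 0] -= 1 - p1        # tails sends the chain back to state 0
    if i < 4:
        A[i, i + 1] -= p1    # heads advances the run; from state 4, heads
                             # reaches the absorbing state and contributes 0

E = np.linalg.solve(A, b)
print(E[0])  # 62.0 when p1 = 0.5
```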
But this only tells us the expected number of flips. How do I find out the probability distribution for this number?
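One thing I noticed is that I can already tabulate the distribution numerically by iterating the chain, since $P(T \le n)$ is the $(0, 5)$ entry of $P^n$. A sketch, again assuming $p_1 = 0.5$:

```python
import numpy as np

# P(T <= n) is the (0, 5) entry of P^n, so
# P(T = n) = (P^n)[0, 5] - (P^(n-1))[0, 5]. Assumes p1 = 0.5.
p1 = 0.5
P = np.zeros((6, 6))
for i in range(5):
    P[i, 0] = 1 - p1     # tails resets the run of heads
    P[i, i + 1] = p1     # heads extends the run by one
P[5, 5] = 1.0            # state 5 is absorbing

Pn = np.eye(6)
prev = 0.0
for n in range(1, 16):
    Pn = Pn @ P
    print(n, Pn[0, 5] - prev)  # P(T = n)
    prev = Pn[0, 5]
```

But this brute-force iteration only gives numbers; I would like an analytic description of the distribution.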
From reading some other questions (e.g. Generating function for the hitting time of a Markov chain, Markov Chain Question using probability generating function, Method of Generating function - Markov Chain, https://www.youtube.com/watch?v=OSNlxH22ln8), it seems that this can be solved using Probability Generating Functions (https://en.wikipedia.org/wiki/Probability-generating_function), but I am not sure how to do this.
I tried to set up equations analogous to First Step Analysis, but in terms of Probability Generating Functions. Let $T_i$ denote the hitting time of state $5$ starting from state $i$, and define
$$ G_i(z) = \mathbb{E}\left[z^{T_i}\right] = \sum_{n=0}^{\infty} P(T_i = n) z^n. $$
Conditioning on the first step should then give $G_i(z) = z \sum_{j \in S} p_{ij} G_j(z)$ for each transient state $i$:
$$ G_0(z) = p_1 z G_1(z) + (1 - p_1) z G_0(z) $$ $$ G_1(z) = p_1 z G_2(z) + (1 - p_1) z G_0(z) $$ $$ G_2(z) = p_1 z G_3(z) + (1 - p_1) z G_0(z) $$ $$ G_3(z) = p_1 z G_4(z) + (1 - p_1) z G_0(z) $$ $$ G_4(z) = p_1 z G_5(z) + (1 - p_1) z G_0(z) $$ $$ G_5(z) = 1 $$
$$ \begin{pmatrix} G_0(z) \\ G_1(z) \\ G_2(z) \\ G_3(z) \\ G_4(z) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ p_1 z \end{pmatrix} + \begin{pmatrix} (1 - p_1) z & p_1 z & 0 & 0 & 0 \\ (1 - p_1) z & 0 & p_1 z & 0 & 0 \\ (1 - p_1) z & 0 & 0 & p_1 z & 0 \\ (1 - p_1) z & 0 & 0 & 0 & p_1 z \\ (1 - p_1) z & 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} G_0(z) \\ G_1(z) \\ G_2(z) \\ G_3(z) \\ G_4(z) \end{pmatrix} $$
$$G_0(z) = \sum_{n=0}^{\infty} P(T = n) z^n,$$ where $T = T_0$ is the hitting time from a cold start, so the probabilities $P(T = n)$ should be the coefficients in the power series of $G_0(z)$.
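To experiment, I tried solving this system symbolically and reading off the series coefficients (a sketch with sympy, assuming $p_1 = 1/2$ for concreteness):

```python
import sympy as sp

z = sp.symbols('z')
p1 = sp.Rational(1, 2)  # assumption: a fair coin, for concreteness

G = sp.symbols('G0 G1 G2 G3 G4')
eqs = []
for i in range(5):
    nxt = G[i + 1] if i < 4 else sp.Integer(1)  # G_5(z) = 1 (absorbing)
    eqs.append(sp.Eq(G[i], p1 * z * nxt + (1 - p1) * z * G[0]))

sol = sp.solve(eqs, list(G), dict=True)[0]
G0 = sp.simplify(sol[G[0]])
print(G0)  # a rational function of z

# P(T = n) is the coefficient of z^n in the series expansion of G_0(z)
series = sp.series(G0, z, 0, 12).removeO()
for n in range(5, 12):
    print(n, series.coeff(z, n))
```

If this setup is right, the printed coefficients should start at $n = 5$ with $p_1^5 = 1/32$.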
But I am not confident that I have set these equations up correctly, and I do not see how to turn this into a closed-form Probability Distribution Function for the Hitting Time.
Can someone please show me how to correctly use Probability Generating Functions to derive the Probability Distribution Function in this question?
- Note: follow-up question: Relationship Between Cramer's Rule and Probability Functions?