There's a monkey typing randomly on a 26-letter keyboard, each keystroke independent and uniform, so each letter appears with probability $\dfrac{1}{26}$. We need to find the expected time until the monkey types a given word. After some thinking and scouring the internet for resources, I now know the following:
Type I: For a word of length $m$ in which no proper prefix equals the suffix of the same length (that is, for no $i<m$ do the first $i$ letters of the word equal the last $i$ letters), the expected time until the monkey types the word is just $26^m$.
Type II: If not, let $i_1<i_2<\cdots<i_k$ be the lengths $i<m$ for which the prefix does equal the suffix. Then the expected time until the monkey types the word is $26^m + 26^{i_k} + 26^{i_{k-1}}+\cdots+26^{i_1}$.
Source: statement and link to proof using martingales
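To make the two formulas concrete, here is a small Python sketch (my own, not taken from the linked source; the function names are mine) that finds the overlap lengths of a word and plugs them into the formulas above:

```python
def overlap_lengths(word):
    """Lengths i < len(word) for which the first i letters equal the last i letters."""
    m = len(word)
    return [i for i in range(1, m) if word[:i] == word[-i:]]

def expected_waiting_time(word, alphabet_size=26):
    """Predicted expected time until `word` first appears (Type I/II formulas above)."""
    m = len(word)
    return alphabet_size ** m + sum(alphabet_size ** i for i in overlap_lengths(word))

print(expected_waiting_time("MONKEY"))       # Type I:  26**6
print(expected_waiting_time("ABRACADABRA"))  # Type II: 26**11 + 26**4 + 26**1
```

For "MONKEY" no prefix matches a suffix, so the answer is just $26^6$; for "ABRACADABRA" the overlaps have lengths $1$ ("A") and $4$ ("ABRA").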
I understand the intuition behind why the expectations differ, and I have seen the martingale-theoretic proof of this result (the one with the gamblers betting at a fair casino).
The questions I have:
- Can we prove the Type I result using just Markov Chain theory? If so, how?
- Can we prove the Type II result using just Markov Chain theory? If so, how?
What I had thought of (which I believe is completely incorrect, though I am unable to figure out why):
I considered the sequence of letters the monkey types as a Markov chain $\{X_n\}_{n\geq 0}$ with transition matrix $P$, where $p_{ij}=\dfrac{1}{26}$ for all $i,j$. This is an irreducible Markov chain on a finite state space, so it is positive recurrent and has a unique stationary distribution $\pi$ with $\pi_i=\dfrac{1}{26}$.
Now I define $Y_n=(X_n,X_{n+1},\ldots,X_{n+m-1})$, the sliding window of $m$ consecutive letters, so the states are words of length $m$. Then $\{Y_n\}_{n\geq 0}$ is easily seen to be a Markov chain with transition matrix $Q$, where $q_{(i_1,\ldots,i_m)(j_1,\ldots,j_m)}=\dfrac{1}{26}$ if $i_{k}=j_{k-1}$ for $1<k\leq m$, and $0$ otherwise.
We see that the stationary distribution of $\{Y_n\}_{n\geq 0}$ is actually $\Phi$ such that $\Phi_{(i_1,\ldots,i_m)}=\pi_{i_1}p_{i_1i_2}p_{i_2i_3}\cdots p_{i_{m-1}i_m}=\dfrac{1}{26^m}$.
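As a sanity check on this construction, here is a small numerical sketch (mine; I shrink the alphabet to $2$ letters so the matrices stay small) that builds $Q$ exactly as defined above and verifies that the uniform distribution $\Phi$ is stationary for it:

```python
import itertools
import numpy as np

A = 2   # alphabet size (2 instead of 26, to keep the matrices small)
m = 3   # word length

states = list(itertools.product(range(A), repeat=m))
index = {s: k for k, s in enumerate(states)}

# q_{(i_1,...,i_m),(j_1,...,j_m)} = 1/A iff (j_1,...,j_{m-1}) = (i_2,...,i_m), else 0
Q = np.zeros((len(states), len(states)))
for s in states:
    for last in range(A):
        t = s[1:] + (last,)
        Q[index[s], index[t]] = 1 / A

Phi = np.full(len(states), 1 / A**m)     # the claimed stationary distribution
print(np.allclose(Q.sum(axis=1), 1.0))   # True: Q is a valid transition matrix
print(np.allclose(Phi @ Q, Phi))         # True: Phi is stationary for Q
```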
There is also a result that states that
An irreducible positive recurrent Markov chain $\{Z_n\}_{n\geq 0}$ with state space $S$ has a unique stationary distribution given by $\psi_i=\dfrac{1}{\mathbb{E}(\tau_i)}$ for $i\in S$, where $\tau_i=\inf\{k:Z_k=i\}$ is the first hitting time of state $i$.
Since our Markov chain $\{Y_n\}_{n\geq 0}$ satisfies all these conditions, we see that, $\Phi_{(i_1,\ldots,i_m)}=\dfrac{1}{\mathbb{E}(\tau_{(i_1,\ldots,i_m)})}$. But we already know that $\Phi_{(i_1,\ldots,i_m)}=\dfrac{1}{26^m}$.
Therefore, $\mathbb{E}(\tau_{(i_1,\ldots,i_m)})=26^m$, which is just the expected time until our word gets typed.
The problem: You may have noticed that nowhere have I used the structure of the word, so this proof makes no distinction between words of Type I and Type II, and hence fails miserably. Yet I can't pinpoint where exactly it breaks down.
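To see the mismatch numerically, here is a quick Monte Carlo check of my own, again over a $2$-letter alphabet so the simulation is fast. It simply measures the first time each word appears when the monkey starts typing from scratch:

```python
import random

def first_appearance_time(word, alphabet, rng):
    """Number of letters typed until `word` first appears as a contiguous block."""
    recent, t = "", 0
    while True:
        recent = (recent + rng.choice(alphabet))[-len(word):]
        t += 1
        if recent == word:
            return t

rng = random.Random(0)
trials = 200_000
for word in ("ab", "aa"):   # "ab" is Type I, "aa" is Type II (overlap of length 1)
    avg = sum(first_appearance_time(word, "ab", rng) for _ in range(trials)) / trials
    print(word, round(avg, 2))   # about 4.0 for "ab" (= 2^2), about 6.0 for "aa" (= 2^2 + 2)
```

The simulated averages follow the Type I/II formulas ($4$ and $6$), not the single value $2^2$ that my argument would predict for both words.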
Please let me know what the problem is with my proof and what I can do to correct it, or, if it can't be salvaged, how else I can prove this with just Markov chain theory. Thanks a lot for your help and patience.