Suppose there is a coin flipping game where you start with 5 dollars. At each turn, there is a $p$ probability of winning 1 dollar and a $1-p$ probability of losing 1 dollar. The game ends at 0 dollars or 10 dollars.

I am trying to answer the following questions:

  • Question 1: At the nth turn, what is the probability of having $i$ dollars?
  • Question 2: How long will it take (on average) for the game to end?

I know that this problem is a version of Gambler's Ruin (https://en.wikipedia.org/wiki/Gambler%27s_ruin) and that there are well-known solutions to this problem, but I am trying to derive them myself without using the strategies used to analyze Gambler's Ruin.

Here is my attempt:

To find the probability of being at position $ i $ after $ n $ turns, consider the number of ways we can achieve this through a combination of wins and losses. Define:

  • $ X_n $: Position after $ n $ turns.
  • $ k $: Number of wins.
  • $ n - k $: Number of losses.
  • $ p $: Probability of a win.
  • $ 1 - p $: Probability of a loss.
  • Initial position: $ X_0 = 5 $.

The position after $ n $ turns: $$ X_n = X_0 + k - (n - k) = 5 + 2k - n $$

Being at $ X_n = i $ requires: $$ i = 5 + 2k - n $$

Solving for $ k $: $$ 2k = i - 5 + n $$ $$ k = \frac{i - 5 + n}{2} $$

Here, $ k $ needs to be a valid integer within the number of turns: $$ 0 \leq k \leq n $$

Since $ k $ must be an integer: $$ k \geq \left\lceil \frac{i - 5 + n}{2} \right\rceil $$ $$ k \leq \left\lfloor \frac{i - 5 + n}{2} \right\rfloor $$

Therefore, $ k $ must lie within: $$ \max(0, \left\lceil \frac{i - 5 + n}{2} \right\rceil) \leq k \leq \min(n, \left\lfloor \frac{i - 5 + n}{2} \right\rfloor) $$

The probability of having exactly $ k $ wins and $ n - k $ losses is given by the binomial distribution: $$ P(X_n = i) = \sum_{k = \max(0, \lceil \frac{i - 5 + n}{2} \rceil)}^{\min(n, \lfloor \frac{i - 5 + n}{2} \rfloor)} \binom{n}{k} p^k (1-p)^{n-k} $$

This is where I get stuck: As $n$ becomes large, we need to consider all possible ways of ending up with $i$ dollars (based on the Binomial Distribution). If we don't know $n$, we have to consider all possible situations: the probability of getting to $n$ turns and, from there, the probability of getting $k$ wins:

$$ P(X_{\infty} = i) = \sum_{n=0}^{\infty} P(N=n) \sum_{k = \max(0, \lceil \frac{i - 5 + n}{2} \rceil)}^{\min(n, \lfloor \frac{i - 5 + n}{2} \rfloor)} \binom{n}{k} p^k (1-p)^{n-k} $$

I don't know how to proceed from here. How can the limit of this infinite sum be analyzed to answer these questions?

Do we need to use the Central Limit Theorem/Normal Approximation of the Binomial Distribution? Do we need to use Probability Generating Functions?

RobPratt
  • 50,938
konofoso
  • 681

2 Answers


Model this as a random walk on a graph. You can consider a weighted directed graph with $11$ vertices, where $v_i$ represents having $i$ dollars. At each state there is a chance to transition from $v_i$ to $v_{i+1}$ with probability $p$ and from $v_i$ to $v_{i-1}$ with probability $1-p$. The exceptions are the terminal vertices $v_{10}$ and $v_0$, which cannot move.

You can collect this information into a matrix $M$ where $M_{ij}$ is the probability of moving from $v_i$ to $v_j$. (I know usually you count matrix indices starting from $1$, but in this case it's better to start from $0$.) $M$ is $11 \times 11$; it has entries of $p$ on the superdiagonal and entries of $1-p$ on the subdiagonal, except for the top and bottom rows, which only have $1$ in their main diagonal entries. Pardon me for not rendering $M$ in LaTeX; it is rather large, but its structure is straightforward to describe.

Now the probability of being at state $j$ after taking $n$ steps starting from state $i$ can be calculated simply by computing $(M^n)_{ij}$. This should answer your first question.
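As a rough illustration of the matrix description above, here is a minimal Python sketch that builds $M$ with exact rational entries and reads off row $5$ of $M^n$. The function names `transition_matrix` and `distribution_after` are made up for this example, and the row is obtained by repeated vector-matrix products rather than by forming $M^n$ explicitly.

```python
from fractions import Fraction

def transition_matrix(p, top=10):
    """Build the (top+1) x (top+1) matrix M with M[i][j] = P(v_i -> v_j)."""
    q = 1 - p
    size = top + 1
    M = [[Fraction(0)] * size for _ in range(size)]
    M[0][0] = Fraction(1)        # v_0 is absorbing
    M[top][top] = Fraction(1)    # v_top is absorbing
    for i in range(1, top):
        M[i][i + 1] = p          # win one dollar
        M[i][i - 1] = q          # lose one dollar
    return M

def distribution_after(n, p, start=5, top=10):
    """Return row `start` of M^n via n vector-matrix multiplications."""
    M = transition_matrix(p, top)
    size = top + 1
    row = [Fraction(0)] * size
    row[start] = Fraction(1)
    for _ in range(n):
        row = [sum(row[i] * M[i][j] for i in range(size)) for j in range(size)]
    return row

# Fair coin, 10 turns, starting with 5 dollars.
dist = distribution_after(10, Fraction(1, 2))
print(dist)
```

Using `Fraction` keeps every probability exact, which makes it easy to compare against hand computations for small $n$.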

Now for the second question: let $T$ be the stopping time for the game. We have $\mathbb{E}(T) = \sum_{n \geq 0} \mathbb{P}(T > n)$, where $\mathbb{P}(T > n) = 1- (M^n)_{5,0}-(M^n)_{5,10}$, because the events of having been absorbed at zero dollars and at 10 dollars by step $n$ are disjoint. The expressions for those two entries can be computed explicitly by diagonalizing $M$; however, I expect the numbers to be very ugly.
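The tail-sum representation can be checked numerically. The sketch below is only an approximation under stated assumptions: it truncates the sum $\sum_{n \geq 0} \mathbb{P}(T > n)$ after `max_steps` terms (a cutoff chosen for this example) and uses floating point. The quantity $\mathbb{P}(T > n)$ is computed as the probability mass still sitting on the transient states $1, \dots, 9$, which equals $1 - (M^n)_{5,0} - (M^n)_{5,10}$.

```python
def expected_stopping_time(p, start=5, top=10, max_steps=5000):
    """Approximate E(T) = sum_{n>=0} P(T > n), truncated after max_steps terms."""
    q = 1.0 - p
    probs = [0.0] * (top + 1)
    probs[start] = 1.0
    total = 0.0
    for _ in range(max_steps):
        # P(T > n): mass that has not yet been absorbed at 0 or top.
        total += sum(probs[1:top])
        nxt = [0.0] * (top + 1)
        nxt[0], nxt[top] = probs[0], probs[top]   # absorbed mass stays put
        for i in range(1, top):
            nxt[i + 1] += p * probs[i]            # win
            nxt[i - 1] += q * probs[i]            # loss
        probs = nxt
    return total

print(expected_stopping_time(0.5))
```

For a fair coin the printed value should be very close to $25$, since the unabsorbed mass decays geometrically and the truncation error is negligible at this cutoff.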

EDIT: Taking a page out of the article. We can reorder the vertices and write

$$M = \begin{bmatrix} Q_{9 \times 9} & R_{9 \times 2} \\ 0_{2 \times 9} &I_{2 \times 2} \end{bmatrix}$$

Where $Q$ includes connections from transient states to transient states $(v_1, \dots v_9)$, $R$ represents connections from transient states to absorbing states $(v_0, v_{10})$ and $I$ represents the two absorbing states.

Now we can compute the stopping time in a different way:

$$(I - Q)^{-1} = \sum_{n \geq 0} Q^n $$

This is not immediately obvious, but all eigenvalues of $Q$ have absolute value less than one, so this geometric series converges. The entries of $Q^n$ are the probabilities of being in each transient state after $n$ time steps, so (using the above representation of the expected stopping time) calculating $(I-Q)^{-1}[1, 1, \dots, 1 ]^T$ gives the expected stopping time for each transient starting state.

Now we have that $I - Q$, (say $q = 1-p$ for simplicity) is a tridiagonal matrix

$$I - Q = \begin{bmatrix} 1 & -p & 0 & \dots & 0 \\ -q & 1 & -p & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & -q & 1& -p \\ 0 & \cdots & 0 & -q & 1 \\ \end{bmatrix} $$

The system $(I-Q)\vec{x} = \textbf{1}$ can be solved by row reduction. We could also write it as a short recurrence relation.

We have that $$-qx_{n-1} + x_n - px_{n+1} = 1$$ for $n = 1, \dots, 9$, subject to the boundary conditions $x_0 = x_{10} = 0$. I'll solve the case where $p < 1/2$; when $p = 1/2$ the solution is different but a little simpler to compute.
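The row reduction can be sketched in a few lines of Python. This is an illustrative implementation, not the only way to do it: it uses the standard tridiagonal (Thomas) elimination with exact `Fraction` arithmetic, and the function name `expected_times` is chosen for this example.

```python
from fractions import Fraction

def expected_times(p, top=10):
    """Solve (I - Q) x = 1 for the transient states 1..top-1 (p as a Fraction)."""
    q = 1 - p
    m = top - 1                       # number of transient states
    diag = [Fraction(1)] * m          # main diagonal of I - Q
    rhs = [Fraction(1)] * m           # right-hand side: all ones
    # Forward elimination: clear the sub-diagonal entries (-q).
    for k in range(1, m):
        factor = -q / diag[k - 1]
        diag[k] -= factor * (-p)      # super-diagonal entry of row k-1 is -p
        rhs[k] -= factor * rhs[k - 1]
    # Back substitution: diag[k] * x[k] - p * x[k+1] = rhs[k].
    x = [Fraction(0)] * m
    x[m - 1] = rhs[m - 1] / diag[m - 1]
    for k in range(m - 2, -1, -1):
        x[k] = (rhs[k] + p * x[k + 1]) / diag[k]
    return x                          # x[k] is the expected time from state k+1

times = expected_times(Fraction(1, 2))
print(times[4])                       # expected time when starting with 5 dollars
```

The entry `times[4]` corresponds to starting with $5$ dollars, because the list is indexed from state $1$.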

You can check that $x_n = \frac{n}{1-2p} + y_n$ is a solution if $y_n$ is a solution to the homogeneous recurrence: $y_n - py_{n+1} - qy_{n-1} = 0$ subject to $y_0 = 0$ and $y_{10} = \tfrac{-10}{1-2p}$.

Using the standard techniques to solve the homogeneous equation, we get that $$y_{n} = c_+\lambda_+^n + c_-\lambda_-^n$$ with $$ \lambda_{\pm} = \frac{1 \pm \sqrt{1-4pq}}{2p}.$$

The constants $c_{\pm}$ are chosen to satisfy the boundary conditions; a quick calculation shows that $$c_{+} = -c_- = \frac{10}{(\lambda_-^{10} - \lambda_+^{10})(1-2p)}.$$

Finally, the answer to your original question is

$$ x_5 = \frac{5}{1-2p} + \frac{10(\lambda_+^{5} - \lambda_-^{5})}{(1-2p)(\lambda_-^{10} - \lambda_+^{10})}$$
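To sanity-check this closed form, here is a small sketch that evaluates it exactly for rational $p < 1/2$. It relies on the identity $\sqrt{1-4pq} = |1-2p|$ (which follows from $q = 1-p$), so no floating-point square root is needed; the function name `x5_closed_form` is an assumption of this example.

```python
from fractions import Fraction

def x5_closed_form(p):
    """Evaluate the closed form for x_5; intended for Fraction p with 0 < p < 1/2."""
    q = 1 - p
    root = 1 - 2 * p                      # equals sqrt(1 - 4pq) when p < 1/2
    assert root * root == 1 - 4 * p * q   # confirm the square-root identity
    lam_plus = (1 + root) / (2 * p)
    lam_minus = (1 - root) / (2 * p)
    return (5 / root
            + 10 * (lam_plus**5 - lam_minus**5)
            / (root * (lam_minus**10 - lam_plus**10)))

print(x5_closed_form(Fraction(1, 3)))
```

For $p = 1/3$ the roots are $\lambda_+ = 2$ and $\lambda_- = 1$, and the value agrees with the classical Gambler's Ruin duration formula.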

Lee Fisher
  • 2,749
  • See: https://en.wikipedia.org/wiki/Absorbing_Markov_chain – Lee Fisher Jul 21 '24 at 05:31
  • thanks, typo there. – Lee Fisher Jul 21 '24 at 13:55
  • I think I was wrong as well. A second thought seems to suggest $n\geq0$ in $\mathbb{E}(T)$? – Zack Fisher Jul 22 '24 at 03:52
  • For $p<1/2$, $\lambda_-=1$ and $\lambda_+=q/p$. So $$x_5=\mathbb{E}(T)=5\cdot \frac{1-3pq-p^2q^2}{p^5+q^5},$$which by symmetry and continuity, holds for all $0<p<1$. – Zack Fisher Jul 22 '24 at 05:47
  • @ Lee Fisher: thank you so much for your answer! Can this approach be used to derive the probability distribution of this time? – konofoso Jul 26 '24 at 04:26
  • You have an answer to question 2. Zack Fisher managed to simplify the result a fair amount. For example , the expected stopping time for a game with even odds looks to be 15 turns. – Lee Fisher Jul 26 '24 at 05:24
  • You also have an answer to question 1. The probability distribution after $n$ steps and starting with $i$ dollars is the $i^{th}$ row of $M^n$. – Lee Fisher Jul 26 '24 at 05:26
  • I also give an expression for $\mathbb{P}(T > n)$, this is basically the same as the cdf of $T$. – Lee Fisher Jul 26 '24 at 05:32
  • @ZackFisher In my answer the above $x_5=\Bbb E[T]$ has a similar expression, but with $1-3pq\color{red}{+} p^2q^2$ in the numerator. This then fits the simulation value for a fair coin. – dan_fulea Jul 26 '24 at 11:47
  • @dan_fulea You're right. That was my typo. Thanks! – Zack Fisher Jul 26 '24 at 12:02

Here is a way to consider the situation using a Markov chain with the states $0,1,2,3,4,5,6,7,8,9,T$. (We use $T$ instead of $10$ for the last terminal state for easier typing.) Also let $q=1-p$ as usual.

A picture of this chain is:

$$ \overset 1\circlearrowleft \boxed 0 \underset {\color{red}{0}}{\overset q{\leftrightarrows}} 1 \underset p{\overset q{\leftrightarrows}} 2 \underset p{\overset q{\leftrightarrows}} 3 \underset p{\overset q{\leftrightarrows}} 4 \underset p{\overset q{\leftrightarrows}} 5 \underset p{\overset q{\leftrightarrows}} 6 \underset p{\overset q{\leftrightarrows}} 7 \underset p{\overset q{\leftrightarrows}} 8 \underset p{\overset q{\leftrightarrows}} 9 \underset p{\overset {\color{red}{0}}{\leftrightarrows}} \boxed T \underset 1\circlearrowleft \ . $$

The transition matrix $A$ and the initial distribution $e_5$ for the given situation are: $$ e_5= \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \ ,\qquad A= \begin{bmatrix} 1\\ q&0&p\\ &q&0&p\\ &&q&0&p\\ &&&q&0&p\\ &&&&q&0&p\\ &&&&&q&0&p\\ &&&&&&q&0&p\\ &&&&&&&q&0&p\\ &&&&&&&&q&0&p\\ &&&&&&&&&&1\\ \end{bmatrix} \ . $$

Let $e_k$ be the row vector whose only nonzero entry is a $1$ in position $k$. Then for instance $e_0A=e_0$, $e_TA=e_T$, so once we are in a terminal state ($0$ or $T$), we are still there at the next step (which corresponds to multiplication by $A$).

Question 1 can now be answered: after $n$ turns the probability distribution is $$ e_5A^n\ . $$

For instance, after $10$ turns, in the case of a fair coin, $p=q=1/2$, we have $$ e_5A^{10} = \left[0\,0\,0\,0\,0\,1\,0\,0\,0\,0\,0\right] \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & 0 & \frac{1}{2} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}^{10} \\ = \begin{bmatrix} \frac{7}{64} & \frac{75}{1024} & 0 & \frac{25}{128} & 0 & \frac{125}{512} & 0 & \frac{25}{128} & 0 & \frac{75}{1024} & \frac{7}{64} \end{bmatrix} \ . $$

Even in this case there is no simple formula, for instance: $$ \begin{aligned} e_5A^{20} &= \begin{bmatrix} \frac{137769}{524288} & \frac{11875}{262144} & 0 & \frac{124375}{1048576} & 0 & \frac{76875}{524288} & 0 & \frac{124375}{1048576} & 0 & \frac{11875}{262144} & \frac{137769}{524288} \end{bmatrix}\ , \\ e_5A^{50} &=\tiny \begin{bmatrix} \frac{251838013820031}{562949953421312} & \frac{353759765625}{35184372088832} & 0 & \frac{29636962890625}{1125899906842624} & 0 & \frac{18316650390625}{562949953421312} & 0 & \frac{29636962890625}{1125899906842624} & 0 & \frac{353759765625}{35184372088832} & \frac{251838013820031}{562949953421312} \end{bmatrix} \ . \end{aligned} $$

For a general formula we need to diagonalize $A$.
In case $p=q=1/2$ the diagonal part has as elements the roots of the characteristic polynomial of $A$, which is: $$ x (x - 1)^{2} \left(x^{2} - \frac{1}{2} x - \frac{1}{4}\right) \left(x^{2} + \frac{1}{2} x - \frac{1}{4}\right) \left(x^{4} - \frac{5}{4} x^{2} + \frac{5}{16}\right) \ . $$


Let us now look at Question 2, which can be attacked in the following simple way. It is the same approach as in the answer of Lee Fisher; I am only trying to solve the associated linear algebra problem as simply as possible.

Denote by $x_k$ the average number of steps to reach a terminal state ($0$ or $10=T$) when starting in $k$. Then of course $x_0=0$, $x_T=0$, and else we have the linear system: $$ \left\{ \begin{aligned} x_1 &= 1\color{gray}{+qx_0}+px_2\ ,\\ x_2 &= 1+qx_1+px_3\ ,\\ x_3 &= 1+qx_2+px_4\ ,\\ x_4 &= 1+qx_3+px_5\ ,\\ x_5 &= 1+qx_4+px_6\ ,\\ x_6 &= 1+qx_5+px_7\ ,\\ x_7 &= 1+qx_6+px_8\ ,\\ x_8 &= 1+qx_7+px_9\ ,\\ x_9 &= 1+qx_8\color{gray}{+px_T}\ . \end{aligned} \right. $$

Let $E_k$ be the equation for $x_k$. It has the following interpretation. Since $k$ is not terminal, there is at least one step till the end. After this one step we land in $k\pm1$ with probabilities $p,q$ (for $+$ and $-$ respectively), and from there we need on average $x_{k\pm1}$ further steps.

To eliminate $x_1$ we do the following. Consider the first two equations, $E_1,E_2$. The variable $x_1$ appears in them and only in them (on different sides). Multiply the first equation, $E_1$, by $q$ and add the result to $E_2$; then $x_1$ is eliminated. (Do the same at the other end: build $E_8$ and $pE_9$, and add them.) In the result $qE_1+E_2$, look at the coefficient of $x_2$, and also at its coefficient in $E_3$. It turns out that we have to use the factors $q$ and $(1-pq)$. In terms of $E_1,E_2,E_3$ we thus use the factors $q^2,q,(1-pq)$. (Do the same at the other end using the factors $p^2,p,(1-pq)$ for $E_9,E_8,E_7$.)

This process continues for two more steps; because of the symmetry we then already obtain $x_5$, with all other variables eliminated. Explicitly:

We multiply the above equations $E_1,E_2,E_3,E_4,E_5,E_6,E_7, E_8,E_9$ respectively by $$ q^4,\ q^3,\ q^2(1-pq),\ q(1-2pq),\ \boxed{\ 1-3pq+p^2q^2\ },\ p(1-2pq),\ p^2(1-pq),\ p^3,\ p^4\ . $$ Then we add. Only $x_5$ survives, and the equation in $x_5$ is: $$ (1-3pq+p^2q^2)x_5 = (p^4+q^4)+(p^3+q^3)+(p^2+q^2)(1-pq)+(p+q)(1-2pq)+(1-3pq+p^2q^2) + 2pq(1-2pq)x_5\ . $$ (It is a symmetric formula in $p,q$.)

A first simplification is $$ (1-5pq+5p^2q^2)x_5 = ((1-2pq)^2-2p^2q^2)+(1-3pq)+(1-2pq)(1-pq)+(1-2pq)+(1-3pq+p^2q^2)\ , $$ and it gives a formula for $x_5$ as a fraction using only terms in $1,pq,p^2q^2$: $$ x_5=\frac {5(1-3pq+p^2q^2)} {1-5pq+5p^2q^2}\ . $$

It is not "homogeneous" in $p,q$, but if we want such a formula, we may replace $1$ by $1^4=(p+q)^4$ and $pq$ by $pq(p+q)^2$ to obtain $$ x_5= 5\cdot\frac {p^4+p^3q+p^2q^2+pq^3+q^4} {p^4-p^3q+p^2q^2-pq^3+q^4} = 5\cdot\frac {(p^5-q^5)/(p-q)} {(p^5+q^5)/(p+q)} \ . $$
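The choice of multipliers can be double-checked mechanically. The following sketch (with the hypothetical helper name `check_elimination`) writes each $E_k$ as $x_k - qx_{k-1} - px_{k+1} = 1$, forms the weighted sum, and reports the resulting coefficient of every $x_j$ together with the right-hand side, using exact arithmetic for a sample value of $p$.

```python
from fractions import Fraction

def check_elimination(p):
    """Return (coefficients of x_1..x_9, right-hand side) of the weighted sum."""
    q = 1 - p
    s = p * q
    # Multipliers for E_1..E_9; entries 0 and 10 are zero padding for the boundary.
    c = [0, q**4, q**3, q**2 * (1 - s), q * (1 - 2 * s), 1 - 3 * s + s**2,
         p * (1 - 2 * s), p**2 * (1 - s), p**3, p**4, 0]
    # x_j appears in E_j (coefficient 1), in E_{j+1} (coefficient -q)
    # and in E_{j-1} (coefficient -p).
    coeff = {j: c[j] - q * c[j + 1] - p * c[j - 1] for j in range(1, 10)}
    rhs = sum(c)
    return coeff, rhs

coeff, rhs = check_elimination(Fraction(1, 3))
print(coeff)
print(rhs / coeff[5])   # value of x_5 for p = 1/3
```

Every coefficient except the one for $x_5$ should vanish, and dividing the right-hand side by the surviving coefficient reproduces the fraction $5(1-3pq+p^2q^2)/(1-5pq+5p^2q^2)$.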


Computer simulation: I am using sage. $10^5$ trials. Lazy implementation. Fair coin.

import random
N = 10^5     # Sage syntax: ^ is exponentiation here
steps = 0    # so far, but we add

for trial in range(N):
    state = 5
    while state not in (0, 10):
        state += random.choice([-1, +1])
        steps += 1

print(f"Average number of steps this time: {(steps/N).n()}")

And the result is:

Average number of steps this time: 24.9387600000000

which is consistent with the formula $$ x_5= 5\cdot\frac {p^4+p^3q+p^2q^2+pq^3+q^4} {p^4-p^3q+p^2q^2-pq^3+q^4} \ . $$ (Inserting $p=q$ leads to $x_5=5\cdot\frac{1+1+1+1+1}{1-1+1-1+1}=25$.)

dan_fulea
  • 37,952