
Using martingales, I am trying to prove that the ratio of balls in Pólya's urn (https://en.wikipedia.org/wiki/P%C3%B3lya_urn_model) approaches a Beta distribution as the number of turns goes to infinity. I found a previous question that shows this (Polya's urn model - limit distribution), but I want to try and repeat this analysis by myself.

In Polya's Urn, we start with some number of Red Balls and some number of Blue Balls. At each turn, we randomly draw a ball (each ball has equal probability of being selected) and then put it back - along with another ball of the same color.
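The draw-and-replace dynamics above can be sketched as a minimal Python simulation (the function name `polya_urn` and the parameter choices are my own, for illustration):

```python
import random

def polya_urn(m, n, turns, rng):
    """Simulate Polya's urn starting with m red and n blue balls.
    Each draw is uniform over the current balls; the drawn ball is
    returned together with one extra ball of the same colour.
    Returns the number of red balls after `turns` draws."""
    red, blue = m, n
    for _ in range(turns):
        # each ball is equally likely, so P(red) = red / (red + blue)
        if rng.random() < red / (red + blue):
            red += 1
        else:
            blue += 1
    return red

rng = random.Random(0)
print(polya_urn(2, 3, 100, rng))  # number of red balls after 100 draws
```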

Suppose:

  • At $t=0$, there are $m$ red balls and $n$ blue balls
  • At turn $t$, $k$ red balls have been chosen

Part 1: In my previous question (Probability of Drawing Balls from a Hat with Changing Number of Balls), I showed that the probability of having $m+k$ red balls at turn $t$ (writing $X_t$ for the number of red balls after $t$ turns) is:

$$ P(X_t = m+k) = \binom{t}{k} \cdot \prod_{i=0}^{k-1} \frac{m+i}{m+n+i} \cdot \prod_{j=0}^{t-k-1} \frac{n+j}{m+n+k+j} $$
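As a sanity check, this formula can be evaluated in exact rational arithmetic (writing $X_t$ for the number of red balls after $t$ turns); the probabilities over $k=0,\dots,t$ should sum to 1. A sketch, with function names of my own choosing:

```python
from fractions import Fraction
from math import comb

def p_red_count(m, n, t, k):
    """P(X_t = m + k): exactly k of the first t draws are red,
    following the binomial-times-products formula above."""
    p = Fraction(comb(t, k))
    for i in range(k):                   # the k red draws
        p *= Fraction(m + i, m + n + i)
    for j in range(t - k):               # the t - k blue draws
        p *= Fraction(n + j, m + n + k + j)
    return p

m, n, t = 2, 3, 10
total = sum(p_red_count(m, n, t, k) for k in range(t + 1))
print(total)  # prints 1
```

For $m=n=1$ the distribution of $k$ is uniform on $\{0,\dots,t\}$, which gives another quick check.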

Part 2: I can also define the ratio of red balls at the $t^{th}$ turn:

$$ R_t = \frac{m+k}{m+n+t} $$

Using the same logic, I can also write an expression for the expected value of the ratio at the $(t+1)^{th}$ turn:

For the $t+1$ turn, we have two possibilities:

  • Choose a red ball: probability $\frac{m+k}{m+n+t}$, resulting ratio $\frac{m+k+1}{m+n+t+1}$
  • Choose a blue ball: probability $\frac{n+t-k}{m+n+t}$, resulting ratio $\frac{m+k}{m+n+t+1}$

The expected ratio at $t+1$ given $k$ red balls were chosen up to turn $t$ is:

$$ E(R_{t+1} | k) = \frac{m+k}{m+n+t} \cdot \frac{m+k+1}{m+n+t+1} + \frac{n+t-k}{m+n+t} \cdot \frac{m+k}{m+n+t+1} $$

The above equation is for the case where exactly $k$ red balls have been chosen; we need the expectation over all possible values of $k$. Using the law of total expectation, we can write:

$$ E(R_{t+1}) = \sum_{k=0}^t E(R_{t+1} \mid k) \cdot P(X_t = m+k) $$

Substituting:

$$ \begin{align*} E(R_{t+1}) &= \sum_{k=0}^t \left[ \frac{m+k}{m+n+t} \cdot \frac{m+k+1}{m+n+t+1} + \frac{n+t-k}{m+n+t} \cdot \frac{m+k}{m+n+t+1} \right] \\ &\quad \cdot \binom{t}{k} \cdot \prod_{i=0}^{k-1} \frac{m+i}{m+n+i} \cdot \prod_{j=0}^{t-k-1} \frac{n+j}{m+n+k+j} \end{align*} $$
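Evaluating this double sum exactly for a few small cases suggests the expectation stays at its starting value $\frac{m}{m+n}$, consistent with the martingale property shown in Part 3. A sketch in exact rational arithmetic (function names are my own):

```python
from fractions import Fraction
from math import comb

def p_red_count(m, n, t, k):
    """P(exactly k red draws among the first t), from Part 1's formula."""
    p = Fraction(comb(t, k))
    for i in range(k):
        p *= Fraction(m + i, m + n + i)
    for j in range(t - k):
        p *= Fraction(n + j, m + n + k + j)
    return p

def expected_ratio(m, n, t):
    """E(R_{t+1}) via the double sum above: the two-branch conditional
    expectation weighted by P(exactly k red draws)."""
    total = Fraction(0)
    for k in range(t + 1):
        cond = (Fraction(m + k, m + n + t) * Fraction(m + k + 1, m + n + t + 1)
                + Fraction(n + t - k, m + n + t) * Fraction(m + k, m + n + t + 1))
        total += cond * p_red_count(m, n, t, k)
    return total

print(expected_ratio(2, 3, 7))  # prints 2/5, i.e. m/(m+n)
```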

Simplification:

$$ \begin{align*} E(R_{t+1}) &= \sum_{k=0}^t \left[ \frac{(m+k)(m+n+t+1)}{(m+n+t)(m+n+t+1)} \right] \cdot \binom{t}{k} \cdot \prod_{i=0}^{k-1} \frac{m+i}{m+n+i} \cdot \prod_{j=0}^{t-k-1} \frac{n+j}{m+n+k+j} \\ &= \sum_{k=0}^t \frac{m+k}{m+n+t} \cdot \binom{t}{k} \cdot \prod_{i=0}^{k-1} \frac{m+i}{m+n+i} \cdot \prod_{j=0}^{t-k-1} \frac{n+j}{m+n+k+j} \end{align*} $$

Part 3: I can also show that the ratio $R_t$ is a martingale. For $R_t$ to be a martingale, the following would have to be true:

$$ E[R_{t+1} | R_t] = \frac{m+k}{m+n+t} = R_t $$

To calculate $E[R_{t+1} \mid R_t]$, we need to identify the possibilities at turn $t+1$:

  • Choose a red ball: probability $\frac{m+k}{m+n+t}$
  • Choose a blue ball: probability $\frac{n+t-k}{m+n+t}$

$$ \begin{align*} E[R_{t+1} | R_t] &= \frac{m+k}{m+n+t} \cdot \frac{m+k+1}{m+n+t+1} + \frac{n+t-k}{m+n+t} \cdot \frac{m+k}{m+n+t+1} \\[10pt] &= \frac{(m+k)(m+k+1)}{(m+n+t)(m+n+t+1)} + \frac{(n+t-k)(m+k)}{(m+n+t)(m+n+t+1)} \\[10pt] &= \frac{(m+k)[(m+k+1) + (n+t-k)]}{(m+n+t)(m+n+t+1)} \\[10pt] &= \frac{(m+k)(m+n+t+1)}{(m+n+t)(m+n+t+1)} \\[10pt] &= \frac{m+k}{m+n+t} \end{align*} $$
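The algebra above can be spot-checked in exact arithmetic for arbitrary values of $m, n, t, k$ (the helper name `cond_expect` is my own):

```python
from fractions import Fraction

def cond_expect(m, n, t, k):
    """The two-branch conditional expectation E[R_{t+1} | R_t] from above."""
    return (Fraction(m + k, m + n + t) * Fraction(m + k + 1, m + n + t + 1)
            + Fraction(n + t - k, m + n + t) * Fraction(m + k, m + n + t + 1))

# each case should collapse to (m+k)/(m+n+t), i.e. R_t itself
checks = [(2, 3, 10, 4), (1, 1, 7, 0), (5, 2, 20, 13)]
assert all(cond_expect(m, n, t, k) == Fraction(m + k, m + n + t)
           for m, n, t, k in checks)
```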

And, earlier, we saw that:

$$ R_t = \frac{m+k}{m+n+t} $$

Thus we can see that:

$$ E[R_{t+1} | R_t] = \frac{m+k}{m+n+t} = R_t $$

This shows that $R_t$ is a martingale (it is also bounded in $[0,1]$, so integrability holds automatically).

Here is where I am stuck - can someone please show me how to continue the proof using Martingales?

Alex Ravsky
konofoso
  • I don't think Martingales are a very natural tool for proving the limit distribution. I tend to think of Martingales as more useful in terms of going from $X_n$ for high $n$ down to lower $n$ which would be the opposite. – Einar Rødland Aug 05 '24 at 16:56
  • You have proven that $(R_t)_t$ is a martingale; it is also a non-negative random variable, thus you can apply Doob's martingale convergence theorem to show that $R_t$ converges almost surely towards some random variable $R_\infty$.

    Now to deduce what is the law of $R_\infty$, there is one main method, that is to compute its characteristic function (or its moment generating function, or its raw moments, they all come up to be the same). This is what is done in the previous post.

    – Damian Cid Aug 28 '24 at 08:01
  • @ Einar: thank you for your reply! – konofoso Aug 28 '24 at 21:39
  • @DamianCid : thank you for your reply! can you please show an answer if you have time? – konofoso Aug 28 '24 at 21:40

1 Answer


With $X_t$ being the number of red balls after the $t$th draw (and replacement), as derived in the OP, we can write $$\begin{align} P(X_t=m+k)&=\binom{t}{k} \frac{m^{\bar{k}} n^{\overline{t-k}}}{(m+n)^{\bar{t}}}\\ &=\frac{t!}{k!(t-k)!} \cdot \frac{(m+k-1)!}{(m-1)!} \cdot\frac{(n+t-k-1)!}{(n-1)!} \cdot\frac{(m+n-1)!}{(m+n+t-1)!}\\ &=\frac{(m+n-1)!}{(m-1)!(n-1)!} \cdot \frac{t!}{(t+m+n-1)!}\cdot \frac{(k+m-1)!}{k!} \cdot \frac{(t-k+n-1)!}{(t-k)!}. \end{align}$$
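The rising-factorial form and the rearranged factorial form can be cross-checked against the product formula from the question in exact rational arithmetic (function names are my own):

```python
from fractions import Fraction
from math import comb, factorial

def p_product_form(m, n, t, k):
    """P(X_t = m+k) as the binomial-times-products formula from the OP."""
    p = Fraction(comb(t, k))
    for i in range(k):
        p *= Fraction(m + i, m + n + i)
    for j in range(t - k):
        p *= Fraction(n + j, m + n + k + j)
    return p

def p_factorial_form(m, n, t, k):
    """P(X_t = m+k) in the rearranged factorial form above."""
    return (Fraction(factorial(m + n - 1), factorial(m - 1) * factorial(n - 1))
            * Fraction(factorial(t), factorial(t + m + n - 1))
            * Fraction(factorial(k + m - 1), factorial(k))
            * Fraction(factorial(t - k + n - 1), factorial(t - k)))

m, n, t = 3, 2, 15
assert all(p_product_form(m, n, t, k) == p_factorial_form(m, n, t, k)
           for k in range(t + 1))
```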

With $R_t=\frac{X_t}{m+n+t}$, we have $$\begin{align} P(X_t=m+k)&=P(m+k-\frac12< X_t \leq m+k+\frac12)\\ &=P\left(r-\frac{1}{2(m+n+t)} < R_t \leq r+\frac{1}{2(m+n+t)} \right) \end{align}$$ where $r= \frac{m+k}{m+n+t}$.

Plugging in $k=rt+rn+rm-m$ and $t-k=(1-r)t-rn-rm+m$ (with factorials of non-integer arguments interpreted via the Gamma function, $z! = \Gamma(z+1)$), we have $$\begin{align} {}&P\left(r-\frac{1}{2(m+n+t)} < R_t \leq r+\frac{1}{2(m+n+t)} \right)\\ =&\frac{(m+n-1)!}{(m-1)!(n-1)!} \cdot \frac{t!}{(t+m+n-1)!}\cdot \frac{(rt+rn+rm-1)!}{(rt+rn+rm-m)!} \cdot \\ {}& \quad \frac{[(1-r)t-rn-rm+m+n-1]!}{[(1-r)t-rn-rm+m]!} \end{align}$$

But for large $z$, $$\frac{(z+a)!}{(z+b)!}\asymp\frac{\Gamma(z+a)}{\Gamma(z+b)} \asymp z^{a-b}, $$ which can be verified through Stirling's approximation. Thus,

$$\begin{align} {}&P\left(r-\frac{1}{2(m+n+t)} < R_t \leq r+\frac{1}{2(m+n+t)} \right)\\ \asymp &\frac{(m+n-1)!}{(m-1)!(n-1)!} \cdot t^{1-m-n} \cdot (rt)^{m-1} \cdot [(1-r)t]^{n-1} \\ =&\frac{1}{B(m,n)} t^{-1} r^{m-1} (1-r)^{n-1}. \end{align}$$ The interval above has width $\frac{1}{m+n+t} = O(t^{-1})$, so dividing both sides by it cancels the $t^{-1}$ factor; letting $t\longrightarrow\infty$, the limiting density of $R_\infty$ at $r\in(0,1)$ is $$f_{R_\infty}(r)= \frac{1}{B(m,n)} r^{m-1} (1-r)^{n-1}, $$ which is the $\text{Beta}(m,n)$ density.
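As an empirical check, simulating many runs of the urn for large $t$ gives a sample mean and variance close to those of $\text{Beta}(m,n)$, namely $\frac{m}{m+n}$ and $\frac{mn}{(m+n)^2(m+n+1)}$. A sketch (the simulation parameters and function name are my own choices):

```python
import random

def polya_ratio(m, n, t, rng):
    """Ratio of red balls after t draws of Polya's urn."""
    red, blue = m, n
    for _ in range(t):
        if rng.random() < red / (red + blue):
            red += 1
        else:
            blue += 1
    return red / (red + blue)

rng = random.Random(42)
m, n, t, runs = 2, 3, 500, 4000
samples = [polya_ratio(m, n, t, rng) for _ in range(runs)]
mean = sum(samples) / runs
var = sum((x - mean) ** 2 for x in samples) / runs
print(mean, var)  # Beta(2,3) has mean 0.4 and variance 0.04
```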

Zack Fisher
  • @ZackFisher: Thank you so much. I have accepted your answer as the official answer. You post really high quality, comprehensive and easy to follow answers (for a layperson like myself). If you have time, could you please also write an answer for this question: https://math.stackexchange.com/questions/4949881/probability-of-seeing-x-of-balls-in-y-turns ? Much appreciated. – konofoso Aug 29 '24 at 02:01
  • @konofoso Thanks for the comment. I'm a layperson as well, especially in terms of random processes. My replies are mostly calculations not involving the core ideas from random processes. I'm amazed by your questions and am learning from others' comments on and answers to your questions. – Zack Fisher Aug 29 '24 at 16:29
  • Thank you for your kind words. Can you please explain, what do you mean "amazed by my questions"? Are they really that amazing lol? – konofoso Sep 01 '24 at 03:20
  • BTW - I was thinking about this question for the whole week and finally got around to posting it lol! https://math.stackexchange.com/questions/4965592/can-the-same-stochastic-process-have-2-different-likelihoods – konofoso Sep 01 '24 at 03:20
  • @ Zack: any ideas on this one? – konofoso Sep 06 '24 at 21:06