I found a card problem in an old book that essentially boiled down to this question: Suppose we have a random walk $S_n = X_1 + X_2 + \ldots + X_n$ that starts at $0$, where each $X_i$ is $1$ or $-1$ with equal probability $0.5$.

We know that $S_{n} = 0$ (so $n$ is even), and from that point on the walk no longer changes (i.e., it only "walks" for $n$ steps and stops at $0$).

  • Let $Z = \max(S_1,S_2,\ldots, S_n)$. What is $E[Z]$? What is the distribution of $Z$?

Here is what I've managed so far: My first instinct was to use the reflection principle to count the number of walks that hit a height $k$ but do not surpass it. The only way I could manage this was the following: We know that there are $\binom{i}{\frac{i+k}{2}}$ paths between $(0,0)$ and a point $(i,k)$. By the reflection principle (reflect the part of the path after its first visit to $k+1$, so that it ends at $k+2$ instead of $k$), $\binom{i}{\frac{i+k+2}{2}}$ of these paths touch $k+1$ (meaning $k$ would not be the maximum). So altogether there are $\binom{i}{\frac{i+k}{2}} - \binom{i}{\frac{i+k+2}{2}} = \frac{2k+2}{i+k+2} \cdot \binom{i}{\frac{i+k}{2}}$ paths from $(0,0)$ to $(i,k)$ that don't go higher than $k$.
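As a sanity check on the count above, here is a minimal Python sketch that compares the reflection-principle formula with direct enumeration for small $i$. The function names `count_paths_brute` and `count_paths_formula` are made up for this illustration, and the check assumes $k \ge 0$.

```python
from itertools import product
from math import comb


def count_paths_brute(i, k):
    """Count +/-1 paths of length i that end at height k and never exceed k."""
    count = 0
    for steps in product((1, -1), repeat=i):
        height, peak = 0, 0  # the walk starts at height 0
        for s in steps:
            height += s
            peak = max(peak, height)
        if height == k and peak <= k:
            count += 1
    return count


def count_paths_formula(i, k):
    """Reflection-principle count: C(i, (i+k)/2) - C(i, (i+k+2)/2)."""
    if k < 0 or k > i or (i + k) % 2:
        return 0  # (i, k) is unreachable, or k is outside the range checked here
    # math.comb returns 0 when the lower index exceeds the upper one (case k == i).
    return comb(i, (i + k) // 2) - comb(i, (i + k + 2) // 2)
```

Brute force is exponential in $i$, so this is only practical for small lengths (up to roughly $i = 20$).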

Applying the same logic you can find that the number of paths from $(i,k)$ to $(n,0)$ that don't go higher than $k$ is $\frac{2k+2}{n-i+k+2} \cdot \binom{n-i}{\frac{n-i+k}{2}}$.

Now the way to go forward would be to multiply these two quantities to get the total number of paths for a given $i$, sum over all possible $i$, and divide by $\binom{n}{\frac{n}{2}}$ (the total number of paths) to get $P(Z = k)$.

But this seems to be extremely ugly, and I can't figure out any way to work out this sum over the $i$'s. I feel like there's got to be a simpler approach to this problem, and I can't figure out a way to simplify this sum. I also haven't found any MSE posts that answer this question for a "bridge"-like walk like in my case (I'm aware of the infinite case).

Any ideas?

  • I believe this is a well-known result; the probability of reaching at least $|x|=a$ before returning to the origin is just $1/a$. – mjqxxxx Jul 11 '24 at 21:34
  • @mjqxxxx that doesn't seem to make much sense to me; should the probability not definitely be dependent on the length of your walk, N? Again I'm not dealing with the case of an infinite walk here; the walk returns to 0 within N moves – Carlos Rosales Jul 11 '24 at 21:39
  • (And if that's correct, then the expected value of the maximum distance from the origin is infinite.) – mjqxxxx Jul 11 '24 at 21:39
  • Oh, do you want the maximum distance from the origin conditioned on the length of the walk being $N$? Sorry, I didn't read that correctly. Certainly can't be more than $N/2$, in that case :). I thought you were also treating $N$ as a random variable. – mjqxxxx Jul 11 '24 at 21:41
  • I see the confusion. Maybe I shouldn't have capitalized N. I've changed it to n, lowercase, in the post. – Carlos Rosales Jul 11 '24 at 21:46
  • Your approach to figure out a rationale for the answer is certainly what needs to be done. But to get there sometimes it helps to know the answer which for even n is $$\text{Pr}(Z=z)=\left(\binom{n}{\frac{n}{2}-z}-\binom{n}{\frac{n}{2}-z-1}\right)/\binom{n}{\frac{n}{2}}$$ – JimB Jul 11 '24 at 22:32
  • @JimB Wow, I appreciate that. I'll try to get to the answer myself from where I left off then. Just out of curiosity, did you derive this on the spot or did you find the result somewhere? – Carlos Rosales Jul 11 '24 at 22:46
  • I cheated. I generated all possible arrangements of $n/2$ $-1$'s and $n/2$ $+1$'s for various even values of $n$. I then found the maximum accumulated total for each arrangement and determined the frequencies of occurrence. I then looked up the frequencies in oeis.org and found that the frequencies matched several sequences. One of those sequences is https://oeis.org/A039599. It's much more desirable to "think of the data generation process", but this approach can be successful in less time (when it works, that is). – JimB Jul 11 '24 at 23:01
  • Intuitively, I feel Catalan numbers are thickly involved, but I'm not exactly sure how yet. – Brian Tung Jul 11 '24 at 23:08
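
The brute-force check described in the comments above can be sketched in Python roughly as follows. This is not JimB's actual code, only an illustration of the same idea: enumerate every arrangement of $n/2$ up-steps and $n/2$ down-steps, record the maximum of $S_1,\ldots,S_n$, and compare the resulting frequencies with the closed form quoted in the comments. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb


def max_distribution_brute(n):
    """Exact pmf of max(S_1..S_n) over all +/-1 walks of even length n ending at 0."""
    counts = Counter()
    for ups in combinations(range(n), n // 2):  # positions of the +1 steps
        up_set = set(ups)
        height, peak = 0, None
        for t in range(n):
            height += 1 if t in up_set else -1
            peak = height if peak is None else max(peak, height)
        counts[peak] += 1
    total = comb(n, n // 2)
    return {z: Fraction(c, total) for z, c in sorted(counts.items())}


def max_distribution_formula(n):
    """pmf from the comment: (C(n, n/2 - z) - C(n, n/2 - z - 1)) / C(n, n/2)."""
    half = n // 2
    total = comb(n, half)
    pmf = {}
    for z in range(half + 1):
        # The second binomial is taken to be 0 when its lower index is negative.
        lower = comb(n, half - z - 1) if half - z - 1 >= 0 else 0
        pmf[z] = Fraction(comb(n, half - z) - lower, total)
    return pmf
```

For example, `max_distribution_brute(4)` gives probabilities $2/6$, $3/6$, $1/6$ for $z = 0, 1, 2$, which matches the formula. The enumeration visits $\binom{n}{n/2}$ walks, so it is only feasible for small $n$.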

1 Answer

I will use the notation $M_{n}=\max(S_1,\dots,S_{n})$, and I assume throughout that $n$ is even. I can prove that $$ E[M_n]\sim \sqrt{\pi n/8}\qquad \text{as }n\to\infty. $$ First, write $$ E[M_n]=\sum_{m=1}^{n/2}P(M_n\ge m). $$ We can compute $P(M_n\ge m)$ using the reflection principle. Random walks which start at zero, eventually reach $+m$, and then end at zero, are in bijection with random walks starting at zero and ending at $+2m$: reflect the part of the path after its first visit to $+m$. To count the number of paths where $M_n\ge m$, we apply this reflection bijection, and note the reflected paths must have $n/2+m$ up-steps and $n/2-m$ down-steps, so $$ P(M_n\ge m)=\binom{n}{n/2+m}\Big/\binom{n}{n/2} . $$ Therefore, we just need to compute $\sum_{m=1}^{n/2}\binom{n}{n/2+m}\big/\binom{n}{n/2} $. For this, the following asymptotic lemma is useful.
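
For concreteness, the tail formula and the resulting sum can be evaluated exactly for any fixed even $n$. The following is a small Python sketch using exact rational arithmetic; `tail_prob` and `expected_max` are names chosen for this illustration.

```python
from fractions import Fraction
from math import comb


def tail_prob(n, m):
    """P(M_n >= m) = C(n, n/2 + m) / C(n, n/2) for even n and 1 <= m <= n/2."""
    return Fraction(comb(n, n // 2 + m), comb(n, n // 2))


def expected_max(n):
    """E[M_n] as the sum of the tail probabilities over m = 1..n/2."""
    return sum(tail_prob(n, m) for m in range(1, n // 2 + 1))
```

For instance, `expected_max(4)` is $4/6 + 1/6 = 5/6$. Since the binomial coefficients above the middle one sum to $\left(2^n - \binom{n}{n/2}\right)/2$, the same quantity can also be written as $2^{n-1}\big/\binom{n}{n/2} - \tfrac12$.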

Lemma: If $n,m\to\infty$ in such a way that $m\le n^{2/3}$, then $$\binom{n}{n/2+m}\Big/\binom{n}{n/2} = e^{-2m^2/n}\cdot \left(1+O\left(n^{-1/3}\right)\right).$$ If instead $m,n\to\infty$ with $m\ge n^{2/3}$, then $$\binom{n}{n/2+m}\Big/\binom{n}{n/2}=O(\exp(-n^{1/3})).$$

Before I prove the lemma, let me show why it is useful. To compute $\sum_{m=1}^{n/2}\binom{n}{n/2+m}\big/\binom{n}{n/2}$, we split the sum into two regimes, one where $m\le n^{2/3}$, and one where $m>n^{2/3}$. The first regime becomes a Riemann sum: $$ \begin{align} \sum_{m=1}^{n^{2/3}} \binom{n}{n/2+m}\big/\binom{n}{n/2} &\sim \sum_{m=1}^{n^{2/3}} e^{-2m^2/n} \\&=\sqrt{n}\sum_{m=1}^{n^{2/3}} \frac1{\sqrt n}e^{-2(m/\sqrt{n})^2} \\&\sim \sqrt n\int_0^\infty dx \,e^{-2x^2} \\&=\sqrt{\frac{\pi n}{8}}. \end{align} $$ The second sum can then be shown to be negligible compared to the first: it has at most $n/2$ terms, each of size $O(\exp(-n^{1/3}))$, so it tends to zero.
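
To see the asymptotics numerically, here is a rough Python sketch (not part of the proof) that compares both the exact sum and the truncated Gaussian sum $\sum_{m\le n^{2/3}} e^{-2m^2/n}$ with $\sqrt{\pi n/8}$. The helper names are made up for this illustration.

```python
from math import comb, exp, pi, sqrt


def exact_expectation(n):
    """Sum of C(n, n/2 + m) / C(n, n/2) over m = 1..n/2, for even n."""
    central = comb(n, n // 2)
    upper = sum(comb(n, n // 2 + m) for m in range(1, n // 2 + 1))
    return upper / central  # big-integer true division, result is a float


def gaussian_sum(n):
    """Sum of exp(-2 m^2 / n) over the first regime, 1 <= m <= n^(2/3)."""
    cutoff = int(n ** (2 / 3))
    return sum(exp(-2 * m * m / n) for m in range(1, cutoff + 1))


def ratios(n):
    """Return (exact / target, gaussian / target) with target = sqrt(pi n / 8)."""
    target = sqrt(pi * n / 8)
    return exact_expectation(n) / target, gaussian_sum(n) / target
```

Calling `ratios(n)` for $n = 100, 1000, 10000$ should show both ratios creeping up toward $1$ from below. The convergence is slow because the difference between the sums and $\sqrt{\pi n/8}$ stays close to $1/2$ rather than shrinking.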

Proof of Lemma: Write $$\binom{n}{n/2+m}\Big/\binom{n}{n/2}=\left(\frac{(n/2)!}{(n/2-m)!(n/2)^m}\right)\Big/ \left(\frac{(n/2+m)!}{(n/2)!(n/2)^m}\right).\tag1$$ In an earlier answer of mine, I showed that $$ \frac{n!}{(n-k)!n^k}= \begin{cases} e^{-k^2/2n}(1+O(n^{-1/3})) & \text{if }k\le n^{2/3} \\ O\big(\exp(-n^{1/3}/2)\big) & \text{if }k\ge n^{2/3} \end{cases}\tag2 $$ Using the exact same method, you can additionally prove $$ \frac{(n+k)!}{n!n^k}= e^{k^2/2n}(1+O(n^{-1/3})) \qquad \text{if }k\le n^{2/3},\tag3 $$ and this quantity is at least $1$ for every $k\ge 0$, since it equals $\prod_{j=1}^{k}(1+j/n)$. Apply $(2)$ to the numerator of $(1)$ and $(3)$ to the denominator of $(1)$, in both cases with $n/2$ in place of $n$ and $k=m$. In the first regime each factor contributes $e^{-m^2/n}$, giving $e^{-2m^2/n}$; in the second regime the numerator is exponentially small in $n^{1/3}$ while the denominator is at least $1$. The lemma is proved. $\tag*{$\blacksquare$}$

Mike Earnest
  • $\sum_{m=1}^{n/2}\binom{n}{n/2+m}\big/\binom{n}{n/2}$ can be computed exactly as $-\frac{n}{2 (n+2)}-\frac{1}{n+2}+\frac{\sqrt{\pi } \Gamma \left(\frac{n}{2}+1\right)}{2 \Gamma \left(\frac{n}{2}+\frac{1}{2}\right)}$. – JimB Jul 12 '24 at 04:28
  • Wow, great solution. I appreciate how you split the sum into two regimes. – Carlos Rosales Jul 12 '24 at 04:36
  • I've gotten a simplified form for the pmf: $P(\text{Max} = z) = \frac{4z + 2}{n + 2z + 2} \cdot \frac{(\frac{n}{2})!^2}{(\frac{n}{2} + z)! \cdot (\frac{n}{2} - z)!}$. It seems that this formulation draws in some way from the Catalan numbers, and there may be a nice identity to precisely calculate the expectation. – Carlos Rosales Jul 12 '24 at 04:44
  • Happy to help. I'm not sure about the Catalan connection. By the way, can you tell me about the card trick this is related to? I love magic, especially mathematical magic. – Mike Earnest Jul 12 '24 at 15:05
  • For sure; actually it's more of a game than a trick. Here's the problem statement: You have 52 playing cards (26 red, 26 black). You draw cards one by one. A red card pays you a dollar. A black one fines you a dollar. You can stop any time you want. Cards are not returned to the deck after being drawn. What is the optimal stopping rule in terms of maximizing expected payoff? Also, what is the expected payoff following this optimal rule? I was confused by this question and realized I could model it using a simple random walk. Then I became more interested in answering the walk question. – Carlos Rosales Jul 12 '24 at 15:41