16

Background

Consider the set of integers $\{1,\dots,n+1\}$ and a set of probabilities $p_1,\dots, p_n \in(0,1)$. We now define a random walk/Markov chain on these states via the following transition matrix $$ M = \begin{pmatrix} 1-p_1 & p_1 & 0 & \dots & & 0 \\ 1-p_2 & 0 & p_2 & 0 & & \vdots\\ 0 & 1-p_3 & 0 & p_3 & & \\ \vdots & & \ddots & \ddots & \ddots & \\ 0 & & & 1-p_n & 0 & p_n \\ 0 & \dots & & & 0 & 1 \end{pmatrix}. $$ This represents a process where the state $k$ is increased by $1$ with probability $p_k$ and decreased by $1$ with probability $1-p_k$. The boundary conditions are that the state $1$ stays the same with probability $1-p_1$ and the state $n+1$ is an absorbing state. I am interested in the expected stopping time given that the process starts at the state $1$: $E(\tau|X_0=1)$, where $\tau = \min~\{k:X_k=n+1\}$. I am not interested in explicitly computing this value, but it may be helpful.
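
For reference, this expectation can be computed numerically by first-step analysis: letting $Q$ be the transition matrix restricted to the transient states $1,\dots,n$, the vector of expected stopping times solves $(I-Q)t=\mathbf{1}$. Here is a minimal numpy sketch (the function name is just a placeholder):

import numpy as np

def expected_absorption_time(p):
    # p = [p_1, ..., p_n]; states 1..n are transient, state n+1 is absorbing
    n = len(p)
    Q = np.zeros((n, n))
    Q[0, 0] = 1 - p[0]                # state 1 stays put with probability 1-p_1
    for k in range(n):
        if k > 0:
            Q[k, k - 1] = 1 - p[k]    # step down
        if k < n - 1:
            Q[k, k + 1] = p[k]        # step up (from state n the walk exits to n+1)
    t = np.linalg.solve(np.eye(n) - Q, np.ones(n))
    return t[0]                       # E(tau | X_0 = 1)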

Problem

Now consider the case where the $n$ forward transition probabilities are permuted. That is, given $\sigma \in S_n$, the transition matrix is now $$ M(\sigma) = \begin{pmatrix} 1-p_{\sigma(1)} & p_{\sigma(1)} & 0 & \dots & & 0 \\ 1-p_{\sigma(2)} & 0 & p_{\sigma(2)} & 0 & & \vdots\\ 0 & 1-p_{\sigma(3)} & 0 & p_{\sigma(3)} & & \\ \vdots & & \ddots & \ddots & \ddots & \\ 0 & & & 1-p_{\sigma(n)} & 0 & p_{\sigma(n)} \\ 0 & \dots & & & 0 & 1 \end{pmatrix}. $$ For a given permutation, denote the expected stopping time as $T(\sigma)$. My question is, is there a good way to find a permutation that minimizes $T$? That is, find $\sigma^\star = \mathrm{argmin}_{\sigma\in S_n} T(\sigma)$. Dynamic programming and combinatorial optimization routines are likely viable on this problem for small enough $n$, but I would prefer a proof if possible.
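
For small $n$, that sort of brute-force search can be layered on the sketch from the Background section (again purely illustrative; the helper names are hypothetical):

from itertools import permutations

def T(p, sigma):
    # expected stopping time after permuting the forward probabilities by sigma
    # (sigma is a 0-based tuple of indices); reuses expected_absorption_time above
    return expected_absorption_time([p[i] for i in sigma])

def brute_force_minimizer(p):
    # only feasible for small n, since it enumerates all n! permutations
    return min(permutations(range(len(p))), key=lambda s: T(p, s))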

Some Thoughts

Most of my knowledge on computing expected stopping times of random walks deals with IID state transitions, so it is of little use here. It wouldn't surprise me if some sort of greedy arrangement were optimal, but I am struggling to reason with this process without resorting to counting possible walks on the state space, which gets into some messy combinatorics pretty quickly. I have been able to show a crude lower bound of $$ n + \frac{1-p_{\sigma(1)}}{p_{\sigma(1)}^2}\prod_{k=1}^n p_k < T(\sigma), $$ which suggests that maximizing $p_{\sigma(1)}$ may be important, but this is far from sharp.

I would hope that there is something from the theory of stochastic processes regarding the expectation that can simplify this analysis. Thanks for reading and I look forward to your responses.

Edit

When investigating possible paths and computing $P(\tau=n+m)$ for small $m$, I have come across a few helpful observations.

  • In terms of sheer number of possible paths, the state $1$ is the most common state. This is not unexpected, since every path starts there and it is the only state that can be attained consecutively. Maximizing $p_{\sigma(1)}$ would likely minimize the time spent stuck at this state in long paths, lowering the overall expectation.

  • Given a state $2 \leq k \leq n$, the probability of decreasing then returning within some fixed number of steps is proportional to $p_{\sigma(k-1)}(1-p_{\sigma(k)})$. When just considering these two transition probabilities, this quantity is maximized when $p_{\sigma(k-1)} > p_{\sigma(k)}$. Extending this logic to every interior state may indicate that a nonincreasing arrangement is optimal.

Final Edit:

As the answers indicate, it is unlikely that any general theoretical treatment of this problem will result in an optimal solution, but the reduction to a mixed-integer linear programming problem is a considerable development that renders this problem far more tractable than the combinatorial framing initially indicated. I have awarded the bounty to Amir's answer, which contains the details of the MILP formulation and some example solutions showing nonintuitive results. Thanks for all of the discussion and feedback on this problem.

whpowell96
  • 7,849
  • How large can $n$ be in your application? – Amir Jul 21 '24 at 08:40
  • In my particular application it was $\approx20$. I suspect for small enough $n$ there are numerical and heuristic techniques that could provide an answer but I think a proof would be more illuminating – whpowell96 Jul 21 '24 at 15:54
  • Experimentation with randomly generated $p$ and a mixed integer linear programming formulation reveals that an optimal permutation need not maximize $p_{\sigma(1)}$ nor be nonincreasing. Do you have an example $p$ you are interested in? – RobPratt Jul 22 '24 at 21:55
  • 3
    Related: https://arxiv.org/pdf/2007.13232 – RobPratt Jul 24 '24 at 16:02
  • Wow great find Rob! I will have to read this more closely to determine how much of this can be translated to this problem. Their random walk had slightly different boundary conditions at $X=1$, but they find that the optimal arrangement (for maximizing $T$) is unique and depends only on the relative ordering of $p_k$, not the values themselves. – whpowell96 Jul 24 '24 at 18:49

4 Answers

4

I have a way of reformulating the problem that may be helpful:

Let $t_r$ be the expected stopping time starting from position $r$. Then we have, for $2 \le r \le n$: $$t_r=p_r t_{r+1}+(1-p_r)t_{r-1}+1$$ which rearranges to $$-1=p_r (t_{r+1}-t_r)+(1-p_r)(t_{r-1}-t_r)$$ so letting $d_r = t_r-t_{r+1}$ we have $$d_r=\frac{1}{p_r}\left(1+(1-p_r)d_{r-1}\right).$$ We know that $t_{n+1}=0$, so we can recover $t_1$ as $\sum_{i=1}^n d_i$. Now letting $\alpha_r=\frac{1-p_r}{p_r}$ and $e_r=d_r+1$, we get $$e_r=\alpha_r e_{r-1}+2.$$ Similarly, $$t_1=1+p_1 t_2 + (1-p_1)t_1$$ rearranges to give $d_1=\frac{1}{p_1}=\alpha_1+1$, so $e_1=\alpha_1+2$. We can neaten our initial conditions by pretending we started at $e_0=1$.

We can write the inhomogeneous solution starting at $e_0=2$ as a cumbersome sum: $$e_r^*=2( \alpha_1 \alpha_2 \dots \alpha_r + \dots + \alpha_r + 1)$$

And the solution to the homogeneous equation $e_r=\alpha_r e_{r-1}$ is obviously $$e_r= \prod_{i=1}^r \alpha_i.$$ We have the initial condition $e_0=1$, so we need $e=e^*-e^{homo}$, giving $$\sum_{r=1}^n e_r = \sum_{r=1}^n e_r^* - \sum_{r=1}^n e_r^{homo},$$ and $\sum_{r=1}^n e_r^*$ is the sum of $2\prod \alpha_i$ over all runs of consecutive $\alpha_i$ (plus the constant $2n$ contributed by the $+1$ terms, which does not depend on the permutation), so we can rewrite it as $$\sum_{r=1}^n e_r^*=2n+2\sum_{i=1}^n \sum_{j=i}^n \prod_{k=i}^j \alpha_k.$$ And $\sum_{r=1}^n e_r^{homo}$ is the sum of $\prod \alpha_i$ over all runs of consecutive $\alpha_i$ which contain $\alpha_1$.

This means our overall sum is, up to the permutation-invariant constant $2n$, the sum of all products of runs of consecutive $\alpha_i$, where a run counts double if it doesn't include $\alpha_1$.

Now, minimising $\sum d_r$ is the same as minimising $\sum e_r$ and permuting $p_r$ is the same as permuting $\alpha_r$, so our problem is equivalent to finding which arrangement of $\alpha_r$ minimises this sum.

If it took twice as long to move from $1$ to $2$ (and not vice versa) the complication around runs including $\alpha_1$ would disappear. Intuitively, it seems like we'd want to intersperse large and small $\alpha_i$ to prevent consecutive runs of large $\alpha_i$, which could have a very large product.
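
As a quick numerical sanity check of this reformulation, here is a rough sketch comparing the weighted run-product sum (with the permutation-invariant constant handled explicitly) against $t_1$ computed from the $d_r$ recurrence:

def t1_from_d_recurrence(p):
    # t_1 = sum_r d_r with d_1 = 1/p_1 and d_r = (1 + (1 - p_r) d_{r-1}) / p_r
    d = 1 / p[0]
    total = d
    for r in range(1, len(p)):
        d = (1 + (1 - p[r]) * d) / p[r]
        total += d
    return total

def t1_from_run_products(p):
    # sum_r e_r = 2n + sum over runs alpha_i ... alpha_j of the product,
    # counted twice when the run does not contain alpha_1; then t_1 = sum_r e_r - n
    alpha = [(1 - q) / q for q in p]
    n = len(p)
    total = 2.0 * n
    for i in range(n):
        prod = 1.0
        for j in range(i, n):
            prod *= alpha[j]
            total += prod if i == 0 else 2 * prod
    return total - n

# e.g. for p = [0.5, 0.5] both give (1 + 0.5) / (0.5 * 0.5) = 6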

Zoe Allen
  • 7,939
  • Very nice! Some of my further attempts came up with the same recurrence relation for $t_r$, which seems fruitful. The final sum/product seems much more amenable to analysis. – whpowell96 Jul 20 '24 at 19:20
3

Using the recurrence in Zoe Allen's answer, we can set up a linear equation $At=b$, where $b$ is the column vector with all $1$s, and $A$ is the tridiagonal $n$ by $n$ matrix given by $A_{1,1}=p_1$, $A_{k,k}=1$ for $2\le k\le n$, $A_{k,k+1}=-p_k$ for $1\le k\le n-1$, and $A_{k,k-1}=-1+p_k$ for $2\le k\le n$. We are then interested in minimizing $t_1$.

By Cramer's rule, this is given by $\frac{\det(M)}{\det(A)}$, where $M$ is $A$ with the first column replaced by $b$ (which is all $1$s). $\det(A)$ comes out to simply $p_1p_2\cdots p_n$, and since this ends up being constant for all permutations, we can just ignore it. $M$ is where the asymmetry comes in. If we denote $D(p_1,\ldots,p_n)$ as $\det(M)$, we can expand the determinant along the last row. Specifically, we have $D(p_1,\ldots,p_n)=\prod_{1\le k\le n-1}p_k+D(p_1,\ldots,p_{n-1})+p_{n-1}(-1+p_n)D(p_1,\ldots,p_{n-2})$, with $D(p_1,p_2)=1+p_1$ and $D(p_1)=1$. It looks like the following holds: $$D(p_1,\ldots,p_n)=\sum_{S\subseteq \{p_1,\ldots,p_n\}}b_S\prod_{p\in S}p$$ with $b_S\in\{-1,0,1\}$, but I wasn't able to do much more with this.
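
As a sanity check of this expansion, here is a rough numpy sketch comparing the recurrence for $D$ with the determinant computed directly:

import numpy as np

def D_recurrence(p):
    # D(p_1) = 1, D(p_1, p_2) = 1 + p_1, then the last-row expansion above
    n = len(p)
    if n == 1:
        return 1.0
    prev2, prev1 = 1.0, 1.0 + p[0]
    prod = p[0]                      # running product p_1 * ... * p_{k-1}
    for k in range(3, n + 1):
        prod *= p[k - 2]
        prev2, prev1 = prev1, prod + prev1 + p[k - 2] * (p[k - 1] - 1) * prev2
    return prev1

def D_determinant(p):
    # build A as above, replace its first column with ones, and take the determinant
    n = len(p)
    A = np.zeros((n, n))
    A[0, 0] = p[0]
    for k in range(1, n):
        A[k, k] = 1.0
        A[k, k - 1] = p[k] - 1.0
    for k in range(n - 1):
        A[k, k + 1] = -p[k]
    M = A.copy()
    M[:, 0] = 1.0
    return np.linalg.det(M)

# t_1 = D(p) / (p_1 * ... * p_n); the two functions should agree numerically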

The recurrence actually looks a little better when we recast it in terms of the stopping times: $$T(p_1,\ldots,p_n)=\frac{1}{p_n}+\frac{1}{p_n}T(p_1,\ldots,p_{n-1})-\frac{1-p_n}{p_n}T(p_1,\ldots,p_{n-2}),$$ with $T(p_1,p_2)=(1+p_1)/(p_1p_2)$ and $T(p_1)=1/p_1$. In any case, I doubt there is a clean answer or that any sort of greedy algorithm can work, since different relative orderings of the probabilities can be optimal depending on their exact values (see the code). If this is the case, it might help to clarify what sort of answer you're looking for (e.g. maybe an algorithm that identifies the best permutation in some fast enough time, as my algorithm is very slow at $O(n\cdot n!)$ time since it requires looking at every permutation).

from itertools import permutations
from random import random
def getScore(probs):
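    # expected stopping time T(p_1, ..., p_n) computed via the recurrence above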
    if len(probs) == 1:
        return 1/probs[0]
    oldScore = 1/probs[0]
    pastScore = (1+probs[0])/(probs[0]*probs[1])
    n = 2
    while n < len(probs):
        newScore = 1/probs[n] + 1/probs[n]*pastScore - (1-probs[n])/probs[n] * oldScore
        oldScore = pastScore
        pastScore = newScore
        n += 1
    return pastScore

def getMinimizer(probabilities):
    # determines what the best permutation, and associated stopping time, is
    bestScore = float("inf")
    bestPerm = probabilities
    for perm in permutations(probabilities):
        score = getScore(perm)
        if score < bestScore:
            bestScore = score
            bestPerm = perm
    return bestPerm, bestScore

probs = [random() for i in range(6)]
bestPerm, bestScore = getMinimizer(probs)

ordering = [r[0] for r in sorted(enumerate(bestPerm), key=lambda s: s[1])]
print(ordering)  # ordering is different on different runs

  • Something should be fixed in your code. Check it for $n=2$ to find the error. For each of probs = [0.5, 0.23] and probs = [0.23, 0.5], it returns [0, 1]. – Amir Jul 23 '24 at 10:41
  • Not an error in my code, just in my explanation. At the end, it's printing out the relative ordering of the best permutation. Since both are $(0.23, 0.5)$, and $0.23<0.5$, it will print out $[0,1]$. In general, ordering will be the same for any permutation of probs. – Varun Vejalla Jul 23 '24 at 15:45
3

Here is the mixed integer linear programming formulation I used. For $i\in\{1,\dots,n+1\}$, let nonnegative decision variable $t_i$ be the expected stopping time from state $i$. Let binary decision variable $x_{ij}$ indicate whether $\sigma(i)=j$. The problem is to minimize $t_1$ subject to \begin{align} \sum_{j=1}^n x_{ij} &= 1 &&\text{for $i\in\{1,\dots,n\}$} \tag1\label1 \\ \sum_{i=1}^n x_{ij} &= 1 &&\text{for $j\in\{1,\dots,n\}$} \tag2\label2 \\ t_{n+1} &= 0 \tag3\label3 \\ t_i &= 1 + \left(\sum_{j=1}^n p_j x_{ij}\right) t_{i+1} + \left(1-\sum_{j=1}^n p_j x_{ij}\right) t_{\max(i-1,1)} &&\text{for $i\in\{1,\dots,n\}$} \tag4\label4 \end{align} Constraints \eqref{1} and \eqref{2} enforce that $\sigma$ is a permutation of $\{1,\dots,n\}$. Constraint \eqref{3} is a boundary condition because state $n+1$ is absorbing. Constraint \eqref{4} is the first-step analysis recurrence. This constraint is nonlinear but can be linearized by introducing new continuous variables and linear big-M constraints to enforce the products $x_{ij} t_{i+1}$ and $x_{ij} t_{\max(i-1,1)}$.

As an (often more efficient) alternative to the standard linearization of a product of a binary variable and a bounded variable, you can use constraint \eqref{1} to obtain a "compact linearization" as follows. Introduce nonnegative decision variables $y_{ij}$ and $z_{ij}$ to represent $x_{ij} t_{i+1}$ and $x_{ij} t_{\max(i-1,1)}$, respectively. Multiplying \eqref{1} by $t_{i+1}$ yields $$\sum_{j=1}^n y_{ij} = t_{i+1} \quad\text{for $i\in\{1,\dots,n\}$} \tag5\label5$$ Multiplying \eqref{1} by $t_{\max(i-1,1)}$ yields $$\sum_{j=1}^n z_{ij} = t_{\max(i-1,1)} \quad\text{for $i\in\{1,\dots,n\}$} \tag6\label6$$ To enforce $x_{ij}=0 \implies y_{ij}=0$, impose $$y_{ij} \le M x_{ij} \quad\text{for $i,j\in\{1,\dots,n\}$} \tag7\label7$$ To enforce $x_{ij}=0 \implies z_{ij}=0$, impose $$z_{ij} \le M x_{ij} \quad\text{for $i,j\in\{1,\dots,n\}$} \tag8\label8$$ Finally, replace \eqref{4} with the linear constraint $$t_i = 1 + \sum_{j=1}^n p_j y_{ij} + t_{\max(i-1,1)}-\sum_{j=1}^n p_j z_{ij} \quad\text{for $i\in\{1,\dots,n\}$} \tag9\label9$$
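
For reference, here is a rough sketch of this compact linearization in PuLP (the function name, the big-$M$ value, and the default CBC solver are illustrative assumptions; $M$ must be an upper bound on the expected stopping times):

from pulp import LpProblem, LpMinimize, LpVariable, lpSum, value

def solve_min_stopping_time(p, M=1e5):
    n = len(p)
    prob = LpProblem("min_expected_stopping_time", LpMinimize)
    t = {i: LpVariable(f"t_{i}", lowBound=0) for i in range(1, n + 2)}
    x = {(i, j): LpVariable(f"x_{i}_{j}", cat="Binary")
         for i in range(1, n + 1) for j in range(1, n + 1)}
    y = {(i, j): LpVariable(f"y_{i}_{j}", lowBound=0) for (i, j) in x}   # x_ij * t_{i+1}
    z = {(i, j): LpVariable(f"z_{i}_{j}", lowBound=0) for (i, j) in x}   # x_ij * t_{max(i-1,1)}

    prob += t[1]                                                 # objective: minimize t_1
    for i in range(1, n + 1):
        prob += lpSum(x[i, j] for j in range(1, n + 1)) == 1     # (1)
    for j in range(1, n + 1):
        prob += lpSum(x[i, j] for i in range(1, n + 1)) == 1     # (2)
    prob += t[n + 1] == 0                                        # (3)
    for i in range(1, n + 1):
        prev = max(i - 1, 1)
        prob += lpSum(y[i, j] for j in range(1, n + 1)) == t[i + 1]   # (5)
        prob += lpSum(z[i, j] for j in range(1, n + 1)) == t[prev]    # (6)
        for j in range(1, n + 1):
            prob += y[i, j] <= M * x[i, j]                            # (7)
            prob += z[i, j] <= M * x[i, j]                            # (8)
        prob += (t[i] == 1 + lpSum(p[j - 1] * y[i, j] for j in range(1, n + 1))
                          + t[prev] - lpSum(p[j - 1] * z[i, j] for j in range(1, n + 1)))  # (9)

    prob.solve()
    sigma = {i: j for (i, j) in x if x[i, j].value() > 0.5}      # sigma[i] = j means sigma(i) = j
    return sigma, value(prob.objective)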


For the four examples in the answer by @Amir, I obtained the same optimal objective values.

RobPratt
  • 50,938
2

Using the following recursion given by Varun Vejalla (also presented by Zoe Allen in a different way):

$$T(p_1,\ldots,p_n)=\frac{1}{p_n}+\frac{1}{p_n}T(p_1,\ldots,p_{n-1})-\frac{1-p_n}{p_n}T(p_1,\ldots,p_{n-2}),$$

with $T(p_1,p_2)=(1+p_1)/(p_1p_2)$ and $T(p_1)=1/p_1,$ the optimization problem can be formulated as follows:

$$\min Z=Z_n$$

subject to

$$Z_i=B_i+B_iZ_{i-1}-(B_i-1)Z_{i-2}, \quad i=3,\dots,n$$ $$Z_2=B_2+B_2Z_1,Z_1=B_1$$ $$B_i=\sum_{k=1}^n \frac{1}{p_k} X_{ik}, \quad i=1,\dots,n$$ $$\sum_{k=1}^n X_{ik}=1,\quad i=1,\dots,n$$ $$\sum_{i=1}^n X_{ik}=1,\quad k=1,\dots,n$$ $$Z_{i} \ge Z_{i-1},\quad i=2,\dots,n$$ $$B_i, Z_{i} \ge 0,\quad i=1,\dots,n$$ $$X_{ik} \in \{0,1\}, \quad i=1,\dots,n, k=1,\dots,n.$$

This can be rewritten as follows:

$$\min Z=Z_n$$

subject to

$$Z_i=B_i+\sum_{k=1}^n \frac{1}{p_k} X_{ik} T_i +Z_{i-2}, \quad i=3,\dots,n \tag{1}$$ $$T_i=Z_{i-1}-Z_{i-2}, \quad i=3,\dots,n$$ $$Z_2=B_2+ \sum_{k=1}^n \frac{1}{p_k} X_{2k} Z_1\tag{2}$$ $$Z_1=B_1$$ $$B_i=\sum_{k=1}^n \frac{1}{p_k} X_{ik}, \quad i=1,\dots,n$$ $$\sum_{k=1}^n X_{ik}=1,\quad i=1,\dots,n$$ $$\sum_{i=1}^n X_{ik}=1,\quad k=1,\dots,n$$ $$T_i\ge 0, \quad i=3,\dots,n$$ $$Z_{i} \ge Z_{i-1},\quad i=2,\dots,n$$ $$B_i, Z_{i} \ge 0,\quad i=1,\dots,n$$ $$X_{ik} \in \{0,1\}, \quad i=1,\dots,n, k=1,\dots,n.$$

In the above formulation, (1) and (2) include nonlinear terms, i.e., $X_{ik} T_i$ and $X_{2k} Z_1$. Constraints (1) can be linearized as follows:

$$Z_i=B_i+\sum_{k=1}^n \frac{1}{p_k}W_{ik} +Z_{i-2}, \quad i=3,\dots,n $$

by adding

$$T_i -M (1-X_{ik}) \le W_{ik} \le M X_{ik}, W_{ik}\le T_i$$

$$W_{ik} \ge 0, \quad i=3,\dots,n, k=1,\dots,n$$

where $M$ is a sufficiently large positive constant; (2) can be linearized similarly. By linearization of (1) and (2), the resulting model becomes a mixed-integer linear programming (MILP) model that can be efficiently solved using existing MILP solvers.

Note that some variables such as $B_i,i=1,\dots,n$ can be removed from the above formulation, but I kept them for ease of understanding the model. In fact, further simplification is not required if one uses an advanced MILP solver such as CPLEX because such unnecessary variables and constraints are removed in its pre-processing phase.


Examples

Example 1: For

$$1: 0.5, 2: 0.23$$

we get

$$(1): 0.23, (2): 0.5$$

with optimal objective value $10.695652173913043.$

Example 2: For

$$1: 0.5, 2: 0.23, 3: 0.25, 4: 0.39, 5: 0.47$$

we get

$$(1): 0.23, (2): 0.39, (3): 0.5, (4): 0.47, (5): 0.25$$

with optimal objective value $88.84892905429444.$

Example 3: For

$$1: 0.23, 2: 0.23, 3: 0.23, 4: 0.23, 5: 0.23$$

we get

$$(1): 0.23, (2): 0.23, (3): 0.23, (4): 0.23, (5): 0.23$$

with optimal objective value $1098.5978062387278.$ This agrees with the value $1098.597806238729$ obtained for the case where all $p_i=p,i=1,\dots,n$ are the same, given by

$$ \frac{a \left((a - 1)^{n + 1} - a (n+1) + 2 n + 1\right )}{(a - 2)^2}$$

with $a=\frac{1}{p}$.
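
A quick check of this closed form against the recursion above (a rough sketch; it assumes $p\neq 1/2$ so that $a\neq 2$):

def closed_form_equal_p(p, n):
    # closed form for the case p_i = p for all i, with a = 1/p (requires a != 2)
    a = 1 / p
    return a * ((a - 1) ** (n + 1) - a * (n + 1) + 2 * n + 1) / (a - 2) ** 2

def recursion_equal_p(p, n):
    # the recursion T_i = 1/p + T_{i-1}/p - (1-p)/p * T_{i-2} with equal probabilities
    T1, T2 = 1 / p, (1 + p) / (p * p)
    if n == 1:
        return T1
    for _ in range(3, n + 1):
        T1, T2 = T2, 1 / p + T2 / p - (1 - p) / p * T1
    return T2

# e.g. p = 0.23, n = 5 gives about 1098.6 from both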

Example 4: For

$$1: 0.5, 2: 0.23, 3: 0.25, 4: 0.39, 5: 0.47, 6: 0.27, 7: 0.61, 8: 0.91, 9: 0.55, 10: 0.04$$

we get

$$(1): 0.04, (2): 0.61, (3): 0.25, (4): 0.91, (5): 0.27, (6): 0.55, (7): 0.5, (8): 0.47, (9): 0.39, (10): 0.23$$

with optimal objective value $392.6970597294377.$


PS 1: The above examples show that the following statement in the OP does not necessarily hold:

Maximizing $p_{\sigma(1)}$ would likely minimize the time spent stuck at this state in long paths, lowering the overall expectation.

In fact, in all examples I have solved so far (a number of them given above), the smallest probability is assigned to the first move, which is somewhat counterintuitive.

PS 2: The above examples show that the optimal sequence may increase and decrease several times (see Example 4); that is, it is not necessarily monotone or even unimodal. A pattern I observed is that the sequence tends to increase at the start and decrease at the end.

Amir
  • 11,124
  • This is different than the MILP formulation I used, and I didn’t check the details of yours, but don’t you also need $\sum_i X_{ik}=1$? – RobPratt Jul 23 '24 at 00:55
  • @RobPratt Thanks, yes we need both similar to the assignment problem. – Amir Jul 23 '24 at 01:02
  • 1
    +1. I obtained the same results from my MILP formulation, which I will post separately. – RobPratt Jul 23 '24 at 18:40
  • 1
    Great and thorough answer! I appreciate you giving the details of linearizing the MILP constraint. Given the discussion in the comments and the other answers to this question, I suspect that this is the farthest this question can be taken in general and will award this answer the bounty. I appreciate all of the great discussion and collaboration in the answers to this question. – whpowell96 Jul 23 '24 at 19:50