
In his paper "A Probabilistic Algorithm for $k$-SAT and Constraint Satisfaction Problems", Schöning gives a randomized algorithm for $k$-SAT. The analysis conditions on the Hamming distance, say $j$, between a fixed satisfying assignment $a^{*}$ and the initial guessed assignment $a$. In every step we flip the value of a variable from a violated clause, so the distance decreases by $1$ with probability at least $\frac{1}{k}$ and increases by $1$ with probability at most $1-\frac{1}{k}$. It is then suggested to think of the process as a Markov chain with states $0,1,\dots,n$ indicating the Hamming distance.
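
For concreteness, here is a minimal Python sketch of the walk being analysed (the clause encoding, the function name `schoening`, and the restart count `tries` are my own illustrative choices, not taken verbatim from the paper):

```python
import random

def schoening(clauses, n, tries=200):
    """Random-walk k-SAT sketch in the spirit of the algorithm above.

    Each clause is a list of nonzero integers: literal v means variable v
    is True, -v means variable v is False (a DIMACS-like convention chosen
    here for illustration).  Returns a satisfying assignment as a dict
    {variable: bool}, or None if none was found within `tries` restarts.
    """
    for _ in range(tries):
        # Guess a uniformly random initial assignment a.
        a = {v: random.random() < 0.5 for v in range(1, n + 1)}
        for _ in range(3 * n):  # 3n local flips per restart
            violated = [c for c in clauses
                        if not any((lit > 0) == a[abs(lit)] for lit in c)]
            if not violated:
                return a
            # Flip a uniformly random variable of some violated clause;
            # with probability >= 1/k this decreases the Hamming distance
            # to a fixed satisfying assignment a*.
            lit = random.choice(random.choice(violated))
            a[abs(lit)] = not a[abs(lit)]
    return None

# Tiny satisfiable 3-SAT instance:
print(schoening([[1, 2, 3], [-1, 2, -3], [1, -2, 3]], n=3))
```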

My first question is: why is it a Markov chain? It seems that the transition probabilities do not depend only on the current distance but also on the current assignment, even though they are certainly bounded by $\frac{1}{k}$ and $1-\frac{1}{k}$.

Next, the probability of success (reaching state $0$ from state $j$) is estimated by summing over walks with $i$ steps in the "wrong" direction (flips which increase the Hamming distance) and $i+j$ "right" steps, which reduce the distance to $0$ after $2i+j$ steps in total. This gives $$ q_j \geq \sum_{i=0}^j \binom{j+2i}{i}\cdot\frac{j}{j+2i}\cdot\left(1-\frac{1}{k}\right)^i\cdot\left(\frac{1}{k}\right)^{i+j},$$ where the first two factors count all such walks in the Markov chain, and the last two are the transition probabilities.
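
To see the bound numerically, one can evaluate this sum directly; the following snippet (with the illustrative name `q_lower_bound`) just computes it for small $j$ and $k=3$:

```python
from math import comb

def q_lower_bound(j, k):
    """Numerically evaluate the sum above, a lower bound on the
    probability q_j of reaching state 0 from state j."""
    if j == 0:
        return 1.0  # the walk already starts at the satisfying assignment
    return sum(comb(j + 2 * i, i) * j / (j + 2 * i)
               * (1 - 1 / k) ** i * (1 / k) ** (i + j)
               for i in range(j + 1))

for j in range(1, 6):
    print(j, q_lower_bound(j, k=3))
```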

My second question is about these probabilities: why can we simply substitute the bounds $\frac{1}{k}$ and $1-\frac{1}{k}$? For example, if the probability $p$ of a "right" move (a correct flip) happens to be very close to $1$, then $(1-p)^i\cdot p^{i+j}$ is actually smaller. I guess that in this case the success probability is high anyway, but I would be glad to see a formal proof of this bound.
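
To illustrate the guess in the last sentence, here is a quick Monte Carlo comparison (names and parameters are mine, purely illustrative) of two walks on the nonnegative integers: one moving towards $0$ with probability exactly $1/k$, and one with $p$ close to $1$:

```python
import random

def hit_zero_prob(j, p, steps, trials=20000):
    """Monte Carlo estimate of the probability that a walk started at
    distance j, moving towards 0 with probability p and away from it
    with probability 1 - p (state 0 is absorbing), reaches 0 within
    the given number of steps."""
    hits = 0
    for _ in range(trials):
        d = j
        for _ in range(steps):
            if d == 0:
                break
            d += -1 if random.random() < p else 1
        hits += (d == 0)
    return hits / trials

# The per-path weight (1 - p)^i * p^(i + j) shrinks as p -> 1, yet the
# overall success probability only grows, e.g. with k = 3 and j = 5:
print(hit_zero_prob(j=5, p=1 / 3, steps=60))   # comparison walk, p = 1/k
print(hit_zero_prob(j=5, p=0.9, steps=60))     # "lucky" instance, p close to 1
```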

Tommy1234

2 Answers


This is just to elaborate on Daniel's answer a bit. The main concern was, apparently, the independence of steps in the coupled process. However, for the lower bound it does not matter.

Indeed, suppose that after several steps we have some distribution on the set of true/false assignments of the variables (this is where the process is, indeed, a Markov chain: the probability of switching from the current assignment $X$ to some new one $X'$ is determined only by the current assignment itself, not by the whole way in which we obtained it), and suppose we know that for this distribution $P[d(X,X_0)\le m]\ge P[Y\le m]$ for all $m\ge 0$, where $X$ is the current assignment, $X_0$ is the assignment satisfying the formula, and $Y$ is the comparison random walk on $\mathbb Z_+$ (note that the estimate you wrote compares with the random walk on all nonnegative integers rather than with the one on $\{0,\dots,n\}$). What we want to show is merely that this inequality still holds after the next step, which is straightforward (I'll write just $d(X)$ instead of $d(X,X_0)$): $$ P[d(X')\le m]\ge P[d(X)\le m-1]+\frac 1k P[d(X)\in\{m,m+1\}] \\ =\left(1-\frac 1k\right)P[d(X)\le m-1]+\frac 1kP[d(X)\le m+1] \\ \ge \left(1-\frac 1k\right)P[Y\le m-1]+\frac 1kP[Y\le m+1] \\ =P[Y\le m-1]+\frac 1k P[Y\in\{m,m+1\}]=P[Y'\le m]. $$

The first inequality merely says that if the distance was at most $m-1$, it will remain at most $m$, and if it was $m$ or $m+1$, then it has a chance of at least $1/k$ of becoming at most $m$ by moving to the left. The last equality reflects the same property of the random walk on $\mathbb Z_+$, while the middle inequality is just our "induction assumption".

To be honest, the formula should be changed a bit when $m=0$, because the absorbing state $0$ has its own special transition probabilities both in the assignment Markov chain and in the comparison one, but I leave this small detail to you. It is trivial here, but it becomes important if you decide to compare with the random walk on $\{0,\dots,n\}$, in which case this simple induction works only if you set the transition probabilities of the state $n$ in the comparison Markov chain to $P[n\to n-1]=\frac 1k,\ P[n\to n]=1-\frac 1k$.
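
(If a numeric sanity check helps: the toy snippet below evolves the exact distributions of two such chains, one whose "down" probabilities are arbitrary values $\ge 1/k$, standing in for the assignment-dependent probabilities of the actual process, and the comparison walk with constant $1/k$, and checks that the CDF domination is preserved at every step. It only illustrates the induction; it is not part of the proof.)

```python
import random

def step(dist, p_down):
    """One step of a birth-death walk on {0, 1, 2, ...}: dist[m] is the
    probability of being at m, p_down[m] the probability of moving from m
    to m - 1 (otherwise m -> m + 1); state 0 is absorbing."""
    new = [0.0] * (len(dist) + 1)
    new[0] += dist[0]
    for m in range(1, len(dist)):
        new[m - 1] += p_down[m] * dist[m]
        new[m + 1] += (1 - p_down[m]) * dist[m]
    return new

def cdf(dist, m):
    return sum(dist[:m + 1])

k, j = 3, 4
x = [0.0] * j + [1.0]   # distance distribution of the algorithm-like chain
y = x[:]                # distribution of the comparison walk Y
for _ in range(30):
    # arbitrary "down" probabilities that are at least 1/k in every state
    p_x = [0.0] + [random.uniform(1 / k, 1.0) for _ in range(len(x) - 1)]
    p_y = [0.0] + [1 / k] * (len(y) - 1)
    x, y = step(x, p_x), step(y, p_y)
    assert all(cdf(x, m) >= cdf(y, m) - 1e-12 for m in range(len(y)))
print("CDF domination preserved at every step")
```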

fedja


Here's my (incomplete) understanding of this.

The process defined by the algorithm is indeed not a Markov chain. What happens is that we run a random walk, "simultaneously with the algorithm", that steps in the "right" direction with probability $1/k$ and in the "wrong" direction with probability $1 - 1/k$. Intuitively, we would expect the probability that this random walk ends up in state $0$ to give a lower bound on the probability that the algorithm succeeds, since at each step the algorithm is more likely than the random walk to make the right step. As I understand it, the author takes this intuition and runs with it, so that the rest of the analysis is only concerned with the random walk and not with the algorithm itself.

I would expect that one can formalize this intuition by using some sort of coupling to make sure that at every step the algorithm is indeed closer to the correct state than the random walk. Unfortunately, I don't know how this can be done precisely. The point I'm unsure about is the following: let $X_i$ be the $i$-th step made by the algorithm for $i = 1, \dots, 3n$. Then, as in the Wikipedia example on biased coins, we can construct random variables $Y_i$, representing the $i$-th step of the random walk, such that if $X_i$ is a step in the wrong direction then so is $Y_i$, and the distribution of each $Y_i$ is the one we would like. However, for the analysis we would also like the $Y_i$-s to be independent of each other; given that the $X_i$-s are not independent, I don't know whether this can be done.
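
For what it's worth, the one-step version of the biased-coin coupling mentioned above can be sketched as follows (purely illustrative; it does not by itself settle the independence issue):

```python
import random

def coupled_step(p_right, k):
    """One coupled step.  The algorithm's step X goes in the "right"
    direction with some probability p_right >= 1/k (which may depend on
    the current assignment); the walk's step Y goes right with probability
    exactly 1/k, and whenever X goes in the wrong direction, so does Y."""
    u = random.random()
    x_right = u < p_right
    y_right = u < 1 / k   # uses 1/k <= p_right, so y_right implies x_right
    return x_right, y_right
```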

I hope someone with more knowledge of these notions can give the details. I have glanced at a few sources describing this algorithm but, interestingly, none of them discussed this point.

Dániel Garamvölgyi