
A Markov transition matrix has all nonnegative entries, and so by the Perron–Frobenius theorem has real, positive eigenvalues. In particular, the largest eigenvalue is $1$, by property 11 here. Furthermore, these notes (Sec. 10.3) say that the eigenvalues of $P$ are $1 = \lambda_1 > \lambda_2 \geq \dots \geq \lambda_N \geq -1$.

But how can a transition matrix $P$ have eigenvalues less than $1$? Since the matrix acts on probability distributions $v$, which must satisfy $\sum_i v_i = 1$, and $P$ preserves that sum, we cannot have $Pv = cv$ with $c \neq 1$: it would give $\sum_i (Pv)_i = \sum_i c v_i = c \neq 1$.

blue_egg
    The matrix as we use it in context might only act on probability distributions, however the matrix as a linear transformation on $\Bbb R^n$ acts on many more vectors than those that are so heavily restricted as to be probability distributions. We might yet have $Pv=cv$ with $c\neq 1$ in the case that $v$ was not a probability distribution. – JMoravitz Apr 28 '21 at 13:20
  • Your argument fails if you consider two distributions $v_1, v_2$, both positive with sum $1$, and then consider $P\cdot(v_1 + \epsilon(v_2 - v_1))$, which still sums to $1$; however, $v_2 - v_1$ does not sum to $1$ (it sums to $0$), and thus may be an eigenvector with a nontrivial eigenvalue. – user619894 Apr 28 '21 at 13:22
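To make the comments concrete, here is a quick numeric sketch in NumPy (the matrix is an arbitrary illustrative choice, not from the question):

```python
import numpy as np

# An arbitrary column-stochastic matrix: each column sums to 1,
# so P maps probability distributions to probability distributions.
P = np.array([[0.9, 0.3],
              [0.1, 0.7]])

v1 = np.array([0.7, 0.3])  # a probability distribution
v2 = np.array([0.2, 0.8])  # another probability distribution

d = v2 - v1                # sums to 0, so it is not a probability distribution
print(d.sum())             # 0.0
print(P @ d, 0.6 * d)      # P @ d == 0.6 * d: an eigenvalue of 0.6, not 1
```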

3 Answers


Maybe a good starting point would be to look at a simple example. The eigenvalues of the transition matrix $\begin{bmatrix} \frac{1}{2} & 1 \\ \frac{1}{2} & 0\end{bmatrix}$ are:

  • $1$, with eigenvector $\begin{pmatrix} \frac{2}{3} \\ \frac{1}{3} \end{pmatrix}$ (the principal eigenvalue; its eigenvector is the stationary distribution of the Markov chain)
  • $-\frac{1}{2}$, with eigenvector $\begin{pmatrix} -1 \\ 1\end{pmatrix}$.

Of course, the second one cannot be a probability distribution no matter how we renormalize it, because it has negative entries!
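A quick numeric check of this example (a sketch added here, not part of the original answer), using NumPy:

```python
import numpy as np

# The example transition matrix; columns sum to 1 (column-stochastic).
P = np.array([[0.5, 1.0],
              [0.5, 0.0]])

w, V = np.linalg.eig(P)
print(w)  # eigenvalues 1.0 and -0.5 (NumPy does not guarantee their order)

# Rescaling the eigenvector for eigenvalue 1 so its entries sum to 1
# recovers the stationary distribution (2/3, 1/3).
i = np.argmax(w.real)
print(V[:, i] / V[:, i].sum())   # [0.6667 0.3333]

# The eigenvector for -1/2 has entries of opposite signs, so no
# rescaling can turn it into a probability distribution.
print(V[:, 1 - i])               # proportional to (-1, 1)
```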

Micah
  • Makes perfect sense. So I know the second-largest eigenvalue tells us about the convergence rate to the stationary distribution, right? Despite the corresponding eigenvector not being a probability distribution, is there some interpretation of this in terms of probability distributions? – blue_egg Apr 28 '21 at 13:25
  • 1
    Absolutely! In this case, we can write any probability vector $v$ in the form $v=v_1+c_2v_2$ where $v_1$ and $v_2$ are the eigenvectors above. Then the $n$th iteration of the Markov process will give us the vector $M^nv=v_1+c_2\left(-\frac{1}{2}\right)^nv_2$. So $v_2$ (the "second-largest" eigenvector) gives us the direction from which the probability vector is approaching the stationary distribution. Something similar applies in general (though you have to be a bit careful because not all Markov matrices are diagonalizable) – Micah Apr 28 '21 at 13:36
  • (cite for the last claim that didn't fit inside the character limit: https://math.stackexchange.com/questions/332659/example-of-a-markov-chain-transition-matrix-that-is-not-diagonalizable) – Micah Apr 28 '21 at 13:36
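As a numeric illustration of the convergence claim in the comments above (a sketch with an arbitrary starting distribution):

```python
import numpy as np

P = np.array([[0.5, 1.0],
              [0.5, 0.0]])
stationary = np.array([2/3, 1/3])

v = np.array([1.0, 0.0])  # start from a point mass on the first state
for n in range(1, 7):
    v = P @ v
    # The max-norm distance to the stationary distribution halves
    # each step, matching |lambda_2| = 1/2.
    print(n, np.abs(v - stationary).max())
```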

The eigenvectors corresponding to eigenvalues other than $1$ simply do not correspond to probability distributions; they have both negative and positive entries. For example, $$\begin{bmatrix}0.1&0.9\\0.9&0.1\end{bmatrix}$$ has an eigenvalue of $-0.8$ corresponding to the eigenvector $(1,-1)$.
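As a quick check (added here, not part of the original answer): $$\begin{bmatrix}0.1&0.9\\0.9&0.1\end{bmatrix}\begin{pmatrix}1\\-1\end{pmatrix}=\begin{pmatrix}0.1-0.9\\0.9-0.1\end{pmatrix}=\begin{pmatrix}-0.8\\0.8\end{pmatrix}=-0.8\begin{pmatrix}1\\-1\end{pmatrix},$$ and note that the entries of $(1,-1)$ sum to $0$, not $1$.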

Parcly Taxel

The $n$-vectors representing probability distributions live in $\Bbb R^n$.
Since their components sum to $1$, they represent points on the diagonal plane $x_1+x_2+ \cdots + x_n=1$.
So if one eigenvector lies on that plane, within the positive "octant", the others (being linearly independent) cannot all lie in that octant, apart from the case in which they coincide with the standard basis vectors $(0,0,\cdots, 1,0, \cdots)$.
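A supplementary note making this precise (added here, not part of the original answer): for a column-stochastic $P$ we have $\mathbf{1}^\top P=\mathbf{1}^\top$, so if $Pv=\lambda v$ with $\lambda\neq 1$, then $$\mathbf{1}^\top v=\mathbf{1}^\top Pv=\lambda\,\mathbf{1}^\top v\implies \mathbf{1}^\top v=0.$$ Such a $v$ lies on the plane $x_1+\cdots+x_n=0$, parallel to and disjoint from the diagonal plane above, and a nonzero vector whose entries sum to $0$ must have entries of both signs.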

G Cab