
Stationary distributions are left eigenvectors of the transition matrix of a Markov chain.

Does anyone have a good understanding of why a stationary distribution is a left eigenvector? I'm searching for geometric intuition, and for links between graph theory and linear algebra. I learned about eigenvectors as vectors of symmetry, ones that remain fixed under a transformation. However, I can't yet picture a Markov graph in terms of vectors.

I have been trying to apply changes of basis to the transition matrix and see what happens to the transition graph, but this has given me nothing.

Do you have any insight into this question?

A good related question: connection between graphs and the eigenvectors of their matrix representation

– Marine Galantin

2 Answers


Let $P$ be the $n \times n$ matrix of transition probabilities, and let $\mu = (\mu_1,\dots,\mu_n)$ be a probability distribution.

The key is to note that given an initial probability distribution $\mu$, the entries of $\mu^T P$ are the probabilities that we end up in each state after taking one step in the chain (given that we chose the initial state according to the distribution $\mu$). In particular, $$ \Bbb P(\text{land in state } j) = \sum_{i=1}^n \Bbb P(\text{start in state } i) \cdot \Bbb P(\text{transition from } i \text{ to } j) = \sum_{i=1}^n \mu_i \cdot P_{ij}, $$ which is indeed the $j$th entry of $\mu^T P$.
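As a minimal numerical sketch of this computation (the $3$-state matrix here is made up for illustration; any row-stochastic matrix would do):

```python
import numpy as np

# Made-up 3-state transition matrix; each row sums to 1.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])

# Initial distribution over the three states.
mu = np.array([0.2, 0.5, 0.3])

# One step of the chain: entry j of mu @ P is sum_i mu_i * P_ij,
# the probability of landing in state j.
mu_next = mu @ P
print(mu_next)        # [0.27 0.48 0.25]
print(mu_next.sum())  # ~1.0: still a probability distribution
```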

By definition, a stationary distribution is one that remains unchanged after taking one step (and therefore after arbitrarily many steps) in the chain. That is, $\mu$ is a stationary distribution if and only if $\mu^TP = \mu$, which says precisely that $\mu$ is a left eigenvector of $P$ with eigenvalue $1$.
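Since $\mu^T P = \mu$ is the eigenvector equation for eigenvalue $1$, one way to find a stationary distribution numerically is to take an eigenvector of $P^T$ for the eigenvalue closest to $1$ and rescale it to sum to $1$. A sketch with the same made-up matrix as above:

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])

# Left eigenvectors of P are right eigenvectors of P^T.
eigvals, eigvecs = np.linalg.eig(P.T)

# Pick the eigenvector whose eigenvalue is (numerically) 1.
k = np.argmin(np.abs(eigvals - 1))
v = np.real(eigvecs[:, k])

# Rescale so the entries sum to 1, turning the eigenvector
# into a probability distribution.
mu = v / v.sum()

print(mu)      # the stationary distribution
print(mu @ P)  # the same vector: unchanged by one step
```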

– Ben Grossmann

I’m not sure what it is that you mean by “plot a markov graph as vectors.” The way that vectors enter into the picture is that when you have a finite number $n$ of states, the probabilities that the system is in a particular state at time $k$ can be collected into a state vector: a row vector $\mathbf\pi_k\in[0,1]^n$ with the additional constraint that the sum of the elements of $\mathbf\pi_k$ is $1$. In a discrete-time process, successive state vectors are related by a transition matrix $P$ such that $\mathbf\pi_{k+1}=\mathbf\pi_kP$. Geometrically, these state vectors all lie on a hyperplane at a distance of $1/\sqrt n$ from the origin.
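A minimal sketch of this iteration (the $3$-state matrix below is made up for illustration): successive state vectors $\mathbf\pi_{k+1}=\mathbf\pi_kP$ all stay on the hyperplane where the entries sum to $1$.

```python
import numpy as np

# Made-up 3-state transition matrix; each row sums to 1.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])

pi = np.array([1.0, 0.0, 0.0])  # start deterministically in state 1

for k in range(10):
    pi = pi @ P  # the recurrence pi_{k+1} = pi_k P
    # Every iterate stays on the hyperplane x_1 + ... + x_n = 1.
    assert abs(pi.sum() - 1) < 1e-12
    print(k + 1, pi)  # the iterates approach the stationary distribution
```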

A stationary distribution of $P$ is simply a state vector $\mathbf\pi$ that remains the same after a transition, i.e., $\mathbf\pi P=\mathbf\pi$. In other words, it’s a fixed point of the transformation represented by $P$. The above equation is just an instance of the general eigenvector equation $\mathbf v P=\lambda\mathbf v$ with $\lambda=1$, so a stationary distribution of the process represented by the transition matrix $P$ is a left eigenvector of $P$ with eigenvalue $1$. Fundamentally, an eigenvector of a matrix corresponds to a line that is mapped to itself by the transformation that the matrix represents. As to the stationary distribution’s being a left eigenvector, that’s just an artifact of using row vectors. Other sources use column vectors instead, and there stationary distributions are, naturally, right eigenvectors instead.
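A short check of both points, again with a made-up matrix: rows summing to $1$ make the all-ones column vector a right eigenvector with eigenvalue $1$ (so $1$ is always an eigenvalue of $P$), and a left eigenvector of $P$ is the same thing as a right eigenvector of $P^T$.

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])

# Row sums equal to 1 mean the all-ones COLUMN vector is a
# right eigenvector with eigenvalue 1.
ones = np.ones(3)
print(np.allclose(P @ ones, ones))  # True

# A stationary distribution: left eigenvector of P for eigenvalue 1,
# computed as a right eigenvector of P^T and rescaled to sum to 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
pi = pi / pi.sum()

print(np.allclose(pi @ P, pi))    # True: row-vector convention
print(np.allclose(P.T @ pi, pi))  # True: column-vector convention
```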

– amd
  • Have you ever heard of finding the stationary distribution by finding the intersection between the hyperplane you mentioned and the sphere of radius $1/\sqrt n$? How is the hyperplane defined? – Marine Galantin Oct 23 '19 at 13:47
  • And do you have an interpretation of a change of basis when one looks at the graph? – Marine Galantin Oct 23 '19 at 13:56
  • The elements of a state vector must sum to $1$. That’s a linear equation in its elements, which defines a hyperplane that they must all lie on. Since the elements must be nonnegative, they’re further restricted to a convex region of this hyperplane. The sphere you mention is tangent to this hyperplane at $\frac1n(1,1,\dots,1)$, so intersecting it with that hyperplane seems like a fruitless approach to finding stationary distributions. – amd Oct 23 '19 at 18:40
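A quick numerical check of the geometry described in this comment (with $n=3$ chosen arbitrarily): the point of tangency is the uniform distribution $(1/n,\dots,1/n)$, and its distance from the origin is indeed $1/\sqrt n$.

```python
import numpy as np

n = 3

# The closest point to the origin on the hyperplane x_1 + ... + x_n = 1
# is the foot of the perpendicular along (1, ..., 1): the uniform
# distribution (1/n, ..., 1/n).
p = np.full(n, 1 / n)

print(p.sum())            # 1.0 -> p lies on the hyperplane
print(np.linalg.norm(p))  # 0.5773... = 1/sqrt(3) = 1/sqrt(n)
```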