Edit: I've decided to flesh this out to motivate Furstenberg's theorem. This is not the ideal solution, but it is a classical answer to the problem (see https://link.springer.com/article/10.1007/BF00537227)
Consider the space $X := \{1,2\}^\mathbb{Z}$ and let $\sigma :X \rightarrow X$ be the usual left shift map. Define
$$ A : X \rightarrow \text{SL}(2,\mathbb{R}), \ \ A(x) := \begin{cases} A_1 &\text{ if } x_0 = 1 \\ A_2&\text{if } x_0 = 2. \end{cases} $$
This generates a cocycle as follows:
$$ A : X \times \mathbb{Z} \rightarrow \text{SL}(2,\mathbb{R}), \ \ A(x,n) := A(\sigma^{n-1}(x)) \cdots A(x) \text{ if } n > 0,$$
and you take the appropriate definition for $n < 0$. A constant $0 < p < 1$ generates a (Bernoulli) probability measure $\mu$ on $X$ in the usual sense; for example, if I fix finitely many coordinates of $x$, with $k$ of them equal to $1$ and $m$ of them equal to $2$, then the $\mu$-measure of the corresponding cylinder set is $p^k (1-p)^m$.
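To make the setup concrete, here is a minimal numerical sketch in Python. The specific matrices `A1`, `A2` and the value of `p` below are assumed example choices (nothing above pins them down); all that matters is that they lie in $\text{SL}(2,\mathbb{R})$ and behave as described later.

```python
# A minimal sketch of the setup. A1, A2 and p are assumed example choices:
# A1 is hyperbolic with eigendirections along the coordinate axes, and A2
# rotates by 90 degrees (so it swaps those two directions); both have det 1.
import numpy as np

rng = np.random.default_rng(0)
p = 0.5  # the Bernoulli parameter, any 0 < p < 1 works

A1 = np.array([[2.0, 0.0], [0.0, 0.5]])
A2 = np.array([[0.0, -1.0], [1.0, 0.0]])

def sample_x(n):
    """Sample the coordinates x_0, ..., x_{n-1} of a mu-typical point of X."""
    return rng.choice([1, 2], size=n, p=[p, 1 - p])

def cocycle(x, n):
    """A(x, n) = A(sigma^{n-1}(x)) ... A(x); only x_0, ..., x_{n-1} are used."""
    M = np.eye(2)
    for k in range(n):
        M = (A1 if x[k] == 1 else A2) @ M
    return M
```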
Define
$$ \lambda : X \times (\mathbb{R}^2 \setminus \{0\}) \rightarrow \mathbb{R}, \ \ \lambda(x,v) := \limsup_{n \rightarrow \infty} \frac{1}{n}\log(\|A(x,n)v\|).$$
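Continuing the sketch above (so `A1`, `A2`, `sample_x` still refer to the assumed choices there), one can estimate this quantity along a single sample path; renormalizing $v$ at each step keeps the computation stable without changing the value, since only the accumulated log growth matters.

```python
# Estimate (1/n) log ||A(x, n) v|| by accumulating one-step growth factors,
# renormalizing v after each step to avoid overflow for large n.
def lyapunov_estimate(n, v=np.array([1.0, 0.0])):
    x = sample_x(n)
    total = 0.0
    for k in range(n):
        v = (A1 if x[k] == 1 else A2) @ v
        norm = np.linalg.norm(v)
        total += np.log(norm)
        v = v / norm
    return total / n

print(lyapunov_estimate(100_000))  # for these A1, A2 the estimates hover near 0
```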
Oseledets' theorem (together with the ergodicity of $\mu$) tells us that $\lambda(x,v)$ takes only two values $\pm\lambda$ for some constant $\lambda \geq 0$, and that if $\lambda > 0$ then there is a distribution $x \mapsto E^-_x$ so that for almost every $x \in X$ the following holds:
$$\lambda(x,v) = \begin{cases} \lambda &\text{if } v \in \mathbb{R}^2 \setminus E^-_x\\ - \lambda &\text{if } v \in E^-_x \setminus \{0\}.\end{cases}$$
This distribution is nice, in the sense that it is invariant and measurable. Since every non-zero vector in $E^-_x$ gives the same value, this suggests that instead of thinking about the vector space, we should maybe think about projective space (or the space of directions). For $v,w \in \mathbb{R}^2 \setminus \{0\}$, we write $v \sim w$ if $v = c w$ for some nonzero $c \in \mathbb{R}$. Define projective space by
$$ \mathbb{P}^1 := (\mathbb{R}^2 \setminus \{0\})/\sim.$$
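If it helps to compute with this, here is one possible (assumed) encoding of $\mathbb{P}^1$: represent a class $[v]$ by a unit vector with a fixed sign convention, so that every nonzero multiple of $v$ gets the same representative.

```python
# One way to encode P^1 and the induced action; the sign convention is an
# arbitrary choice of representative for each class [v].
import numpy as np

def proj(v):
    """Representative of the class [v]: unit length, first nonzero coordinate positive."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return -v if (v[0] < 0 or (v[0] == 0 and v[1] < 0)) else v

def act(A, q):
    """Induced action A([v]) := [A v] on P^1."""
    return proj(A @ q)
```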
Abusing notation, it is clear that our matrices act on projective space by $A_i([v]) := [A_i(v)]$, where here $[v] := \{w \in \mathbb{R}^2 \ | \ v \sim w\}.$ Now, if we abuse Oseledets' theorem, we can see that our Lyapunov exponent $\lambda$ is really a function on $X \times \mathbb{P}^1$, and instead of a distribution we're really thinking about a "nice" function $p_- : X \rightarrow \mathbb{P}^1$ which picks out the direction corresponding to $-\lambda$ (and to make it worse, all of this is only almost everywhere). To be precise, we now have
$$ \lambda : X \times \mathbb{P}^1 \rightarrow \mathbb{R}, \ \ \lambda(x,p) := \limsup_{n \rightarrow \infty} \frac{1}{n}\log \left(\|A(x,n)v\| \right) \text{ for some } v \in p.$$
It is an easy exercise to show it is well-defined, independent of the choice of $v \in p$.
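Spelled out, the exercise comes down to the fact that rescaling the representative only contributes a vanishing term: for $c \neq 0$,
$$ \frac{1}{n}\log\|A(x,n)(cv)\| = \frac{1}{n}\log|c| + \frac{1}{n}\log\|A(x,n)v\|,$$
and $\frac{1}{n}\log|c| \rightarrow 0$, so the limsup is the same for every $v \in p$.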
Let $p_1, p_2 \in \mathbb{P}^1$ correspond to the eigenspaces of $A_1$. Notice that $\{p_1, p_2\}$ is the only nonempty finite subset of $\mathbb{P}^1$ invariant under both $A_1$ and $A_2$, so if we are going to have our distribution $p_-$ from earlier, then it must take values in this set. Notice also that we have the relations
$$A_1(p_1) = p_1, \ A_1 (p_2) = p_2, \ A_2(p_2) = p_1, \ A_2 (p_1) = p_2.$$
Let $C_i := \{x \in X \ | \ x_0 = i\}$, so $X = C_1 \sqcup C_2$.
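As a quick sanity check (reusing `A1`, `A2`, `sample_x`, `cocycle`, `proj`, `act` from the sketches above, so the concrete matrices remain an assumption), one can verify these relations numerically, together with the one-step identity $A(x,n) = A(\sigma(x), n-1)A(x)$ that drives the computation below.

```python
# Check the relations A1(p1)=p1, A1(p2)=p2, A2(p1)=p2, A2(p2)=p1 for the assumed
# matrices, and the one-step cocycle identity A(x, n) = A(sigma(x), n-1) A(x).
p1, p2 = proj([1.0, 0.0]), proj([0.0, 1.0])  # eigendirections of A1

assert np.allclose(act(A1, p1), p1) and np.allclose(act(A1, p2), p2)
assert np.allclose(act(A2, p1), p2) and np.allclose(act(A2, p2), p1)

n = 50
x = sample_x(n)
first = A1 if x[0] == 1 else A2
assert np.allclose(cocycle(x, n), cocycle(x[1:], n - 1) @ first)
```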
Now, if $x \in C_1$, then
$$ \lambda(x,p_1) = \limsup_{n \rightarrow \infty} \frac{1}{n}\log \left( \|A(\sigma(x), n-1) A_1 v\|\right) \text{ for some } v \in p_1.$$
Say $p_1$ is the eigenspace of $A_1$ for the eigenvalue $\alpha$ (we write $\alpha$ rather than $\sigma$, since $\sigma$ already denotes the shift), so $A_1 v = \alpha v$. This gives us
$$ \lambda(x,p_1) = \limsup_{n \rightarrow \infty} \frac{1}{n}\log \left( |\alpha| \, \|A(\sigma(x), n-1) v\| \right) \text{ for some } v \in p_1.$$
Simplifying this yields $\lambda(x, p_1) = \lambda(\sigma(x), p_1)$, since the contribution $\frac{1}{n}\log|\alpha|$ vanishes in the limit. A similar argument shows that if $x \in C_2$, then $\lambda(x, p_1) = \lambda(\sigma(x), p_2)$. Using a conditioning trick (i.e. noting that $\lambda(\sigma(x), \cdot)$ depends only on the coordinates $x_1, x_2, \dots$, which are independent of $x_0$), we have
$$ \begin{aligned} \int_X \lambda(x,p_1) d\mu(x) &= \int_{C_1} \lambda(x, p_1) d\mu(x) + \int_{C_2} \lambda(x, p_1) d\mu(x) \\ &= \int_{C_1} \lambda(\sigma(x), p_1) d\mu(x) + \int_{C_2} \lambda(\sigma(x), p_2) d\mu(x) \\ &= p\int_{X} \lambda(x,p_1) d\mu(x) + (1-p)\int_{X} \lambda(x, p_2) d\mu(x). \end{aligned}$$
We conclude that
$$\int_X \lambda(x,p_2) d\mu(x) = \int_X \lambda(x,p_1) d\mu(x)$$
as long as $0 < p < 1.$ Suppose now that $\lambda > 0$. Then for almost every $x$ we have $\lambda(x, p_-(x)) = - \lambda$, while $\lambda(x, p) = \lambda$ for the other direction $p \neq p_-(x)$. On the other hand, we noted earlier that $p_-(x) \in \{p_1, p_2\}$. Let $E_i := \{x \in X \ | \ p_-(x) = p_i\}$, and note that $E_1 \sqcup E_2 = X$ up to a null set. Since $\lambda(x, p_1)$ equals $-\lambda$ on $E_1$ and $\lambda$ on $E_2$ (and symmetrically for $p_2$), we have
$$ (1- 2\mu(E_1)) \lambda = \int_X \lambda(x, p_1) d\mu(x) = \int_X \lambda(x, p_2) d\mu(x) = (1 - 2\mu(E_2))\lambda.$$
Rearrange to get
$$ \mu(E_1) = 1/2 = \mu(E_2)$$
as long as $\lambda \neq 0$. So, half the time $p_-(x)$ is equal to $p_1$, and the other half it is equal to $p_2$. But notice that this implies
$$ \int_X \lambda(x,p_1) d\mu(x) = \int_{E_1} \lambda(x,p_1) d\mu(x) + \int_{E_2} \lambda(x, p_1) d\mu(x) = -\frac{\lambda}{2} + \frac{\lambda}{2} = 0,$$
and hence also $\int_X \lambda(x, p_2) d\mu(x) = 0$.
On the other hand,
$$ - \lambda = \int_X \lambda(x, p_-(x)) d\mu(x) = \frac{1}{2}\int_{X} \lambda(x, p_1) d\mu(x) + \frac{1}{2}\int_X \lambda(x, p_2)d\mu(x) = 0.$$
Notice that in some sense this recreates the stationary measure calculation.
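For comparison, here is that stationary-measure calculation done explicitly for the assumed matrices above: on directions, the random product induces a two-state Markov chain on $\{p_1, p_2\}$ (stay with probability $p$, swap with probability $1-p$), whose stationary distribution is $(\tfrac12, \tfrac12)$, and the corresponding average of the one-step expansion rates vanishes.

```python
# Stationary measure on {p1, p2} and the Furstenberg-type average, for the
# assumed A1, A2 (repeated here so this block runs on its own).
import numpy as np

p = 0.5
A1 = np.array([[2.0, 0.0], [0.0, 0.5]])
A2 = np.array([[0.0, -1.0], [1.0, 0.0]])
p1, p2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# Two-state chain on {p1, p2}: stay with prob. p, swap with prob. 1 - p.
P = np.array([[p, 1 - p],
              [1 - p, p]])
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()
print(pi)  # stationary distribution, approximately [0.5, 0.5] for any 0 < p < 1

# Average of log(||A v|| / ||v||) over the stationary measure (1/2)(delta_p1 + delta_p2)
# and the matrix distribution (p, 1 - p): the stretching and contraction cancel.
avg = sum(0.5 * prob * np.log(np.linalg.norm(A @ v) / np.linalg.norm(v))
          for v in (p1, p2) for A, prob in ((A1, p), (A2, 1 - p)))
print(avg)  # 0 up to floating point error
```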