Motivation behind the key step in proof of the Schur decomposition

Question

I often find myself forgetting how to prove that every square matrix has a Schur factorization because I never really understood the motivation behind the steps; I only memorized how to do it. I would like to rectify this. Here's the proof I am familiar with:

We proceed by induction. Suppose every $n \times n$ matrix has a Schur decomposition and take an $(n+1) \times (n+1)$ square matrix $A$ and write $$A = \begin{bmatrix} a & v^*\\ u & B \end{bmatrix}$$ where $a \in \mathbb{C}$, $u, v$ are vectors, and $B$ is $n \times n$. Let $v_1$ be a normalized eigenvector of $A$ with corresponding eigenvalue $\lambda$ and extend $v_1$ to an orthonormal basis $v_1, \dots, v_{n+1}$ of $\mathbb{C}^{n+1}$. Define the matrix $$V = \begin{bmatrix}v_1 \ \ v_2 \ \ \cdots \ \ v_{n+1} \end{bmatrix} = \begin{bmatrix} v_1 \ \ V_2\end{bmatrix}.$$ We have $$VAV^* = \begin{bmatrix} v_1^* \\ V_2^*\end{bmatrix} A \begin{bmatrix} v_1 \ \ V_2 \end{bmatrix} = \begin{bmatrix} \lambda & v_1^*AV_2\\ \lambda V_2^*v_1 & V_2^*AV_2 \end{bmatrix} = \begin{bmatrix} \lambda & w^*\\ 0 & C \end{bmatrix}$$ where $w, C$ are defined in the obvious way, and the lower-left block vanishes because the columns of $V_2$ are orthogonal to $v_1$. Now $C$ is $n \times n$, so it has a Schur decomposition $QUQ^*$. Thus $$VAV^* = \begin{bmatrix} 1 & 0^T\\ 0 & Q \end{bmatrix} \begin{bmatrix} \lambda & w^*Q\\ 0 & U \end{bmatrix} \begin{bmatrix} 1 & 0^T\\ 0 & Q^* \end{bmatrix}. $$ The middle matrix on the right is upper triangular, so solving for $A$ gives a Schur decomposition of $A$.

My Analysis of the Proof:

The proof starts by induction, which I can accept, as that is a standard technique in linear algebra. Then it's natural (but wrong) to write $B$ as its Schur factorization and try to factor out a unitary matrix $K$ on the left and its adjoint $K^*$ on the right, but this doesn't quite work. Instead, we have to conjugate $A$ by a strange unitary matrix $V$ (in which only the first column really matters, not the others), and only then does the above trick work.
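
To make the obstruction concrete, here is a sketch of the naive computation, assuming $B = QUQ^*$ is a Schur decomposition of the lower-right block (this $Q$ and $U$ belong to $B$, not to the $C$ used in the proof): $$A = \begin{bmatrix} a & v^*\\ u & QUQ^* \end{bmatrix} = \begin{bmatrix} 1 & 0^T\\ 0 & Q \end{bmatrix} \begin{bmatrix} a & v^*Q\\ Q^*u & U \end{bmatrix} \begin{bmatrix} 1 & 0^T\\ 0 & Q^* \end{bmatrix}.$$ The middle matrix is upper triangular only if $Q^*u = 0$, and since $Q$ is unitary that forces $u = 0$. So the naive factoring succeeds exactly when the first column of $A$ is already a multiple of $e_1$, which is what conjugating by $V$ arranges.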

It seems like the key step is to multiply $A$ by this strange matrix $V$. What is the motivation behind this matrix? How could we have known that conjugating $A$ by $V$ will allow us to factor $VAV^*$ in a useful way?

Thank you very much.

@runway44 Right I thought it has something to do with that, but I'm not sure how to articulate it. To be honest I don't really have an intuition for change of basis; that's also something I memorized like a list of facts: similar matrices have the same spectrum, the same rank, etc. — Helix, Apr 14 '21 at 18:59
it may be prudent to ignore the requirement of $V$ being unitary. The technique is over some algebraically closed field, (or any field where the char poly has all n roots / splits linearly) then every matrix has at least one eigenvector, so attack said eigenvector $A\mathbf x = \lambda \mathbf x$, then extend to a basis / invertible matrix $X$. Then recurse/use induction hypothesis. The result is $A=STS^{-1}$ and since you are in $\mathbb C$, run $QR$ factorization on $S$ to recover Schur's unitary triangularization. — user8675309, Apr 15 '21 at 19:24

score 1 · Answer 1 · answered Apr 15 '21 at 09:14

I'm using your notation.

Let $F=\mathbb{C}v_1$. Then $\mathbb{C}^{n+1}=F\oplus F^\perp$, where $F^\perp$ has dimension $n$.

So, if you glue $v_1$ and an orthonormal basis of $F^\perp$, you will get an orthonormal basis $\mathcal{B}$ of $\mathbb{C}^{n+1}$. Now the main point is that $Av_1=\lambda v_1$. So if we call $u$ the endomorphism represented by $A$ in the canonical basis, we have $u(v_1)=\lambda v_1$. So, by definition of a representative matrix, $Mat(u; \mathcal{B})=\begin{pmatrix}\lambda & w^* \cr 0 & C\end{pmatrix}$ (since $u(v_1)=\lambda v_1+0 v_2+\cdots+0v_{n+1}$).

If $V$ is the base change matrix, then $V$ is unitary (since it's a base change matrix between two orthonormal bases of $\mathbb{C}^{n+1}$), and the base change formula tells you that $V^{-1}AV=V^*AV =\begin{pmatrix}\lambda & w^* \cr 0 & C\end{pmatrix}$. (The fact that you have $V^*$ on the right is strange, btw.)
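
As a small sanity check of this formula, here is a $2\times 2$ instance. The matrix is an illustrative choice, not taken from the question: $$A=\begin{pmatrix}1 & 1 \cr -2 & 4\end{pmatrix},\qquad v_1=\tfrac{1}{\sqrt2}\begin{pmatrix}1 \cr 1\end{pmatrix},\qquad Av_1=2v_1.$$ Completing $v_1$ with $v_2=\tfrac{1}{\sqrt2}\begin{pmatrix}1 \cr -1\end{pmatrix}$ gives $V=\tfrac{1}{\sqrt2}\begin{pmatrix}1 & 1 \cr 1 & -1\end{pmatrix}$, and a direct computation yields $$V^*AV=\begin{pmatrix}2 & -3 \cr 0 & 3\end{pmatrix},$$ so here $\lambda=2$, $w^*=-3$ and $C=3$. The zero in the lower-left corner is $v_2^*Av_1=2\,v_2^*v_1=0$, which uses only that $v_1$ is an eigenvector and $v_2\perp v_1$.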

All in all, as runway44 said, this is just a base change matrix, plus the following general principle of linear algebra:

Let $E$ be a $K$-vector space of finite dimension, and let $u$ be an endomorphism of $E$. Assume that $E=F\oplus G$, with $u(F)\subset F$. Glue a basis $B_1$ of $F$ and a basis $B_2$ of $G$ to get a basis $B$ of $E$. Then $Mat(u; B)=\begin{pmatrix}A& A' \cr 0 & C\end{pmatrix}$, where $A$ is the matrix of the restriction of $u$ to $F$ in the basis $B_1$.
