I often find myself forgetting how to prove that every square matrix has a Schur factorization, because I never really understood the motivation behind the steps; I only memorized them. I would like to rectify this. Here's the proof I am familiar with:
We proceed by induction on the size of the matrix; the $1 \times 1$ case is trivial. Suppose every $n \times n$ matrix has a Schur decomposition, take an $(n+1) \times (n+1)$ matrix $A$, and write $$A = \begin{bmatrix} a & v^*\\ u & B \end{bmatrix}$$ where $a \in \mathbb{C}$, $u, v$ are vectors, and $B$ is $n \times n$. Let $v_1$ be a normalized eigenvector of $A$ with corresponding eigenvalue $\lambda$ (one exists because we work over $\mathbb{C}$) and extend $v_1$ to an orthonormal basis $v_1, \dots, v_{n+1}$ of $\mathbb{C}^{n+1}$. Define the unitary matrix $$V = \begin{bmatrix}v_1 \ \ v_2 \ \ \cdots \ \ v_{n+1} \end{bmatrix} = \begin{bmatrix} v_1 \ \ V_2\end{bmatrix}.$$ Since $Av_1 = \lambda v_1$, $v_1^*v_1 = 1$, and $V_2^*v_1 = 0$, we have $$V^*AV = \begin{bmatrix} v_1^* \\ V_2^*\end{bmatrix} A \begin{bmatrix} v_1 \ \ V_2 \end{bmatrix} = \begin{bmatrix} \lambda & v_1^*AV_2\\ \lambda V_2^*v_1 & V_2^*AV_2 \end{bmatrix} = \begin{bmatrix} \lambda & w^*\\ 0 & C \end{bmatrix}$$ where $w^* = v_1^*AV_2$ and $C = V_2^*AV_2$. Now $C$ is $n \times n$, so by the induction hypothesis it has a Schur decomposition $C = QUQ^*$. Thus $$V^*AV = \begin{bmatrix} 1 & 0^T\\ 0 & Q \end{bmatrix} \begin{bmatrix} \lambda & w^*Q\\ 0 & U \end{bmatrix} \begin{bmatrix} 1 & 0^T\\ 0 & Q^* \end{bmatrix}.$$ The middle matrix on the right is upper triangular, and a product of unitary matrices is unitary, so solving for $A$ gives a Schur decomposition of $A$: $$A = \left(V\begin{bmatrix} 1 & 0^T\\ 0 & Q \end{bmatrix}\right) \begin{bmatrix} \lambda & w^*Q\\ 0 & U \end{bmatrix} \left(V\begin{bmatrix} 1 & 0^T\\ 0 & Q \end{bmatrix}\right)^*.$$
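To check my reading of the proof numerically, here is a minimal NumPy sketch of the same induction written as a recursion. The function name `schur_by_deflation` is my own (hypothetical), and two choices are assumptions rather than part of the proof: I get the eigenvector from `np.linalg.eig`, and I extend it to an orthonormal basis with a QR factorization. It is meant as an illustration only, not a numerically robust Schur algorithm.

```python
import numpy as np

def schur_by_deflation(A):
    """Return (Q, T) with A ~= Q @ T @ Q^*, Q unitary, T upper triangular."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    if n == 1:
        # Base case: a 1x1 matrix is already upper triangular.
        return np.eye(1, dtype=complex), A.copy()

    # Pick one normalized eigenvector v1 of A (assumption: taken from eig).
    _, vecs = np.linalg.eig(A)
    v1 = vecs[:, 0] / np.linalg.norm(vecs[:, 0])

    # Extend v1 to an orthonormal basis: QR of [v1 | I] gives a unitary V
    # whose first column is v1 up to a unit-modulus factor.
    V, _ = np.linalg.qr(np.column_stack([v1, np.eye(n)]))

    # M = V^* A V has the block form [[lambda, w^*], [0, C]] up to rounding.
    M = V.conj().T @ A @ V
    C = M[1:, 1:]

    # Induction hypothesis: C = Q_C U Q_C^*.
    Q_C, U = schur_by_deflation(C)

    # Assemble Q = V @ diag(1, Q_C) and T = [[lambda, w^* Q_C], [0, U]].
    P = np.eye(n, dtype=complex)
    P[1:, 1:] = Q_C
    Q = V @ P

    T = np.zeros((n, n), dtype=complex)
    T[0, 0] = M[0, 0]
    T[0, 1:] = M[0, 1:] @ Q_C
    T[1:, 1:] = U
    return Q, T
```

Calling `Q, T = schur_by_deflation(A)` on a small random complex matrix and comparing `Q @ T @ Q.conj().T` with `A` is a quick way to see that each step of the proof does what it claims.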
My Analysis of the Proof:
The proof starts by induction, which I can accept, as that is a standard technique in linear algebra. It is then natural (but unsuccessful) to write $B$ as its Schur factorization and try to factor out a unitary matrix $K$ on the left and its adjoint $K^*$ on the right; this doesn't quite work. Instead, we have to conjugate $A$ by a strange unitary matrix $V$ (in which only the first column really matters, not the others), and only then does the above trick work.
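To make the failure of the naive attempt concrete, here is a small numerical sketch. It conjugates $A$ by $\begin{bmatrix} 1 & 0^T \\ 0 & K \end{bmatrix}$ for a unitary $K$; as an assumption for illustration, $K$ is just a random unitary matrix obtained from a QR factorization, standing in for the unitary factor of a Schur factorization of $B$. The block below the top-left entry becomes $K^*u$, which has the same norm as $u$, so it cannot vanish unless $u$ was already zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# A random (n+1) x (n+1) complex matrix, partitioned as [[a, v^*], [u, B]].
A = rng.standard_normal((n + 1, n + 1)) + 1j * rng.standard_normal((n + 1, n + 1))
u = A[1:, 0]

# Any unitary K (assumption: a stand-in for the unitary factor of B).
K, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

# Conjugate A by diag(1, K).
P = np.eye(n + 1, dtype=complex)
P[1:, 1:] = K
M = P.conj().T @ A @ P

# The entries below the (1,1) position are K^* u, with the same norm as u.
below = M[1:, 0]
print(np.linalg.norm(u), np.linalg.norm(below))
```

The two printed norms agree, which is why the first column has to be handled before the lower-right block: a conjugation that leaves the first coordinate alone can never zero out $u$.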
It seems like the key step is to conjugate $A$ by this strange matrix $V$. What is the motivation behind this matrix? How could we have known that conjugating $A$ by $V$ would allow us to factor $V^*AV$ in a useful way?
Thank you very much.