
I studied the definitions of the Schur decomposition, the Cayley-Hamilton theorem, and the characteristic polynomial from the book by Horn & Johnson. Every definition and proof there uses a square matrix $A \in M_n$, where $M_n(\mathbb{C})$ is abbreviated as $M_n$.

My question is this: by the fundamental theorem of algebra (and the links provided here in the wiki), the characteristic polynomial always has $n$ roots, so I can prove the Schur decomposition and, from it, the Cayley-Hamilton theorem. However, for a matrix $A \in M_n(\mathbb{R})$, where $\mathbb{R}$ is the real field and is not algebraically closed, do the above statements still hold? If so, how do I go about proving them?

I saw a very similar question here at math.stackexchange, but it confused me more.

Any help would be appreciated.

roni

1 Answer

  1. As you have noted, $\mathbb R$ is not algebraically closed. Therefore real matrices don't always have a Schur decomposition over $\mathbb R$. There is, however, an analogous decomposition over $\mathbb R$ that turns a matrix into a block-upper-triangular form, in which every diagonal block is either a trivial block (i.e. a real scalar) or a $2\times2$ block $\pmatrix{a&-b\\ b&a}$ that represents a conjugate pair of nonreal eigenvalues $a\pm bi$. You may read Horn & Johnson about that; a numerical sketch follows this list.
  2. Using that analogous decomposition, you may prove Cayley-Hamilton pretty much like you do in the complex case. Note that, as the aforementioned $2\times2$ block lumps together a conjugate pair of eigenvalues, it is no longer annihilated by a single linear factor $x-\lambda$ but by the product of a pair of linear factors, $(x-a-bi)(x-a+bi)=(x-a)^2+b^2$, which is a real quadratic (see the sketch for point 2 below).
  3. There is, however, no need to consider a decomposition over $\mathbb R$ in the first place. A real matrix can be viewed as a complex matrix, and its characteristic polynomial $p$ is the same whether we work over $\mathbb R$ or over $\mathbb C$. If $p(A)=0$ holds over $\mathbb C$, it must hold over $\mathbb R$, so it suffices to consider the complex field only (sketch for point 3 below).
  4. To prove the Cayley-Hamilton theorem over $\mathbb C$, it's actually easier to observe that the theorem is trivial for diagonalisable matrices and that diagonalisable matrices are dense in $M_n(\mathbb C)$. Hence, by continuity, the theorem holds for every complex square matrix (sketch for point 4 below).
  5. In general, suppose you want to prove an algebraic identity of the form $f(a_1,\ldots,a_k)=g(a_1,\ldots,a_k)$ for all $a_1,a_2,\ldots,a_k$ taken from a commutative ring $R$, where $f$ and $g$ are polynomials with integer coefficients (here "integer" means a finite sum of 0s, 1s or $-1$s; it doesn't matter if the ring has finite characteristic). If you can prove that the same identity holds whenever $a_1,\ldots,a_k$ are taken from a nonempty open subset of $\mathbb C$, the identity automagically becomes true for every commutative ring (sketch for point 5 below).
  6. If you'd like to go more abstract, you may also prove that $f(X_1,\ldots,X_k)=g(X_1,\ldots,X_k)$ directly for $f,g\in\mathbb Z[X_1,\ldots,X_k]$, where $X_1,\ldots,X_k$ denote $k$ different indeterminates. This and the idea in (5) are collectively called the method of universal identities; see, e.g., the expository paper Universal Identities by Keith Conrad. The proof of the Cayley-Hamilton theorem mentioned in (4) using diagonalisable matrices, for instance, has a more elegant counterpart in this setting. See the highest-voted comment in this answer on MO (the proof is in the comment, not in the answer) and this answer on MSE (sketch for point 6 below).
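
To make point 1 concrete, here is a minimal numerical sketch of the real Schur form, assuming NumPy and SciPy are available; the matrix `R` is just an arbitrary example with a conjugate pair of nonreal eigenvalues:

```python
# Real Schur form of a real matrix with nonreal eigenvalues (point 1).
# R is an arbitrary example: a 90-degree rotation, eigenvalues +/- i.
import numpy as np
from scipy.linalg import schur

R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

T, Q = schur(R, output='real')  # R = Q @ T @ Q.T, T quasi-upper-triangular
print(T)  # a single 2x2 block of the form [[a, -b], [b, a]], here a=0, b=1
```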
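
The annihilation claim in point 2 can be checked directly: the block $B=\pmatrix{a&-b\\ b&a}$ satisfies $(B-aI)^2+b^2I=0$. A small sketch, with arbitrarily chosen values for $a$ and $b$:

```python
# The block [[a, -b], [b, a]] is killed by (x-a)^2 + b^2 (point 2).
import numpy as np

a, b = 2.0, 3.0                      # arbitrary example values
B = np.array([[a, -b],
              [b,  a]])
I = np.eye(2)
print((B - a * I) @ (B - a * I) + b**2 * I)  # the zero matrix
```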
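
For point 3, one can verify $p(A)=0$ numerically for a real matrix whose characteristic polynomial has no real roots at all; `np.poly` returns the coefficients of the characteristic polynomial of a square matrix:

```python
# Cayley-Hamilton for a real matrix with no real eigenvalues (point 3).
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])       # characteristic polynomial x^2 + 1
coeffs = np.poly(A)               # [1, 0, 1], highest degree first
n = A.shape[0]
# Evaluate p(A) = A^n + c_1 A^(n-1) + ... + c_n I.
P = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
print(P)  # the zero matrix, even though p has no real roots
```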
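
The density argument in point 4 can also be illustrated, along the lines of the perturbation described in the comments below: a Jordan block is not diagonalisable, but small diagonal perturbations of it have distinct eigenvalues, hence are diagonalisable, and converge to it:

```python
# A non-diagonalisable matrix as a limit of diagonalisable ones (point 4).
import numpy as np

J = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # Jordan block, not diagonalisable
for t in [1e-1, 1e-2, 1e-3]:
    J_t = J + np.diag([t, t**2])    # distinct eigenvalues for small t > 0
    print(t, np.linalg.eigvals(J_t))
```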
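
As an instance of point 5: Cayley-Hamilton, once proved over $\mathbb C$, holds verbatim over any commutative ring. Here is a sketch checking it for an arbitrarily chosen $2\times2$ matrix over $\mathbb Z/6\mathbb Z$, a ring that is not even a field:

```python
# Cayley-Hamilton over the commutative ring Z/6Z (point 5).
m = 6
A = [[2, 5], [3, 4]]                               # arbitrary entries mod 6
tr = (A[0][0] + A[1][1]) % m
det = (A[0][0] * A[1][1] - A[0][1] * A[1][0]) % m

def matmul(X, Y):
    """2x2 matrix product with entries reduced mod m."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) % m
             for j in range(2)] for i in range(2)]

A2 = matmul(A, A)
# p(A) = A^2 - tr(A) A + det(A) I, reduced mod m
P = [[(A2[i][j] - tr * A[i][j] + det * (i == j)) % m
      for j in range(2)] for i in range(2)]
print(P)  # [[0, 0], [0, 0]]
```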
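
And for point 6, the universal identity can be verified symbolically, assuming SymPy is available: take a generic $2\times2$ matrix whose entries are indeterminates and check that substituting it into its own characteristic polynomial gives the zero matrix in $\mathbb Z[a,b,c,d]$:

```python
# Cayley-Hamilton as a universal identity in Z[a, b, c, d] (point 6).
import sympy as sp

a, b, c, d = sp.symbols('a b c d')
M = sp.Matrix([[a, b], [c, d]])
# p(x) = x^2 - tr(M) x + det(M) for a 2x2 matrix; substitute M for x.
P = M**2 - M.trace() * M + M.det() * sp.eye(2)
print(P.expand())  # Matrix([[0, 0], [0, 0]])
```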
user1551
  • Thanks for your answer! Can you clarify point 3? What are the roots of the equation $p(A)=0$ when it is defined over $\mathbb R$? Surely I do not have $n$ roots all the time? – roni Jul 03 '16 at 05:56
  • @roni Of course you don't have a full set of eigenvalues all the time, but the statement of the Cayley-Hamilton theorem does not require the existence of eigenvalues. It's all about the characteristic polynomial, which is always defined (by $p(x)=\det(xI-A)$) regardless of the field. – user1551 Jul 03 '16 at 06:05
  • Thanks! Could you please explain what you mean by "diagonalisable matrices are dense in $M_n(\mathbb C)$"? – roni Jul 03 '16 at 06:12
  • @roni It means the closure of the set of all diagonalisable matrices is the whole matrix space, i.e. every $A$ is the limit of a sequence of diagonalisable matrices. By a change of basis, suppose $A$ is triangular. Then $A_t=A+\operatorname{diag}(t,t^2,t^3,\ldots,t^n)$ has $n$ distinct eigenvalues when $t>0$ is sufficiently small. Hence $A_t$ is diagonalisable for all small $t>0$. Consequently, $A$ is the limit of a sequence of diagonalisable matrices. – user1551 Jul 03 '16 at 06:37