
Say $A$ is a square matrix over an algebraically closed field. Say $m$ is the minimal polynomial and $p$ is the characteristic polynomial.

Of course the Cayley-Hamilton theorem (C-H) implies that $m|p$, since $m$ divides every polynomial that annihilates $A$. Conversely, if we can show $m|p$ then C-H follows; the question is whether one can give a "simple", "elementary" or "straightforward" proof that $m|p$.

Note. What I really want is a proof such that I feel I actually understand the whole thing. Hence in particular no Jordan form allowed.

Edit. An answer has appeared that shows $m|p$ in a very simple way; it simply demolishes what I wrote below.

Edit. When I posted this it was an honest question that I didn't know the answer to. I think I got it; if anyone wants to say they believe the argument below (or not) that would be great.

First, it's clear that linear factors of $m$ must divide $p$:

If $m(\lambda)=0$ then $p(\lambda)=0$.

Indeed, $m(t)=(t-\lambda)r(t)$ for some $r$, so $(A-\lambda)r(A)=0$. Minimality of $m$ shows that $r(A)\ne0$ (note $\deg r<\deg m$), hence $A-\lambda$ is not invertible, hence $p(\lambda)=0$.
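To spell out the last step (the vector $v$ here is my own notation, introduced just for this): since $r(A)\ne0$ it has a non-zero column, say $v=r(A)e_j$, and then $$(A-\lambda)v=(A-\lambda)r(A)e_j=0,\qquad v\ne0,$$ so $A-\lambda$ has non-trivial kernel and $p(\lambda)=\det(\lambda I-A)=0$.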

If we could show that $(t-\lambda)^k|m$ implies $(t-\lambda)^k|p$ we'd be set. Some possible progress on that, first restricted to a simple special case:

If $t^2|m(t)$ then $\dim(\ker(A^2))\ge 2$.

Proof: Say $X=K^n$ is the underlying vector space. Say $m(t)=t^2q(t)$. Let $$Y=q(A)X,$$ $$B=A|_Y.$$ (Note $Y$ is $A$-invariant, since $Aq(A)=q(A)A$, so $B$ really is an operator on $Y$.) Then $Y\subset\ker(A^2)$, because $A^2q(A)=m(A)=0$. Say $d=\dim(Y)$.

Now $B^2=0$, and it follows easily that $B^d=0$. But $B\ne0$, since otherwise $tq(t)$, which has degree less than $\deg m$, would annihilate $A$; hence $d\ge2$.
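The standard fact used here, that a nilpotent operator $B$ on a $d$-dimensional space $Y$ satisfies $B^d=0$, can be seen from the kernel chain (a sketch): $$\ker(B)\subseteq\ker(B^2)\subseteq\ker(B^3)\subseteq\cdots\subseteq Y.$$ Once two consecutive kernels coincide the chain is constant from then on (easy check); since $B^N=0$ for some $N$ the chain eventually reaches $Y$, and each step before it stabilizes raises the dimension by at least $1$, so it reaches $Y$ by step $d$ at the latest. Hence $B^d=0$.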

Similarly

If $(t-\lambda)^k|m$ then $\dim(\ker((A-\lambda)^k))\ge k$.
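As I read the "similarly", the same argument runs as follows for general $k$ (my reconstruction of the omitted details): write $m(t)=(t-\lambda)^kq(t)$, let $Y=q(A)X$ and $B=(A-\lambda)|_Y$. Then $B^k=0$ because $(A-\lambda)^kq(A)=m(A)=0$, so $Y\subseteq\ker((A-\lambda)^k)$; and $B^{k-1}\ne0$, since otherwise $(t-\lambda)^{k-1}q(t)$, of degree less than $\deg m$, would annihilate $A$. If $d=\dim(Y)$ then $B^d=0$ as above, and $B^{k-1}\ne0$ forces $d\ge k$. Hence $\dim(\ker((A-\lambda)^k))\ge d\ge k$.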

So we only need

If $\dim(\ker((A-\lambda)^k))\ge k$ then $(t-\lambda)^k|p$.

Which I gather is true, but only by hearsay; I'm sort of missing what it "really means" to say $t^2|p$.

Wait, I think I got it. Say $$m(t)=(t-\lambda)^kq(t),$$ $$q(\lambda)\ne0.$$ The "kernel lemma" shows that $$X=\ker((A-\lambda)^k)\oplus\ker(q(A))=X_1\oplus X_2.$$
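For completeness, here is the kernel-lemma computation written out (this is where Bézout in the PID $K[t]$ comes in; a sketch): since $(t-\lambda)^k$ and $q(t)$ are coprime there are $a,b\in K[t]$ with $$a(t)(t-\lambda)^k+b(t)q(t)=1,$$ hence for every $x\in X$ $$x=a(A)(A-\lambda)^kx+b(A)q(A)x.$$ The first summand lies in $\ker(q(A))$ and the second in $\ker((A-\lambda)^k)$, because $m(A)=0$; and if $x$ lies in both kernels the same identity gives $x=0$, so the sum is direct.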

Each $X_j$ is $A$-invariant, so we can define $$B_j=A|_{X_j}.$$ Since similar matrices have the same characteristic polynomial we can use any basis we like in calculating $p(t)=\det(tI-A)$; if we use a basis compatible with the decomposition $X=X_1\oplus X_2$ it's clear that $$p_A=p_{B_1}p_{B_2},$$ so we need only show that $$p_{B_1}(t)=(t-\lambda)^k.$$ In fact it's enough to show $(t-\lambda)^k|p_{B_1}$, and that's clear:
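Spelled out: in a basis adapted to $X=X_1\oplus X_2$ (say $d_j=\dim(X_j)$, notation I'm adding here) the matrix of $A$ is block diagonal, so $$p_A(t)=\det(tI-A)=\det\begin{pmatrix}tI_{d_1}-B_1&0\\0&tI_{d_2}-B_2\end{pmatrix}=p_{B_1}(t)\,p_{B_2}(t).$$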

Lemma. If $B$ is a $d\times d$ nilpotent matrix then $p_B(t)=t^d$.

Proof: We're still assuming $K$ is algebraically closed, so $p_B$ splits into linear factors; and $B$ cannot have a non-zero eigenvalue (if $Bv=\mu v$ with $v\ne0$ and $B^N=0$ then $\mu^Nv=0$, so $\mu=0$). Hence $0$ is the only root of $p_B$, and $p_B(t)=t^d$.

So if $d=\dim(\ker((A-\lambda)^k))$ then $$p_{B_1}(t)=(t-\lambda)^d;$$we've already shown that $d\ge k$, so $(t-\lambda)^k|p$.
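The application of the lemma uses a small change of variable, which may be worth recording explicitly: $B_1-\lambda$ is nilpotent on $X_1$, since $X_1=\ker((A-\lambda)^k)$, and hence $$p_{B_1}(t)=\det(tI-B_1)=\det\big((t-\lambda)I-(B_1-\lambda)\big)=p_{B_1-\lambda}(t-\lambda)=(t-\lambda)^d.$$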

Hmm. Maybe that doesn't look all that simple. It's nonetheless the sort of thing I wanted, because I can give a one-line summary making it at least comprehensible:

One-line summary: Since $m$ splits, the kernel lemma (a simple consequence of the fact that $K[t]$ is a PID) shows that $A$ is the direct sum of operators $B_j$ such that $B_j-\lambda_j$ is nilpotent. So it's enough to prove C-H for nilpotent operators, which is not hard.

1 Answer


Let $\lambda$ be any eigenvalue of a minimal counterexample $A$ (one of least dimension) and choose a basis so that $$A=\begin{pmatrix}\lambda&*\\0&B\end{pmatrix}.$$

Let $m(x)$ and $n(x)$ be the minimal polynomials of $A$ and $B$, respectively. Let $q(x)$ be the characteristic polynomial of $B$; since $B$ is smaller than the minimal counterexample $A$, we can suppose that $n(x)|q(x)$.

Then $$(A-\lambda)n(A)=\begin{pmatrix}0&*\\0&*\end{pmatrix}\begin{pmatrix}n(\lambda)&*\\0&0\end{pmatrix}=0$$ and therefore $m(x)|n(x)(x-\lambda)|q(x)(x-\lambda)=p(x)$.
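To make the block computations explicit (my elaboration; $f$ denotes an arbitrary polynomial): block upper triangularity gives $A^j=\begin{pmatrix}\lambda^j&*\\0&B^j\end{pmatrix}$ and hence $$f(A)=\begin{pmatrix}f(\lambda)&*\\0&f(B)\end{pmatrix},$$ which is why $n(A)$ has lower-right block $n(B)=0$; and expanding $\det(xI-A)$ along the first column gives $$p(x)=(x-\lambda)\det(xI-B)=(x-\lambda)q(x).$$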

  • Not that it matters, but if $Ae_1=\lambda e_1$ then $A=\begin{pmatrix}\lambda&*\\0&B\end{pmatrix}.$ – David C. Ullrich Sep 28 '19 at 18:58
  • The edits are fine - I was puzzled by a lot in the first version – David C. Ullrich Sep 28 '19 at 18:59
  • This presumes the existence of an eigenvalue, which makes it hard to generalize and also not quite self-contained. – darij grinberg Sep 28 '19 at 19:18
  • One can choose a root of the minimal polynomial since it's easy to show this is an eigenvalue. I originally did this in the proof and, in view of your comment, perhaps I should have retained this approach. –  Sep 28 '19 at 19:21
  • @darijgrinberg That's no problem! Whether or not $p(A)=0$ is not changed by replacing $K$ by its algebraic closure. So wlog $K$ is algebraically closed. So $m$ has a zero, and it's easy to show that $m(\lambda)=0$ implies $\lambda$ is an eigenvalue. – David C. Ullrich Sep 29 '19 at 11:29
  • @DavidC.Ullrich Back on the grid now and would like to check something with you re. (∗,0). When teaching I have become used to 'your' way round of writing such matrices. However, my 'error' was not a careless one as I had thought - it is the way I had learnt when studying the theory of canonical forms etc. from Herstein 'Topics in Algebra', pages 225 and 229 in particular. So my question is whether this is just a matter of taste or has your preferred way become so accepted in the years since I learnt Linear algebra that my proof will look wrong to people? –  Sep 29 '19 at 20:48
  • @S.Dolan I don't have a copy of the book. I conjecture that your point is you were regarding $A$ as defining a linear operator $T_A$ by $T_Ax=xA$ instead of $Ax$. If that's not your point you might explain further. If that is your point: I'm no expert on linear algebra, but in my experience it's always $Ax$; that seems universal to me, so your argument does seem wrong to me... – David C. Ullrich Sep 30 '19 at 10:46
  • @DavidC.Ullrich Yes, Herstein defines an eigenvalue of a linear transformation in terms of $vT=\lambda v$ and then defines the matrix of $T$ relative to a basis so that $xA$ rather than $Ax$ is being used. –  Sep 30 '19 at 16:11
  • I always use your order nowadays but I think that when thinking more abstractly about your interesting challenge I subconsciously reverted to an earlier approach which was, of course, consistent with Herstein's lower triangular form and the way he laid out Jordan blocks. Anyway, I've changed it now! –  Sep 30 '19 at 16:15
  • Whatever. The proof is great - it's the one I wanted, showing that C-H really is much more elementary than, say, the Jordan form. Heh, the reason I missed it is that the only way I could see to say anything about $\det A$ in terms of smaller matrices was to get a block-diagonal version - turns out "block-upper-triangular" works just as well. Fabulous. – David C. Ullrich Oct 01 '19 at 13:46
  • I'm glad to be of help. By coincidence, I was taking a class today (and using composition of transformations just as an example) and one student was really disconcerted that in the textbook ST meant do T first when he had been taught that you must work from left to right. I was glad that I had thought about this so recently! –  Oct 01 '19 at 16:40