
The Cayley-Hamilton theorem is equivalent to: Let $R$ be a commutative ring and let $M_n(R)$ be the ring of $n\times n$ matrices over $R$. Then the minimal polynomial of $A \in M_n(R)$ over $R$ divides the characteristic polynomial of $A$.

For instance, to reduce the confusion of substituting a matrix for $X$ in a polynomial, let $R'$ be the subring of scalar matrices $\{ a I : a \in R\}$, which is clearly isomorphic to $R$. Now consider the characteristic polynomial as an element of $R'[X]$.

RobPratt
  • Simpler, more abstract proof than which one? That is, which proof do you have in mind and that you find unsatisfactory? – M Turgeon Dec 13 '13 at 16:32
  • The one on wikipedia. – Daniel Donnelly Dec 13 '13 at 16:33
  • There are half a dozen on Wikipedia, of which I personally endorse the more abstract ones (having put them there). The proof with "polynomials with matrix coefficients" is seemingly the one you want, and is actually quite simple if you subtract the substantial fluff placed there for explanation. – Ryan Reich Dec 13 '13 at 16:35
  • @MTurgeon That proof is actually given in Wikipedia also, under "Preliminaries". – Ryan Reich Dec 13 '13 at 16:41
  • Also, phrasing this as a question of divisibility by the minimal polynomial is unproductive unless you have some other way of finding the minimal polynomial: otherwise, this is just the definition of the minimal polynomial. – Ryan Reich Dec 13 '13 at 16:56
  • I, personally, read the proof in the Schaum's Series book using the classical adjoint (adjugate they call it now in many places) matrix while a Freshman (decades ago) and fell in love with it since then. – DonAntonio Dec 13 '13 at 18:58

3 Answers


Here is my proof of the Cayley Hamilton theorem. I'll share the intuition behind it first:

Intuition in a Nutshell: For any endomorphism $\Phi$ of a finite free module, we have a factorization of $\text{det}(\Phi) I$ into the adjugate and the endomorphism itself: $$ \text{det}(\Phi) I = \text{adj}(\Phi) \circ \Phi$$ We want to use this to get a factorization of the characteristic polynomial $p(t)$ of an endomorphism $\phi : V \rightarrow V$ into some polynomial $f(t)$ analogous to the adjugate and a linear term $t - \phi $: $$p(t) = f(t)(t - \phi)$$ These two factorizations are analogous, and in fact, if we get the formalities right, we can view them as corresponding factorizations in the isomorphic rings $\text{End}(V \otimes k[t])$ and $\text{End}(V) \otimes k[t]$.

Let's try to work this out a little more formally. The main question is, what are the two isomorphic rings I mentioned in which these are corresponding factorizations?

Let $V$ be a finite dimensional vector space over a field $k$. One of the rings is $\text{End}_k(V)[t] = \text{End}_k(V) \otimes_k k[t]$. The characteristic polynomial $p(t) \in k[t]$ of $\phi \in \text{End}_k (V)$ naturally lives in $\text{End}_k(V)[t]$ via the natural map $k[t] \rightarrow \text{End}_k(V) \otimes_k k[t]$, $q \mapsto \text{Id}_V \otimes q$. In other words, we view $t \text{Id}_V - \phi $ as a polynomial with endomorphisms as coefficients, and then take the determinant, which is then regarded as an element of $\text{End}_k (V)[t]$.

The other ring is $\text{End}_{k[t]}(V \otimes_k k[t])$. $\Phi := 1 \otimes t - \phi \otimes 1 $ is an element in this ring, and we have a factorization $\text{det}(\Phi) 1_{V \otimes_k k[t]} = \text{adj}(\Phi) \Phi$.

Under the isomorphism $$\text{End}_{k[t]} ( V \otimes_k k[t]) \cong \text{End}_k (V)[t]$$ we have corresponding elements $$\Phi \leftrightarrow t - \phi$$ and $$\text{det}(\Phi)\, 1_{V \otimes_k k[t]} \leftrightarrow p(t)$$ Therefore, the factorization $\text{det}(\Phi) 1_{V \otimes_k k[t]} = \text{adj}(\Phi) \Phi$ corresponds to a factorization $p(t) = f(t)(t-\phi)$ in $\text{End}_k (V) [t]$. And that's the whole idea!
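As a sanity check of this correspondence, here is the case $\dim V = 2$ worked out by hand. This is an illustrative example rather than part of the argument; it assumes only the $2 \times 2$ identities $\text{adj}(\phi) = \text{tr}(\phi) - \phi$ and $\text{adj}(\phi)\,\phi = \text{det}(\phi)$, and uses that $t$ commutes with every coefficient. In this case $f(t) = \text{adj}(t - \phi) = t - \text{adj}(\phi)$, and \begin{align*} f(t)(t - \phi) &= (t - \text{adj}(\phi))(t - \phi)\\ &= t^2 - (\phi + \text{adj}(\phi))\, t + \text{adj}(\phi)\,\phi\\ &= t^2 - \text{tr}(\phi)\, t + \text{det}(\phi)\\ &= p(t). \end{align*}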


If you want a more formal version, and a construction of the claimed isomorphism, read on!

Theorem: Let $V$ be a finitely generated $k$-module. If $\phi : V \rightarrow V$ is a $k$-linear map, then the evaluation homomorphism $\text{ev}_{\phi} : k[t] \rightarrow \text{End}_k (V)$ sends the characteristic polynomial $\text{char}(\phi)$ to $0$.

Let's start by constructing an isomorphism $F : \text{End}_{k} (V)[t] \rightarrow \text{End}_{k[t]} (V \otimes_k k[t])$ as follows. We have isomorphisms $$\text{End}_{k[t]} (V \otimes_k k[t]) \cong \text{Hom}_k (V, \text{Hom}_{k[t]}(k[t], V \otimes_k k[t])) \cong \text{Hom}_k(V, V \otimes_k k[t])$$

These isomorphisms can be established by creating canonical maps in both directions and showing that they are inverse to each other. Now we have a canonical map in a single direction,

$$\text{End}_k (V) \otimes_k k[t] \rightarrow \text{Hom}_k(V, V \otimes_k k[t])$$

sending $\phi \otimes t^n$ to the map sending $v$ to $\phi(v)t^n$. This is injective, and surjective since $V$ is finitely generated. Composing these isomorphisms gives an isomorphism $F : \text{End}_{k} (V)[t] \rightarrow \text{End}_{k[t]} (V \otimes_k k[t])$.

Now we argue as before. View $t - \phi$ as a $k[t]$-linear endomorphism of $V \otimes_k k[t]$. Under the isomorphism $F$, $\text{char}(\phi)$ maps to $\text{det} (t - \phi)\, 1_{V \otimes_k k[t]}$ and $F ( t - \phi ) = t - \phi$. Now $t - \phi$ divides $\text{det}(t - \phi)\, 1_{V \otimes_k k[t]}$ in $\text{End}_{k[t]} (V \otimes_k k[t])$, since $\text{det} (t - \phi)\, 1_{V \otimes_k k[t]} = \text{adj}(t - \phi) (t - \phi)$, where $\text{adj}(t - \phi)$ is the adjugate matrix. Therefore, $t - \phi$ divides $\text{char}(\phi)$ in $\text{End}_{k}(V)[t]$. So $\text{char}(\phi)$ has $\phi$ as a root in $\text{End}_k(V)$, that is, the evaluation homomorphism $\text{ev}_{\phi} : k[t] \rightarrow \text{End}_k (V)$ sends the characteristic polynomial $\text{char}(\phi)$ to $0$.
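The passage from "divisible by $t - \phi$" to "has $\phi$ as a root" deserves one more line, because $\text{End}_k(V)[t]$ is not commutative and substituting $\phi$ for $t$ is not a ring homomorphism on it in general. One way to fill this in, as a sketch: write the quotient as $f(t) = \sum_i f_i t^i$ (the symbols $f$ and $f_i \in \text{End}_k(V)$ are notation introduced here) and substitute $\phi$ for $t$ on the right of each coefficient. Then \begin{align*} \text{char}(\phi)(t) &= f(t)(t - \phi) = \sum_i f_i\, t^{i+1} - \sum_i f_i \phi\, t^{i},\\ \text{char}(\phi)(\phi) &= \sum_i f_i\, \phi^{i+1} - \sum_i f_i \phi\, \phi^{i} = 0. \end{align*} Since the coefficients of $\text{char}(\phi)$ are scalars, this right-hand substitution agrees with $\text{ev}_{\phi}$.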


This has no pretense to be "The" answer!

I am no algebraist, but I remember a nice proof I was taught when I was a student, so I thought I'd share it.

In a nutshell: True for diagonalizable matrices, then use "algebraic continuation".

Let's write down some details.

Lemma ("algebraic continuation"): Let $k$ be an infinite field. Let $P, Q \in k[X_1, \dots, X_n]$ be polynomials of $n$ variables with $Q \neq 0$ . If $P$ vanishes on the set $\{x \in k^n ~\colon~ Q(x) \neq 0\}$, then $P = 0$.
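One possible route to the lemma, given here as a sketch of a standard argument and not necessarily the hint the author had in mind: the product $PQ$ vanishes at every point of $k^n$, because $$ Q(x) \neq 0 \implies P(x) = 0, \qquad Q(x) = 0 \implies P(x)Q(x) = 0. $$ Over an infinite field, a polynomial that vanishes at every point of $k^n$ is the zero polynomial (by induction on $n$, using that a nonzero polynomial in one variable has finitely many roots). Hence $PQ = 0$ in $k[X_1, \dots, X_n]$, which is an integral domain, and since $Q \neq 0$ we get $P = 0$.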

This lemma expresses that "non-empty open sets are dense in the Zariski topology". It's not hard to show (I'll give you a hint if you want).

Now:

Theorem (Cayley-Hamilton): Let $R$ be a commutative unital ring. Let $A \in M_n(R)$ be a square matrix and denote by $\chi_A(X) \in R[X]$ its characteristic polynomial. Then $\chi_A(A)$ is the zero matrix.
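Before the proof, the statement can be sanity-checked on a concrete integer matrix. The sketch below is only an illustration, not part of the proof: it computes $\chi_A$ with the Leibniz formula for $\det(X I - A)$, which uses no division and so makes sense over $\mathbb{Z}$, and then evaluates the result at $A$ by Horner's rule. The function names and the sample matrix are made up for this example.

```python
from itertools import permutations

def poly_add(p, q):
    """Add two polynomials given as coefficient lists, lowest degree first."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def perm_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def char_poly(A):
    """Coefficients of det(X*I - A), lowest degree first (Leibniz formula)."""
    n = len(A)
    # Entry (i, j) of X*I - A, written as a polynomial in X.
    entry = [[[-A[i][j], 1] if i == j else [-A[i][j]] for j in range(n)]
             for i in range(n)]
    total = [0]
    for perm in permutations(range(n)):
        term = [perm_sign(perm)]
        for i in range(n):
            term = poly_mul(term, entry[i][perm[i]])
        total = poly_add(total, term)
    return total

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def eval_poly_at_matrix(coeffs, A):
    """Horner evaluation of sum_i coeffs[i] * A^i."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c
    return result

A = [[2, -1, 0], [3, 5, 7], [1, 0, -4]]   # arbitrary sample matrix
chi = char_poly(A)
chi_at_A = eval_poly_at_matrix(chi, A)
print(chi)
print(chi_at_A)
```

The Leibniz formula costs $n!$ terms, so this is only meant for small $n$; it was chosen because it mirrors the definition of the determinant over an arbitrary commutative ring.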

Let's give a proof when $R = k$ is an infinite field for the moment. By the "algebraic continuation" lemma, it is enough to show that the theorem is true when $A$ lies in some "dense open set". More precisely, each coefficient of the matrix $\chi_A(A)$ is a polynomial in the $n^2$ coefficients of $A$, so it is enough to show that it vanishes on some set $\{Q \neq 0\}$, where $Q$ is a nonzero polynomial in $n^2$ variables. Let's take $Q(A) = \mathrm{Disc}(\chi_A)$ (the discriminant of the polynomial $\chi_A$). The set where $Q \neq 0$ consists precisely of the matrices $A$ whose eigenvalues are all distinct in an algebraic closure $\bar{k}$ of $k$. Such matrices are diagonalizable over $\bar{k}$, so it is easy to check that $\chi_A(A) = 0$ (I'll let you do that).
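For the check left to the reader, here is one way it can go, as a sketch (the symbols $S$, $D$ and $\lambda_i$ are notation introduced here). If $A = S D S^{-1}$ over $\bar{k}$ with $D = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$, then $\chi_A(X) = \prod_{i=1}^n (X - \lambda_i)$ and $$ \chi_A(A) = S\, \chi_A(D)\, S^{-1} = S\, \mathrm{diag}\big(\chi_A(\lambda_1), \dots, \chi_A(\lambda_n)\big)\, S^{-1} = 0, $$ since each $\lambda_i$ is a root of $\chi_A$.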

We're done!

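Wait, how does this extend to an arbitrary commutative unital ring? Well, each of the coefficients of the matrix $\chi_A(A)$ is actually a polynomial in the $n^2$ coefficients of $A$ with integer coefficients. These polynomials must be zero: we showed that Cayley-Hamilton holds over $\mathbb{Q}$, so each of them vanishes at every point of $\mathbb{Q}^{n^2}$, and is therefore the zero polynomial over $\mathbb{Q}$, hence over $\mathbb{Z}$. Substituting the entries of a matrix over any commutative unital ring then gives zero as well. (NB: I think some people would say something like "$\mathbb{Z}$ is an initial object in the category of unital rings" or whatever).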

Seub
  • This is not about $\mathbb Z$ being an initial object in a category. The argument is somewhat more complicated. A polynomial identity valid when evaluated on all rationals must be valid as a polynomial identity in a polynomial ring over $\mathbb Q$, thus in a polynomial ring over $\mathbb Z$, and that is an initial object in the category of unital rings with some chosen elements. – darij grinberg Dec 15 '13 at 08:02

Let $R$ be a commutative unital ring and $M$ be a finite-rank free unital $R$-module. Let $$ a\colon M\to M $$ be an endomorphism of $M$ and $\chi_a\in R[X]$ be the characteristic polynomial of $a$.

Consider the left action $(\triangleleft_a)$ of $\operatorname{End}_R(M)[X]\cong\operatorname{End}_{R[X]}(M[X])$ on $\operatorname{End}_R(M)$ defined by the rules:

  1. $f\triangleleft_ag = fg$ for $f\in\operatorname{End}_R(M)$,
  2. $X\triangleleft_ag = ga$.

Then: $$ (X - a)\triangleleft_a\operatorname{id}_M =\operatorname{id}_Ma - a\operatorname{id}_M = a - a = 0, $$ and \begin{align*} \chi_a(a) &=\chi_a\triangleleft_a\operatorname{id}_M\\ &=\operatorname{det}(X - a)\triangleleft_a\operatorname{id}_M\\ &=\operatorname{adj}(X - a)(X - a)\triangleleft_a\operatorname{id}_M\\ &=\operatorname{adj}(X - a)\triangleleft_a((X - a)\triangleleft_a\operatorname{id}_M)\\ &=\operatorname{adj}(X - a)\triangleleft_a0\\ &= 0. \end{align*}
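The action $\triangleleft_a$ can be tried out numerically. The following is a minimal sketch under stated assumptions, not the implementation from the linked paper: an element $\sum_i F_i X^i$ of $\operatorname{End}_R(M)[X]$ is stored as the list `[F_0, F_1, ...]` of integer matrices, the helper names `act` and `poly_mat_mul` are made up here, and the formula used for $\operatorname{adj}(X - a)$ is valid for $2 \times 2$ matrices only.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def zero(n):
    return [[0] * n for _ in range(n)]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    return [[c * x for x in row] for row in A]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def act(F, g, a):
    """Action of F = sum_i F_i X^i on g: returns sum_i F_i * g * a^i.

    This encodes rule 1 (f acts by left multiplication) and
    rule 2 (X acts by right multiplication with a).
    """
    n = len(a)
    result = zero(n)
    power = identity(n)            # holds a^i
    for F_i in F:
        result = mat_add(result, mat_mul(mat_mul(F_i, g), power))
        power = mat_mul(power, a)
    return result

def poly_mat_mul(F, G):
    """Product in End(M)[X]; X commutes with all coefficients."""
    n = len(F[0])
    out = [zero(n) for _ in range(len(F) + len(G) - 1)]
    for i, F_i in enumerate(F):
        for j, G_j in enumerate(G):
            out[i + j] = mat_add(out[i + j], mat_mul(F_i, G_j))
    return out

a = [[1, 2], [3, 4]]               # arbitrary sample endomorphism
I = identity(2)
trace = a[0][0] + a[1][1]

x_minus_a = [mat_scale(-1, a), I]                 # X - a
adj = [mat_add(a, mat_scale(-trace, I)), I]       # adj(X - a), 2x2 case only
chi_times_id = poly_mat_mul(adj, x_minus_a)       # det(X - a) * id

print(act(x_minus_a, I, a))        # (X - a) acting on id
print(act(chi_times_id, I, a))     # chi_a(a)
```

Both printed matrices should be zero: the first is $\operatorname{id}_M a - a \operatorname{id}_M$, and the second is $\chi_a(a)$ computed through the action.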

Details here: https://arxiv.org/abs/2105.09285.


Here is another presentation of the same idea.

First, observe that, by abuse of notation (or by some "identifications"), \begin{align*} \operatorname{id}_M\otimes X - a\otimes 1 &\in\operatorname{End}_R(M)[X] =\operatorname{End}_R(M)\otimes_R R[X] \cong\operatorname{End}_{R[X]}(M[X]),\\ \operatorname{adj}(\operatorname{id}_M\otimes X - a\otimes 1) &\in\operatorname{End}_R(M)[X] =\operatorname{End}_R(M)\otimes_R R[X] \cong\operatorname{End}_{R[X]}(M[X]), \end{align*} and $$ \chi_a =\operatorname{det}(\operatorname{id}_M\otimes X - a\otimes 1) =\operatorname{adj}(\operatorname{id}_M\otimes X - a\otimes 1) (\operatorname{id}_M\otimes X - a\otimes 1). $$

Consider the four $R$-module homomorphisms defined as follows: \begin{align*} \operatorname{End}_R(M)\otimes_R R[X] \otimes_R\operatorname{End}_R(M)\otimes_R R[X] &\to\operatorname{End}_R(M)\otimes_R R[X]\\ f_1\otimes P_1\otimes f_2\otimes P_2&\mapsto f_1f_2\otimes P_2P_1,\\[2ex] \operatorname{End}_R(M)\otimes_R R[X] &\to\operatorname{End}_R(M)\\ f\otimes P&\mapsto fP(a),\\[2ex] \operatorname{End}_R(M)\otimes_R R[X] \otimes_R\operatorname{End}_R(M)\otimes_R R[X] &\to\operatorname{End}_R(M)\otimes_R R[X]\otimes_R\operatorname{End}_R(M)\\ f_1\otimes P_1\otimes f_2\otimes P_2&\mapsto f_1\otimes P_1\otimes f_2P_2(a),\\[2ex] \operatorname{End}_R(M)\otimes_R R[X]\otimes_R\operatorname{End}_R(M) &\to\operatorname{End}_R(M)\\ f\otimes P\otimes g&\mapsto fgP(a). \end{align*}

It is easy to check that these $R$-module homomorphisms form the commutative diagram $$\require{AMScd} \begin{CD} \operatorname{End}_R(M)\otimes_R R[X] \otimes_R\operatorname{End}_R(M)\otimes_R R[X] @>>>\operatorname{End}_R(M)\otimes_R R[X]\\ @VVV @VVV\\ \operatorname{End}_R(M)\otimes_R R[X]\otimes_R\operatorname{End}_R(M) @>>>\operatorname{End}_R(M) \end{CD} $$
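For completeness, the check on a pure tensor can be written out as follows (a sketch, using only the four maps defined above and the fact that $P_1(a)$ and $P_2(a)$ commute). Going right and then down: $$ f_1\otimes P_1\otimes f_2\otimes P_2 \mapsto f_1f_2\otimes P_2P_1 \mapsto f_1f_2\,P_2(a)P_1(a). $$ Going down and then right: $$ f_1\otimes P_1\otimes f_2\otimes P_2 \mapsto f_1\otimes P_1\otimes f_2P_2(a) \mapsto f_1f_2\,P_2(a)P_1(a). $$ The two results agree, and pure tensors generate the source module, so the diagram commutes.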

Now, consider the image in $\operatorname{End}_R(M)$ of $\operatorname{adj}(\operatorname{id}_M\otimes X - a\otimes 1)\otimes (\operatorname{id}_M\otimes X - a\otimes 1)$ under the mappings of this diagram.

When going through $\operatorname{End}_R(M)\otimes_R R[X]$, we have: $$ \operatorname{adj}(\operatorname{id}_M\otimes X - a\otimes 1)\otimes (\operatorname{id}_M\otimes X - a\otimes 1) \mapsto\chi_a\mapsto\chi_a(a). $$ When going through $\operatorname{End}_R(M)\otimes_R R[X]\otimes_R\operatorname{End}_R(M)$, we have: $$ \operatorname{adj}(\operatorname{id}_M\otimes X - a\otimes 1)\otimes (\operatorname{id}_M\otimes X - a\otimes 1) \mapsto\operatorname{adj}(\operatorname{id}_M\otimes X - a\otimes 1)\otimes 0 = 0 \mapsto 0. $$

Alexey
  • I haven't studied it yet, but from first glance your proof looks very elegant! :) – Daniel Donnelly May 18 '23 at 17:39
  • Don't forget the corollary that the truth of the claim is closed under quotients. –  May 18 '23 at 21:50
  • @Cayley-Hamilton, could you elaborate? – Alexey May 18 '23 at 22:18
  • You've obtained a relation in a free module involving the generators xₙ and the free variable x; such a relation must hold in any quotient module as well. P.S. if you're interested in pooling insights on this problem send me an email at edeany@linearlibrary.net. –  May 19 '23 at 00:13