6

Let $T : V \to V$ be a linear transformation of the $F$-vector space $V$. Then using the (abstract) determinant function $\det : \operatorname{Hom}(V, V) \to F$ we can define a function $$ \lambda \mapsto \det(\lambda \cdot \operatorname{id} - T) $$ from $F$ to $F$. Now if we represent $T$ by a matrix $A$, we have $\det(\lambda \cdot \operatorname{id} - T) = \det(\lambda \cdot I - A)$, where on the RHS the determinant is an expression in the entries of the matrix. If we change the basis, i.e. represent $T$ by a different matrix $B = S^{-1}AS$, then a simple calculation shows that $\det(\lambda \cdot I - B) = \det(S^{-1}(\lambda I - A)S) = \det(\lambda \cdot I - A)$, and hence the value of the matrix expression does not depend on the basis.

This is often used as a justification that the characteristic polynomial (i.e. the polynomial $\det(\lambda \cdot I - A)$) is independent of the chosen basis. But all I can derive from the above argument is that the values of the determinant are the same, i.e. if $p(x) := \det(x\cdot I - A)$ and $q(x) := \det(x \cdot I - B)$, then $q(x) = p(x)$ for all $x \in F$ whenever $B = S^{-1}AS$. If $F$ is infinite, this implies that the polynomials are equal (i.e. have the same sequence of coefficients, and hence the coefficients are also invariants of the transformation).

But the equivalence that for $p(x) = a_n x^n + \ldots + a_1 x + a_0$ and $q(x) = b_m x^m + \ldots + b_1 x + b_0$ (with $a_n, b_m \neq 0$) we have $$ p(x) = q(x) \text{ for all } x \in F \quad \mbox{ iff } \quad m = n, ~ a_i = b_i, ~ i = 0,\ldots, n $$ need not hold over finite fields; for example, $p(x) = x^2 + x$ and $q(x) = 0$ define the same function on $\mathbb Z/2\mathbb Z$ although their coefficients differ.
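The failure of this equivalence over $\mathbb Z/2\mathbb Z$ is easy to check directly; a minimal sketch in plain Python (the polynomials are the ones from the example above):

```python
# p(x) = x^2 + x and q(x) = 0 have different coefficient sequences,
# yet agree as functions on the two-element field Z/2Z.
def p(x):
    return (x * x + x) % 2

def q(x):
    return 0

field = (0, 1)  # the elements of Z/2Z
print([p(x) for x in field])  # [0, 0]
print([q(x) for x in field])  # [0, 0]
```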

So then, is the characteristic polynomial (as a formal polynomial, i.e. determined by its coefficients) still unique in the case of finite fields? And if not, do you know an example?

StefanH
  • 18,586
  • 1
    Many ways to see this. You can argue like Jendrik. Or, if you are more comfortable with constants, you can work inside the ring of matrices $M_n(\overline{F})$ with coefficients in an algebraic closure $\overline{F}$ of $F$. Then let $\lambda$ range over $\overline{F}$, and your problem of (formal) polynomials $x^2+x$ and $0$ giving rise to the same function disappears. Basically because $\overline{F}$ is infinite. – Jyrki Lahtonen Sep 24 '17 at 18:10
  • 2
    You can interpret that determinant as the determinant of a matrix with coefficients in $F[\lambda]$. Then there's no problem. This seems easier to me than the other options presented, although they're all also good and informative. – Qiaochu Yuan Sep 24 '17 at 20:07

3 Answers

8

The identity you proved $$\det(\lambda \cdot I - B) = \det(S^{-1}(\lambda I - A)S) = \det(\lambda \cdot I - A)$$ is valid as an identity of formal polynomials, not just an equality for all values of $\lambda$.

This is because both sides are polynomials in all of the variables involved, with integer coefficients. You have already proved the identity in fields of characteristic zero. That means that both sides must be identical as formal polynomials. But that implies that over any commutative ring, both sides expand to the same formal polynomial, because the operations involved in expanding out the polynomial (taking the determinant followed by expanding using the commutative, associative, distributive properties) are valid in every commutative ring. The general principle is sometimes called the "principle of permanence of identities" (one reference for this is Artin's book Algebra, Section 12.3, p. 456-457):

To prove that an identity of formal polynomials with integer coefficients (in any number of variables) holds in every commutative ring, it is sufficient to prove it in a single field of characteristic zero.

("Characteristic zero" is important, because I can prove 2=0 in a field of characteristic 2, but that's not valid in every commutative ring.)

Here's an example of this type of argument for a simpler problem:

Prove that for 2 by 2 matrices $A$ and $B$ over a commutative ring $R$, $\det(A) \det(B) = \det(AB)$.

Written out in general form, the identity we want to prove is (the vertical bars denote determinant): $$\begin{vmatrix} a & b \\ c & d \end{vmatrix} \cdot \begin{vmatrix} e & f \\ g & h \end{vmatrix} = \begin{vmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{vmatrix}$$

And taking the determinant gives the following form of the identity that we want to prove:

$$ (ad-bc)(eh-fg) = (ae+bg)(cf+dh) - (ce+dg)(af+bh) \tag{*}$$

Here we are treating $a,b,c,d,e,f,g,h$ as indeterminates, not as elements of $R$. It is clear that both sides of the proposed identity are polynomials with integer coefficients in $a,b,c,d,e,f,g,h$. So far, nothing has depended on $R$, because the matrix multiplication and determinant operations are the same no matter what (commutative) ring we're working in.

Now imagine we don't know anything else about matrices or determinants, and we're asked to prove identity (*). How would we do it? The obvious thing is to expand both sides of (*) and collect like terms. If we get the same expanded form on both sides, then we would have proven the identity. Furthermore, this proof would be valid over any commutative ring $R$, because the process of expanding and collecting terms is valid over any commutative ring. It only involves the commutative, associative, and distributive properties, and integer arithmetic, all of which are valid over every commutative ring.

Next, imagine that we don't want to do all this expansion, so we find some other method of proving this identity, but our proof is valid only over one particular field of characteristic zero (say, the complex numbers, where we could use something like topology which wouldn't be valid over a general $R$). Having proven this identity over $\mathbb{C}$, we go back and ask ourselves what would happen if we expand both sides of (*). Must we get the same polynomial on both sides? We must, because if we got two different polynomials on the two sides, then there would have to be some complex numbers we could substitute for $a,b,\ldots,h$ that would produce different values on the two sides. That would contradict the fact that we've proven the identity over $\mathbb{C}$. So, when we expand both sides, in fact the polynomials are the same on both sides, so as we concluded previously, the identity is valid over any commutative ring $R$.

So it suffices to check the identity over a single field of characteristic 0.
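As a concrete check of identity (*), one can let a computer algebra system do the expansion; here is a sketch using sympy (the use of sympy is my own choice for illustration, not part of the argument):

```python
from sympy import symbols, expand

a, b, c, d, e, f, g, h = symbols('a b c d e f g h')

# Left and right sides of identity (*), treated as formal polynomials
# in eight indeterminates with integer coefficients.
lhs = (a*d - b*c) * (e*h - f*g)
rhs = (a*e + b*g)*(c*f + d*h) - (c*e + d*g)*(a*f + b*h)

# Expanding and collecting like terms shows both sides are the same
# formal polynomial, so the identity holds over any commutative ring.
print(expand(lhs - rhs))  # 0
```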

user26857
  • 53,190
Ted
  • 35,732
  • What do you mean when you say the polynomials have integer coefficients? The coefficients are from a field $F$ (which is $\ne \mathbb Z$), are they not? – StefanH Sep 24 '17 at 18:01
  • 1
    There is a unique map $\mathbb{Z} \to F$ for any field $F$. The coefficients will come from the image of that map. Not only that, but each coefficient will be the image of the same integer regardless of what field you choose. That's because all the operations come from a single polynomial defined over $\mathbb{Z}$. For example, for determinant: the determinant of a 2x2 matrix is always $ad-bc$, regardless of the field (or ring), and similarly for matrices of any size. Matrix multiplication is given in the terms of the matrix entries by the same polynomial, regardless of the ring. – Ted Sep 24 '17 at 18:55
  • There is a slight issue in that you have inverses involved, so you actually have rational functions not polynomials, but the principle extends to that case too. – Ted Sep 24 '17 at 18:56
  • Could you add an example of the usage of your map $\mathbb Z \to F$? As I see it, it is not surjective, so I am not sure how you will "lift up" a polynomial over $F$ to one over $\mathbb Z$. – StefanH Sep 24 '17 at 20:54
  • @Stefan: what Ted is doing is replacing every entry of $A, B$, and $S$ with a different variable, and working over the polynomial ring on those variables (together with the inverse of the determinant of $S$, so that $S^{-1}$ can be written down). There are $3n^2 + 1$ variables involved, including $\lambda$, and as a polynomial in all of these variables, all of the expressions involved have integer coefficients. This is a bit of a tricky proof technique to get used to but it has many applications. – Qiaochu Yuan Sep 24 '17 at 22:31
  • @StefanH I added an example of using this principle for a simpler problem. – Ted Sep 25 '17 at 01:33
  • Ah I see. If I understand it right: formal polynomials and polynomial functions coincide over fields of characteristic 0, so when we prove something in a specific such field (using the polynomial not formally, but by substituting values), then it must also hold for the "general" formal polynomials, right? – StefanH Sep 26 '17 at 19:17
4

The characteristic polynomial $p_A(x)$ of a matrix $A \in \operatorname{M}_n(F)$ is defined as $$ p_A(x) := \det(x I - A) \in F[x], $$ i.e. $p_A(x)$ is the determinant of the matrix $xI - A \in \operatorname{M}_n(F[x])$. (Instead of the commutative ring $F[x]$ one could also use the associated field $F(x)$.)

The argument you give still applies: If $B \in \operatorname{M}_n(F)$ is similar to $A$ over $F$, then they are also similar over $F[x]$, and therefore $xI - A$ and $xI - B$ are similar over $F[x]$. Since similar matrices have the same determinant, it follows that $$ p_A(x) = \det (xI - A) = \det (xI - B) = p_B(x). $$

Note that we are always working over the commutative ring $F[x]$ (or field $F(x)$), so we are only working with polynomials themselves, and not with their associated polynomial functions. So the above equality $p_A(x) = p_B(x)$ really is an equality of polynomials, and not just of their associated polynomial functions.
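This can also be illustrated over a finite field; here is a sketch using sympy, where the matrices $A$ and $S$ over $\mathbb F_2 = \mathbb Z/2\mathbb Z$ are made up for this example. The characteristic polynomials of $A$ and $B = S^{-1}AS$ are computed as elements of $\mathbb F_2[x]$ and compared as formal polynomials, not as functions:

```python
from sympy import Matrix, Poly, eye, symbols

x = symbols('x')

# An example matrix A and an invertible matrix S over F_2 (det S = 1 mod 2).
A = Matrix([[0, 1], [1, 1]])
S = Matrix([[1, 1], [0, 1]])

Sinv = S.inv_mod(2)                            # inverse of S modulo 2
B = (Sinv * A * S).applyfunc(lambda e: e % 2)  # B = S^{-1} A S over F_2

# Characteristic polynomials as formal polynomials in F_2[x].
pA = Poly((x * eye(2) - A).det(), x, modulus=2)
pB = Poly((x * eye(2) - B).det(), x, modulus=2)

print(pA == pB)  # True: equal as formal polynomials in F_2[x]
```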


PS:

As was pointed out in the comments, this leads to the question of how to deal with the expression $\det(x \operatorname{id}_V - T)$. There seem to be at least two ways:

Don’t use it

Don’t define the characteristic polynomial of $T$ as $\det(x \operatorname{id}_V - T)$. Instead proceed as follows:

  • Start by defining the characteristic polynomial of $A \in \operatorname{M}_n(F)$ as $p_A(x) := \det(xI - A)$. Because we can regard $F$ as a subring of $F[x]$, or as a subfield of $F(x)$, this makes sense.
  • Then show that similar matrices have the same characteristic polynomial (as done above).
  • To define the characteristic polynomial of $T \colon V \to V$, take any basis $\mathcal{B}$ of $V$, let $A \in \operatorname{M}_n(F)$ be the matrix of $T$ with respect to $\mathcal{B}$, and set $p_T(x) := p_A(x)$. Since similar matrices have the same characteristic polynomial, this is well-defined.

This still gives you everything you need without making sense of $\det(x\operatorname{id}_V - T)$. Note that for every scalar $\lambda \in F$ we still have that $$ p_T(\lambda) = p_A(\lambda) = \det(\lambda I - A) = \det(\lambda \operatorname{id}_V - T), $$ so we can still use the expression $\det(\lambda \operatorname{id}_V - T)$ when plugging in a scalar $\lambda$ for $x$.

Note that this approach of defining the characteristic polynomial $p_T(x)$ only needs that similar matrices have the same characteristic polynomial, which is precisely what you were concerned about in your question.

Extension of scalars

You can also use extension of scalars, if you are familiar with it:

We can extend the $F$-vector space $V$ to an $F(x)$-vector space $V_{F(x)}$ such that

  • $V \subseteq V_{F(x)}$ is an $F$-linear subspace,
  • the linear map $T \colon V \to V$ extends uniquely to an $F(x)$-linear map $T_{F(x)} \colon V_{F(x)} \to V_{F(x)}$ (i.e. we have $T_{F(x)}(v) = T(v)$ for every $v \in V$),
  • any $F$-basis $\mathcal{B} = (v_1, \dotsc, v_n)$ of $V$ is also an $F(x)$-basis of $V_{F(x)}$,
  • $[T]_{\mathcal{B}} = [T_{F(x)}]_{\mathcal{B}}$, i.e. the matrix of $T$ with respect to the $F$-basis $\mathcal{B}$ of $V$ coincides with the matrix of $T_{F(x)}$ with respect to the $F(x)$-basis $\mathcal{B}$ of $V_{F(x)}$. (This is a direct consequence of the previous two points).

If $\mathcal{B} = (v_1, \dotsc, v_n)$ is any $F$-basis of $V$, with respect to which $T$ is given by the matrix $A \in \operatorname{M}_n(F)$, then $A$ will also be the matrix of $T_{F(x)}$ with respect to $\mathcal{B}$, when regarded as an $F(x)$-basis of $V_{F(x)}$. Then $x I - A$ will be the matrix of $x \operatorname{id}_{V_{F(x)}} - T_{F(x)}$ with respect to $\mathcal{B}$. (The expression $x \operatorname{id}_{V_{F(x)}} - T_{F(x)}$ makes sense because $V_{F(x)}$ is an $F(x)$-vector space.) With this one can define the characteristic polynomial $p_T(x)$ as $p_T(x) := \det(x \operatorname{id}_{V_{F(x)}} - T_{F(x)})$.

As an example, consider the case of $V = F^n$.

In this case, the extension of scalars $V_{F(x)} = (F^n)_{F(x)}$ is given by $(F^n)_{F(x)} = F(x)^n$. Then $F^n \subseteq F(x)^n$ is an $F$-linear subspace, and the standard basis $\mathcal{B} = (e_1, \dotsc, e_n)$ of $F^n$ is clearly an $F(x)$-basis of $F(x)^n$.

The $F$-linear map $T \colon V \to V$ is necessarily given by multiplication with the matrix $A \in \operatorname{M}_n(F)$ whose $j$-th column is $T(e_j)$, and the induced $F(x)$-linear map $T_{F(x)} \colon F(x)^n \to F(x)^n$ will be given by multiplication with the same matrix $A$; this makes sense since $\operatorname{M}_n(F) \subseteq \operatorname{M}_n(F(x))$. Hence $x \operatorname{id}_{F(x)^n} - T_{F(x)}$ will be given by the matrix $x I - A$ with respect to $\mathcal{B}$, just as promised.

  • Okay, so we interpret $\det(xI - A)$ as a polynomial, a formal expression, in the ring $F[x]$. But then the equality $\det(x \operatorname{id} - T) = \det(xI - A)$ does not make much sense, because on the LHS we are using the determinant $\det : \mbox{Hom}(V,V) \to F$, while on the RHS there stand formal expressions, not numbers (or field elements)... – StefanH Sep 24 '17 at 18:17
  • Okay, there seems to be an intermediate step where we interpret the formal polynomial again as a function in this equality. – StefanH Sep 24 '17 at 18:25
  • You’re right that the expression $\det(x \operatorname{id} - T)$ doesn’t make sense. I added two possible ways how to fix/circumvent this problem (I usually use the first one). – Jendrik Stelzner Sep 24 '17 at 19:55
  • Also note that the questions why similar matrices have the same characteristic polynomial, and how the characteristic polynomial of a linear operator is defined are very much related, but still different. This is something which can be easily overlooked. – Jendrik Stelzner Sep 24 '17 at 20:06
2

But all I can derive from the above arguments is that the values of the determinant are the same, i.e. if $p(x) := \det(x\cdot I - A)$ and $q(x) := \det(x \cdot I - B)$, then $q(x) = p(x)$ for all $x \in F$ if $B = S^{-1}AS$.

There's your mistake: you want to do this calculation for the specific element $x \in F[x]$, or if you insist on working only over fields, for the element $x \in F(x)$.

  • Okay, I understand: you interpret it as a polynomial in $x$ and then use the usual determinant formulas. But why is it a mistake if I treat $x$ as a variable (not in the formal sense, but as a real variable into which I can substitute values from $F$)? Is this really a mistake, or just too narrow a view of the issue? – StefanH Sep 24 '17 at 20:47
  • @Stefan: it's not a mistake in the sense that you wrote down a false statement. Hurkyl is just pointing out that you can write down a stronger statement than what you wrote down, and this stronger statement is enough to guarantee the uniqueness of the characteristic polynomial. – Qiaochu Yuan Sep 24 '17 at 22:33