
I was wondering if there was an intuitive way to think about determinants of matrices that represent linear transformations in abstract vector spaces over arbitrary fields.

There are many posts about the intuition for the determinant on spaces of $n$-tuples over the field of real numbers, where it represents how the volume of the shape outlined by the basis vectors changes, and about how the determinant is negative should two basis vectors "flip about". There is lots on this here: What's an intuitive way to think about the determinant?

However, I am wondering how the idea of a determinant generalizes to more abstract vector spaces over arbitrary fields. Does it actually mean something intuitively, or does it just become some computation? I suppose that a larger determinant simply means that the basis vectors are transformed to carry more "magnitude" of some sort. More importantly, what would it mean to have a negative determinant in this context, i.e. for the basis vectors of a more abstract space such as that of polynomials to "flip about" as they would for classical $n$-tuple vectors?

What is the best mindset to approach this topic as I am learning it?

    I've also thought of the determinant as some "linear dependence measure": in many cases the most important information it carries is whether it is zero or not. As a universal alternating multilinear map, its value, if non-zero, does not say much if we allow rescaling; but vanishing of the determinant always implies linear dependence, even over an arbitrary field. – Cave Johnson Dec 14 '17 at 00:44

3 Answers


I don't know if this is the answer you are looking for, and the term 'intuition' is rather vague and ill-defined, but I'll try anyway. I like thinking of the determinant as an invariant of matrices. By this I mean that the determinant gives us a sort of classification scheme for matrices. You probably know that two matrices $A,B$ are called similar if there exists a non-singular matrix $P$ such that

$$A=PBP^{-1}$$

In this case, you get as a consequence that also $\det A=\det B$. In other words, $\det A$ is invariant under similarity transformations $A\mapsto PAP^{-1}$.
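
Indeed, one line suffices, using the multiplicativity of the determinant:

$$\det(PBP^{-1})=\det(P)\,\det(B)\,\det(P^{-1})=\det(P)\,\det(B)\,\det(P)^{-1}=\det(B)\ .$$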

This is very much like the genus of surfaces: a topological classification scheme that classifies surfaces by counting 'holes'. This kind of thinking might be too abstract, but I am not sure you can in general get geometric intuition like 'a measure of volume' as you can get in $\mathbb{R}^{n}$. Nevertheless, I think this kind of point of view is very important. It shows up over and over again everywhere in mathematics and physics.

eranreches
    This doesn't distinguish the determinant from the other coefficients of the characteristic polynomial, though, e.g. the trace. – Qiaochu Yuan Aug 07 '24 at 21:51

This question was just re-asked recently; I closed it as a duplicate of this one but I'll answer it here.

Everybody who wants to understand linear algebra should, at some point, compute the inverse of a general $2 \times 2$ matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. You can do it by row reducing the augmented matrix

$$\begin{bmatrix} a & b & 1 & 0 \\ c & d & 0 & 1 \end{bmatrix}$$

(sorry, I can't figure out how to typeset the $\mid$ in an augmented matrix here). What I mean by "general" here is that $a, b, c, d$ are variables, and that at any point in the calculation you should feel free to assume that if some expression is not identically zero then you can divide by it, so you don't have to deal with any casework in the row reduction. If you really want to understand the determinant you should do this computation right now, before reading the rest of this answer.


But if you want me to spoil it, the answer you get at the end is

$$A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ - c & a \end{bmatrix} = \frac{1}{\det A} \text{adj}(A).$$

You see that $\det(A)$ has appeared very naturally in the denominator! The rest of the inverse is a matrix called the adjugate. This computation (which is essentially Cramer's rule) turns out to be valid not only over every field, but even over every commutative ring $R$, provided that $\det A \in R$ is invertible; moreover this condition also turns out to be necessary, so:

Proposition: A matrix $A \in M_2(R)$ over a commutative ring $R$ has the property that $A$ is invertible (with inverse $A^{-1} \in M_2(R)$) iff $\det(A) \in R$ is invertible.

So the determinant determines whether a matrix is invertible; that's why it's called that. (Well, almost.) Unlike the volume interpretation of the determinant, which is extremely specific to $\mathbb{R}$ (and already does not explain what the determinant of, say, a matrix over $\mathbb{C}$ means), this interpretation of what the determinant does for us works over every commutative ring.
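
If you want to watch a computer do the symbolic row reduction instead of a chalkboard, here is a minimal sketch using Python's sympy (the choice of library, and the use of its built-in adjugate, are mine, not part of the original computation):

```python
import sympy as sp

a, b, c, d = sp.symbols('a b c d')
A = sp.Matrix([[a, b], [c, d]])

print(A.det())       # a*d - b*c
print(A.adjugate())  # Matrix([[d, -b], [-c, a]])

# A.inv() performs the symbolic row reduction; the difference with
# adj(A)/det(A) simplifies entrywise to the zero matrix.
print((A.inv() - A.adjugate() / A.det()).applyfunc(sp.simplify))
```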

This is going back to the simple, easy-to-understand historical roots of linear algebra: solving systems of linear equations. $A$ having an inverse means exactly that every linear equation $Ax = b$ has a unique solution $x = A^{-1} b$, and again this works over every commutative ring.

Of course I only explained how this works for $2 \times 2$ matrices. Our esteemed mathematical ancestors were made of tougher stuff than us, and to them it would have been nothing to also perform the same row reduction calculation for $3 \times 3$, $4 \times 4$, and maybe even $5 \times 5$ matrices; this is much more laborious but in principle a finite calculation that anyone who had ever heard of row reduction (which is thousands of years old!) could do with time, patience, and a big enough chalkboard. If you do this calculation a really remarkable thing happens: even though in the middle of the row reduction you need to divide by all kinds of crazy stuff, in the end the final expression for $A^{-1}$ still takes the form

$$A^{-1} = \frac{1}{\det A} \text{adj}(A)$$

where $\det A$ is a single polynomial which appears in every denominator of the inverse, and $\text{adj}(A)$ is a matrix whose entries are polynomials in the entries of $A$. So we again see that there is a polynomial which determines whether a matrix is invertible, and we also call this polynomial the determinant.
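
For those of us with less patience than our ancestors, a computer algebra system can play the role of the big chalkboard; here is a small sympy sketch for the $3 \times 3$ case (again an illustration of my own, not part of the answer):

```python
import sympy as sp

# A general 3x3 matrix whose entries a11, ..., a33 are formal variables.
A = sp.Matrix(3, 3, lambda i, j: sp.Symbol(f'a{i + 1}{j + 1}'))

inv = A.inv()  # the symbolic row reduction happens in here

# Multiplying by det(A) clears every denominator; what is left is a matrix
# of polynomials in the entries of A, namely the adjugate.
print((A.det() * inv - A.adjugate()).applyfunc(sp.cancel))  # zero matrix
```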

This, to my mind, is by far the best way to motivate the determinant. It's just the polynomial you are forced to write down when you try to solve systems of linear equations. However, the structure of the row reduction of the general $n \times n$ matrix is very complicated, so this isn't a great way to get a handle on what the determinant actually looks like concretely and how to actually compute with it; for that other approaches are required. But I think everyone should see this motivation for the determinant at least once; it's very simple and algebraic, you don't need to introduce a complicated concept like the exterior powers at all.

Also, regarding this:

I suppose that a larger determinant simply means that the basis vectors are transformed to carry more "magnitude" of some sort. More importantly, what would it mean to have a negative determinant in this context, i.e. for the basis vectors of a more abstract space such as that of polynomials to "flip about" as they would for classical $n$-tuple vectors?

Over an arbitrary field it is not possible to say either "larger" or "negative"! Think, for example, about the case of a finite field. There are just zero and nonzero elements, and none of the nonzero elements is any smaller or larger than the others (such a field admits no orderings), nor is any of them negative. In this abstract setting what mainly matters is whether the determinant is zero or nonzero, which is consistent with what I said above about how it determines whether a matrix is invertible or not. This isn't to say that the value of the determinant isn't important - for example in some contexts it's important to understand whether it's a square or not - but "larger" and "negative" as concepts don't apply here. (Another reason that the volume interpretation is really limited!)
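
To make "zero or nonzero is what matters" concrete, here is a plain-Python sketch over the field $\mathbb{F}_5$ (the example matrices are my own; `pow(x, -1, p)` computes a modular inverse in Python 3.8+):

```python
p = 5  # work over the finite field GF(5)

def det2(A):
    return (A[0][0] * A[1][1] - A[0][1] * A[1][0]) % p

def inv2(A):
    """Invert a 2x2 matrix over GF(p) via A^{-1} = adj(A)/det(A)."""
    d = det2(A)
    if d == 0:
        return None  # determinant vanishes in GF(p): not invertible
    e = pow(d, -1, p)  # modular inverse of the determinant
    return [[A[1][1] * e % p, -A[0][1] * e % p],
            [-A[1][0] * e % p, A[0][0] * e % p]]

# det = 2*4 - 3*1 = 5 = 0 in GF(5): singular here, though invertible over Q.
print(inv2([[2, 3], [1, 4]]))  # None
# det = 1*4 - 2*3 = -2 = 3 in GF(5), and 3 is invertible (3*2 = 6 = 1).
print(inv2([[1, 2], [3, 4]]))  # [[3, 1], [4, 2]]
```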

Qiaochu Yuan
    Although it sounds presumptuous, I would say that this is "THE right answer". I will have to add an answer, as a complement, when I find the time, but I just wanted to mention that the textbook by Steven J. Leon introduces determinants in the spirit of your answer. – Abdelmalek Abdesselam Aug 08 '24 at 14:02

This is a complement to Qiaochu's very nice answer. The idea is to put determinants, and algorithms run on generic objects, in a broader context, with pointers to unsolved research questions.

Rather than look at the problem of matrix inversion, which leads to longer computations, let us look at the homogeneous linear system $AX=0$ where $A$ is a generic $n\times n$ matrix and $X$ is a column vector. We will run Gaussian elimination on $A$ where all the entries are treated as formal variables. Of the three row operations (multiplying a row by a nonzero number, exchanging rows, adding a multiple of a row to another), we will only use the third one. We will also only aim to turn the matrix into upper triangular form and will not bother cleaning above the diagonal as in the matrix inversion problem. Then for $n=2$, the matrix we get is $$ \begin{pmatrix} a_{11} & a_{12}\\ 0 & \frac{a_{11}a_{22}-a_{12}a_{21}}{a_{11}} \end{pmatrix}\ . $$ For $n=3$, the final matrix looks like $$ \begin{pmatrix} a_{11} & \ast & \ast\\ 0 & \frac{a_{11}a_{22}-a_{12}a_{21}}{a_{11}} & \ast \\ 0 & 0 & \frac{D_3}{a_{11}a_{22}-a_{12}a_{21}} \end{pmatrix}\ , $$ with $$ D_3=a_{11}a_{22}a_{33}+a_{12}a_{23}a_{31}+a_{13}a_{21}a_{32} -a_{12}a_{21}a_{33}-a_{13}a_{22}a_{31}-a_{11}a_{23}a_{32}\ . $$ In general, I believe we should get a triangular matrix where the entry in the $(i,i)$ spot is $\frac{D_i}{D_{i-1}}$, where $D_i$ is the principal minor determinant of size $i$ sitting in the top left corner of the matrix.
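
One can test this belief for small $n$ with a computer algebra system; here is a sympy sketch for $n=3$ (tool and variable names are my choice) that performs exactly the elimination described above and compares each pivot with the ratio $D_i/D_{i-1}$ of principal minors:

```python
import sympy as sp

n = 3
A = sp.Matrix(n, n, lambda i, j: sp.Symbol(f'a{i + 1}{j + 1}'))

# Eliminate below the diagonal using only the third row operation.
M = A.copy()
for k in range(n - 1):
    for i in range(k + 1, n):
        M[i, :] = (M[i, :] - (M[i, k] / M[k, k]) * M[k, :]).applyfunc(sp.cancel)

# D[i] is the i-th principal minor (top-left i x i determinant), D[0] = 1.
D = [sp.Integer(1)] + [A[:i, :i].det() for i in range(1, n + 1)]
for i in range(n):
    print(sp.cancel(M[i, i] - D[i + 1] / D[i]))  # prints 0 three times
```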

As in the answer by Qiaochu, we see the determinant polynomial appearing ex nihilo, just from running the Gaussian elimination algorithm, provided we do this for a generic matrix where the entries have no numerical values but are treated as formal indeterminates.

The above is, if I remember correctly (I don't have the book in front of me), the way Steven J. Leon introduced determinants in his undergraduate textbook on linear algebra.

Note that for a system $AX=0$, the main question is whether there is a nontrivial solution, i.e., one other than $X=0$. The determinant gives us an iff criterion for this question. This is because we are secretly doing projective geometry, where there are no goofy mishaps like expected solutions not being found because they escaped to infinity. This remark allows us to put determinants and the motivation for them in a much wider context.

Consider $n$ homogeneous polynomials $F_1(x_1,\ldots,x_n),\ldots,F_n(x_1,\ldots,x_n)$ of respective degrees $d_1,\ldots,d_n$. Let us write them as $$ F_i(x)=\sum_{\alpha\in\mathbb{N}^n,|\alpha|=d_i}a_{i,\alpha}x^{\alpha} $$ where $\alpha=(\alpha_1,\ldots,\alpha_n)$ is a multiindex, with length $|\alpha|:=\alpha_1+\cdots+\alpha_n$, and we used the shorthand notation for monomials $x^{\alpha}:=x_1^{\alpha_1}\cdots x_{n}^{\alpha_n}$. One can now ask the same question as before, i.e., when does there exist a nontrivial solution $x:=(x_1,\ldots,x_n)\neq 0$ (in an algebraic closure of the field at hand) for the system $$ \left\{ \begin{array}{ccc} F_1(x) & = & 0\ , \\ & \vdots & \\ F_n(x) & = & 0\ . \end{array} \right. $$ It turns out there is a unique (up to scale) irreducible polynomial in the coefficients of all the $F$'s which vanishes iff a nontrivial solution $x$ exists. This is the multidimensional resultant ${\rm Res}_{d_1,\ldots,d_n}(a)$, where $a$ denotes the collection of all the $a_{i,\alpha}$, seen as indeterminates or formal variables. In the particular linear case $(d_1,\ldots,d_n)=(1,\ldots,1)$, this resultant is the determinant of the matrix $A$ made of the coefficients of the $n$ linear forms $F_1,\ldots,F_n$. The up-to-scale ambiguity is usually lifted by requiring the resultant to be equal to $1$ when $F_i(x)=x_i^{d_i}$ for all $i$, $1\le i\le n$. We thus see Gaussian elimination and determinants as a special case of the much wider elimination theory, which, as mentioned in the MacTutor page linked to in Qiaochu's answer, started about two thousand years ago, I believe in the eighth of The Nine Chapters on the Mathematical Art.
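
For the linear case just mentioned, one can check with sympy's univariate resultant (dehomogenizing by setting $x_2=1$) that the resultant of two generic linear forms is exactly the $2\times 2$ determinant; a small sketch, tool choice mine:

```python
import sympy as sp

z = sp.Symbol('z')
a11, a12, a21, a22 = sp.symbols('a11 a12 a21 a22')

f1 = a11 * z + a12  # F1(x1, x2) = a11*x1 + a12*x2 evaluated at x2 = 1
f2 = a21 * z + a22  # F2(x1, x2) = a21*x1 + a22*x2 evaluated at x2 = 1

print(sp.expand(sp.resultant(f1, f2, z)))         # a11*a22 - a12*a21
print(sp.Matrix([[a11, a12], [a21, a22]]).det())  # the same polynomial
```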

Another characterization of resultants, and therefore determinants, in the context of this wider elimination theory is via the notion of Trägheitsformen, or inertia forms. Consider, in the ring $\mathbb{Q}[a]$, the ideal $I$ of polynomials $R(a)$ for which there exist polynomials $G_1(a,x),\ldots,G_n(a,x)$ in $\mathbb{Q}[a,x]$, and a multiindex $\gamma\in\mathbb{N}^n$, such that the Bézout relation $$ x^{\gamma} R(a)=F_1(a,x) G_1(a,x)+\cdots+F_n(a,x)G_{n}(a,x) $$ holds identically. Again $a$ denotes the collection of the $a_{i,\alpha}$, and $x$ denotes the collection of the $x_i$ variables. The polynomials $F_i$ are now seen as polynomials in both the $a$ and $x$ variables, with coefficients equal to $1$. This ideal $I$ is nonzero, prime and principal. Its generator, unique up to scale, is the resultant ${\rm Res}_{d_1,\ldots,d_n}(a)$. Specializing this to the linear case gives another characterization of determinants.
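
In the linear case with $n=2$, such a Bézout relation is easy to exhibit explicitly: with $F_1=a_{11}x_1+a_{12}x_2$ and $F_2=a_{21}x_1+a_{22}x_2$, one checks that $$ x_1\,(a_{11}a_{22}-a_{12}a_{21})=a_{22}\,F_1-a_{12}\,F_2\ , $$ so the determinant is indeed an inertia form, here with $\gamma=(1,0)$, $G_1=a_{22}$ and $G_2=-a_{12}$.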

Now here is a research problem (perhaps too difficult). Take $r$ (instead of $n$) homogeneous polynomials $F_i(x_1,\ldots,x_n)$ with indeterminate coefficients of the form $$ F_i(x)=\sum_{\alpha\in\mathcal{A}_i}a_{i,\alpha}x^{\alpha} $$ where the $\mathcal{A}_i$ are some subsets of $\mathbb{N}^n$. Then run the Buchberger algorithm to find a Gröbner basis for the ideal generated by the $F_i$. The difficulty is to invent whatever combinatorial tool is needed to explicitly keep track of the polynomials or rational functions which arise in the intermediate and last steps of the process. This relates to the study of ideals of generic forms, with one of the famous conjectures in the area being the Fröberg Conjecture on the Hilbert series of such an ideal.

One should note that running an iterative algorithm on generic objects, which sounds like a very scary proposition, can sometimes be done. For instance, the Gram-Schmidt orthogonalization process results in explicit formulas involving Gram determinants. In the above elimination problem with $r=n=2$, dehomogenizing by setting $x_2=1$, i.e., considering the univariate nonhomogeneous polynomials $f_1(z):=F_1(z,1)$ and $f_2(z):=F_2(z,1)$, one can use the Euclidean algorithm to determine the gcd of the two polynomials. In the generic case, this results in intermediate steps involving subresultants given by explicit determinantal expressions. In the case $f_2=f_1'$, one obtains the subdiscriminants, which, for instance, can tell how many real roots a polynomial with real coefficients has.
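
sympy exposes the subresultant remainder sequence, so one can look at these intermediate objects directly; here is a small sketch for the case $f_2=f_1'$, on a concrete (not generic) polynomial of my choosing:

```python
import sympy as sp

z = sp.Symbol('z')
f = z**3 - 3*z + 1  # discriminant -4*(-3)**3 - 27*1**2 = 81 > 0

# The subresultant polynomial remainder sequence of f and f'.
for g in sp.subresultants(f, sp.diff(f, z), z):
    print(g)
# The chain ends in a nonzero constant because f is squarefree
# (f and f' have no common root).
```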

For more information on Trägheitsformen, see my recent article "A combinatorial formula for the coefficients of multidimensional resultants" and references therein.

For subdiscriminants, see the book by Basu, Pollack and Roy linked to in this MO answer:

https://mathoverflow.net/questions/118626/real-symmetric-matrix-has-real-eigenvalues-elementary-proof/123150#123150