25

Let $A$ be a square matrix. This question is a bit opinion-based unless there is a technical answer, but I think it is helpful nonetheless. It is also closely related to this question: quick way to check if a matrix is diagonalizable.

Based on the answers there, there is no known way to determine diagonalizability as a map $A \mapsto \{\text{True}, \text{False}\}$ (see also the notes below).

This is not the case for invertibility (the determinant detects it), nor, over the reals/complex numbers, for orthogonal/unitary diagonalizability, which are detected by the conditions $A = A^{T}$ and $A^*A = AA^*$ respectively.

My question is: is there a reason why there is no known way to check diagonalizability in general? I assume people must have tried, even if it is a very hard task. The determinant is also a rather complicated function, yet it can be motivated; so why has the map above never been found explicitly?

Note: A simple answer would be that it's too hard, but a better answer would explain what makes it a hard task. For instance, is the theory of such a map really not connected to anything we know?

Note: I chose the formalism of a map $A\mapsto\{0,1\}$ because I thought the term 'algebraic way' to determine diagonalizability is ambiguous. I am looking for a function of $A$ that can be written explicitly as code (i.e. finite, and requiring only computation, not conditional branches).

Note: Some conditional steps in algorithms can be rewritten as a computation (e.g. 'if $x>0$ return $x$, else return $0$' can be written explicitly as $\max(0,x)$), but a branch like 'if the algebraic multiplicity is $>1$' cannot. I admit this is rather hard to formalize, but it is enough to ask whether there is a polynomial or continuous function of $A$ whose vanishing determines diagonalizability.

  • This doesn't discuss a map depending on the entries $A_{ij}$ that determines diagonalizability. – Rias Gremory Jul 05 '24 at 09:38
  • 1
    Technically, over complex numbers, unitarily diagonalizable is equivalent to $AA^\ast = A^\ast A$, not $A = A^\ast$. Over reals orthogonally diagonalizable is equivalent to $A = A^t$, though. – David Gao Jul 05 '24 at 09:48
  • $A$ is diagonalizable if and only if the characteristic polynomial splits and all Jordan blocks are $1\times 1$. So that’s your true/false questionnaire. – peek-a-boo Jul 05 '24 at 09:48
  • 1
    I know, but given a matrix, how do you know the dimensions of the Jordan blocks as a function of $A$? – Rias Gremory Jul 05 '24 at 09:50
  • 2
    To add to MSEU’s comment, Jordan normal form is a correct thing to do. It can be computed via an algorithm, and a matrix is diagonalizable iff the Jordan normal form is a diagonal matrix. – David Gao Jul 05 '24 at 09:50
  • @ David Gao, edited – Rias Gremory Jul 05 '24 at 09:52
  • 2
    To compute the Jordan normal form, you can compute the characteristic polynomial, then factor it. Then for each eigenvalue one may compute the associated eigenvectors. There is a Jordan block larger than $1 \times 1$ iff the multiplicity of an eigenvalue in the characteristic polynomial is strictly larger than the number of linearly independent eigenvectors associated to that eigenvalue. All of these can be computed via an algorithm. (Not sure how computationally effective, but for small matrices this can be done by hand.) – David Gao Jul 05 '24 at 09:54
  • 2
    (This is over the complex numbers, or over any algebraically closed field, for that matter. Otherwise, over reals, you need to check the characteristic polynomial can be factored into linear terms in the first place. If it cannot, then the matrix is not diagonalizable. If it can, then the same procedure as the complex case applies.) – David Gao Jul 05 '24 at 09:56
  • 3
    If the minimal polynomial is $\prod (\lambda-\lambda_i)^{m_i}$ then the size of the largest Jordan block corresponding to $\lambda_i$ is $m_i$. Hence an equivalent condition is that $A$ is diagonalizable if and only if the minimal polynomial splits and all roots have multiplicity $1$. So if you consider the minimal polynomial (which is uniquely determined by the matrix) then you get a complete answer as well. Now if you’re asking whether there is a simple “explicit” formula for these $m_i$ then the answer is 99.999999% going to be no. – peek-a-boo Jul 05 '24 at 09:57
  • 2
    @DavidGao I understand there are algorithms; there are algorithms for finding roots of polynomials as well, but that doesn't mean there is a generic formula for solving polynomials. I am not looking for algorithmic methods either, in the sense that Gaussian elimination can be used for testing invertibility but the determinant is the algebraic function I would be talking about there. I will make my statement more formal with an edit. – Rias Gremory Jul 05 '24 at 09:58
  • 2
    @MahammadYusifov If you meant whether there is an algebraic function that tests if a given matrix is diagonalizable, that seems closer to asking if the set of diagonalizable matrices is Zariski closed or Zariski open. This is most definitely false: this set is not even closed/open in the Euclidean topology. – David Gao Jul 05 '24 at 10:05
  • 3
    This means there is no polynomial function (even continuous function, since the set is not even closed/open in the Euclidean topology) in $A_{ij}$ s.t. diagonalizability is equivalent to the function being $0$ or being nonzero, like for invertibility or unitary diagonalizability. – David Gao Jul 05 '24 at 10:07
  • 2
    @DavidGao I found it really hard to formalize exactly what I want from the test, but I think your answer covers a very important type of such functions. Could you please post it as an answer and perhaps link some resources mentioning or motivating the result? – Rias Gremory Jul 05 '24 at 10:08
  • 1
    @MahammadYusifov I’ll write an answer to resolve this, then. – David Gao Jul 05 '24 at 10:12
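
The Jordan-form procedure described in the comments above can be carried out with a computer algebra system. Here is a minimal sympy sketch (the example matrix is an arbitrary choice with a nontrivial Jordan block):

```python
import sympy as sp

A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])      # has a 2x2 Jordan block for the eigenvalue 2

P, J = A.jordan_form()          # A = P * J * P**(-1)
print(J)                        # Jordan blocks [[2, 1], [0, 2]] and [3] (order may vary)
print(J.is_diagonal())          # False: A is not diagonalizable
```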

3 Answers

60

There is no continuous function $f$ in the entries of the matrix s.t. $A$ is diagonalizable iff $f(A) = 0$ or iff $f(A) \neq 0$. (Like the case for invertibility, where $A$ is invertible iff $\det(A) \neq 0$; or the case for unitary diagonalizability, where $A$ is unitarily diagonalizable iff $AA^\ast - A^\ast A = 0$.) Indeed, if such an $f$ exists, then the set of diagonalizable matrices is either closed or open. Neither is the case. We have,

$$\begin{pmatrix} 1 & 1 \\ \epsilon & 1 \end{pmatrix}$$

is diagonalizable if $\epsilon \neq 0$ (in the real case, $\epsilon > 0$). But it is not diagonalizable if $\epsilon = 0$. Thus, the set of diagonalizable matrices is not closed. Moreover,

$$\begin{pmatrix} 1 & \epsilon \\ 0 & 1 \end{pmatrix}$$

is not diagonalizable if $\epsilon \neq 0$, but it is diagonalizable if $\epsilon = 0$. Thus, the set of diagonalizable matrices is not open either.

(I presented the example in $2 \times 2$ matrices, but it can be easily adapted to $n \times n$ for any $n \geq 2$.)
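
As a quick sanity check, sympy's `is_diagonalizable` reproduces this behaviour; the particular small value of $\epsilon$ below is just an illustrative choice:

```python
import sympy as sp

eps = sp.Rational(1, 1000)               # any nonzero epsilon

A  = sp.Matrix([[1, 1], [eps, 1]])       # diagonalizable for eps != 0
A0 = sp.Matrix([[1, 1], [0, 1]])         # its limit as eps -> 0: a Jordan block
B  = sp.Matrix([[1, eps], [0, 1]])       # not diagonalizable for eps != 0
B0 = sp.Matrix([[1, 0], [0, 1]])         # its limit as eps -> 0: the identity

print(A.is_diagonalizable(), A0.is_diagonalizable())   # True False -> the set is not closed
print(B.is_diagonalizable(), B0.is_diagonalizable())   # False True -> the set is not open
```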

David Gao
  • If $f$ exists, then the set of diagonalizable matrices must be clopen. – Paul Frost Jul 05 '24 at 11:51
  • 4
    @PaulFrost That’s not what I meant. I didn’t mean for both an $f$ s.t. $A$ is diagonalizable iff $f(A) = 0$ and a $g$ s.t. $A$ is diagonalizable iff $g(A) \neq 0$. I meant neither such an $f$ nor such a $g$ can exist. (I discussed both because the OP’s two examples, invertible matrices and unitarily diagonalizable matrices, are different. The first set is open while the second is closed.) – David Gao Jul 05 '24 at 11:54
  • Is there any systematic way of understanding what these discontinuities mean, and whether we can recover any useful local analysis under stricter assumptions? – user3716267 Jul 05 '24 at 13:27
  • 3
    @user3716267 Not sure what exactly you mean. I’m pretty sure you can adapt these examples to show that the set of diagonalizable matrices is dense but has no interior (at least in the complex case), so in that sense it is extremely badly behaved, topologically speaking. – David Gao Jul 05 '24 at 13:38
  • That paints a much clearer picture! – user3716267 Jul 05 '24 at 14:27
  • 5
    @DavidGao The set $\mathrm{Dist}$ of complex $n\times n$ matrices with $n$ distinct eigenvalues is open and dense, and every such matrix is diagonalizable. Thus the interior of the set of diagonalizable matrices is nonempty, and indeed dense. The set of non-diagonalizable matrices is contained in the complement of $\mathrm{Dist}$, which is the set of matrices whose characteristic polynomial has discriminant zero. This is a subvariety of the space of complex $n\times n$ matrices with (complex) codimension one. – Jim Belk Jul 05 '24 at 22:29
  • 1
    @JimBelk You’re right. My claim was wrong. The closure of the collection of non-diagonalizable matrices is only the collection of matrices with multiplicities in eigenvalues, the complement of $\text{Dist}$, in your notations, not the full space of $n \times n$ matrices. – David Gao Jul 06 '24 at 01:20
  • 3
    @user3716267 A correction, as pointed out by Jim Belk: the collection of diagonalizable matrices does have interior. Its interior is exactly the collection of matrices with all eigenvalues distinct, which is already dense in all matrices. – David Gao Jul 06 '24 at 01:24
  • 4
    For me, the first part is easier to see by noting that $\begin{pmatrix}1&1\\0&1+\epsilon\end{pmatrix}$ is diagonalizable iff $\epsilon\ne 0$. – Hagen von Eitzen Jul 06 '24 at 22:27
  • And purely for the looks, $\begin{pmatrix}1 & \epsilon\\ 1 & 1\end{pmatrix}$ and $\begin{pmatrix}1 & \epsilon\\ 0 & 1\end{pmatrix}$ would be two cases that look more similar. – Jos Bergervoet Jul 07 '24 at 06:50
30

Here is a salvage: there is a continuous function $f(A)$ of a matrix $A$, even a polynomial function, which checks whether the eigenvalues of $A$ are distinct (over an algebraic closure). Any such matrix is diagonalizable, although not conversely, and this is frequently a useful substitute for the diagonalizability condition in proofs since it implies diagonalizability and is often easier to prove things about.

Namely, we can take $f(A)$ to be the discriminant $\Delta$ of the characteristic polynomial of $A$. This is a homogeneous polynomial of degree $n(n-1)$ in the entries of $A$ which is equal to $\prod_{i \neq j} (\lambda_i - \lambda_j)$ where $\lambda_i$ are the eigenvalues of $A$, and so it is nonzero iff $A$ has distinct eigenvalues. For example, when $n = 2$, if we write the entries of $A$ as

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

then the characteristic polynomial is $\chi_A(t) = t^2 - (a + d) t + (ad - bc)$, and its discriminant is

$$\Delta = (a + d)^2 - 4(ad - bc) = \boxed{ (a - d)^2 + 4bc }.$$

You can check that this vanishes on all Jordan blocks, and more generally that it is zero iff the characteristic polynomial has a repeated root, namely $\frac{a+d}{2}$.
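
As a quick illustration, the $2 \times 2$ discriminant can be computed and tested with sympy (the particular test matrices below are arbitrary choices):

```python
import sympy as sp

a, b, c, d, t = sp.symbols('a b c d t')

A = sp.Matrix([[a, b], [c, d]])
chi = (t * sp.eye(2) - A).det()              # characteristic polynomial t^2 - (a + d) t + (ad - bc)
Delta = sp.discriminant(chi, t)
print(sp.expand(Delta))                      # a**2 - 2*a*d + 4*b*c + d**2, i.e. (a - d)^2 + 4bc

jordan   = {a: 1, b: 1, c: 0, d: 1}          # [[1, 1], [0, 1]]: repeated eigenvalue, not diagonalizable
identity = {a: 1, b: 0, c: 0, d: 1}          # diagonalizable, but repeated eigenvalue
generic  = {a: 1, b: 2, c: 3, d: 4}          # distinct eigenvalues
print(Delta.subs(jordan), Delta.subs(identity), Delta.subs(generic))   # 0 0 33
```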

As David explains, the problems all stem from diagonalizable matrices with repeated eigenvalues, which are difficult to distinguish from non-diagonalizable matrices (so difficult no continuous function can distinguish them). If we had to work numerically to finite precision then it would actually be impossible to distinguish these, and it would even be impossible to numerically distinguish non-diagonalizable matrices from diagonalizable matrices with distinct eigenvalues some of which are very close together.

Edit: Here is some more discussion. We can consider Hagen von Eitzen's example of the matrices

$$X(\varepsilon) = \begin{bmatrix} 1 & 1 \\ 0 & 1 + \varepsilon \end{bmatrix}$$

which have distinct eigenvalues for $\varepsilon \neq 0$ but suddenly become non-diagonalizable at $\varepsilon = 0$. This is interesting because if a matrix has distinct eigenvalues then it has a basis of eigenvectors which is unique up to scaling (note that this is not true if it is diagonalizable but with repeated eigenvalues), but a non-diagonalizable matrix doesn't have a basis of eigenvectors. So, what happens to the eigenvectors of $X(\varepsilon)$ as $\varepsilon \to 0$?

One of them doesn't change at all, namely $v_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, and this one remains an eigenvector as $\varepsilon \to 0$. So all of the interesting action must be happening with the other eigenvector, associated to the eigenvalue $1 + \varepsilon$. This eigenvector can be taken to be

$$v_{1 + \varepsilon} = \begin{bmatrix} 1 \\ \varepsilon \end{bmatrix}.$$

So now we can see what happens: as $\varepsilon \to 0$ not only do the eigenvalues collide, but the eigenvectors also collide! We can also see that if we had to work numerically to finite precision and $\varepsilon$ was very small, it would be very difficult to tell these eigenvectors apart numerically.

This suggests the general idea that we can think of matrices who have some eigenvalues very close to each other as having eigenvectors which are also very close to each other, which collide if we collide their eigenvalues. Matrices like this are "barely diagonalizable"; it's hard to distinguish these very close eigenvectors, and it gets harder the closer the eigenvalues are.
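
Numerically the collision is easy to see. The following sketch (numpy, with a few arbitrary values of $\varepsilon$) prints the angle between the two computed eigenvectors of $X(\varepsilon)$, which shrinks along with $\varepsilon$ until the vectors are essentially indistinguishable in double precision:

```python
import numpy as np

for eps in (1e-1, 1e-4, 1e-8):
    X = np.array([[1.0, 1.0], [0.0, 1.0 + eps]])
    vals, vecs = np.linalg.eig(X)                       # columns of vecs are unit eigenvectors
    cos_angle = min(1.0, abs(vecs[:, 0] @ vecs[:, 1]))  # clamp to avoid rounding above 1
    angle = np.arccos(cos_angle)
    print(f"eps = {eps:.0e}: eigenvalues {vals}, angle between eigenvectors {angle:.1e} rad")
```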

Qiaochu Yuan
  • Beautiful answer - thank you so much for the added detail/structure. – user3716267 Jul 21 '24 at 16:27
  • I wonder - can we develop from this a notion of "extent" of diagonalizability, in terms of how easy the eigenvectors are to distinguish? This seems related to problems in statistical inference... – user3716267 Jul 21 '24 at 16:34
  • 2
    @user3716267: I won't venture a precise definition but loosely speaking you'd want to measure something like the size of the discriminant, or the distance to the discriminant locus (the locus $\Delta = 0$ of matrices with repeated eigenvalues), or the minimum difference between two eigenvalues, something along those lines. This might be related to the notion of "pseudospectrum": https://en.wikipedia.org/wiki/Pseudospectrum, https://terrytao.wordpress.com/2008/10/28/when-are-eigenvalues-stable/. – Qiaochu Yuan Jul 21 '24 at 17:19
  • Perfect, thanks for the reference! – user3716267 Jul 21 '24 at 22:25
9

A matrix is diagonalizable (over $\mathbb{C}$) if and only if its minimal polynomial has distinct roots, and a real matrix is diagonalizable over $\mathbb{R}$ iff additionally all the roots are real. Given a real or complex polynomial $f(x)$, both of these criteria can be checked without too much difficulty:

  • $f$ has multiple roots over $\mathbb{C}$ iff $\gcd(f, f') \ne 1$ which can be verified using the Euclidean algorithm.
  • Supposing $f$ is real and does have distinct roots over $\mathbb{C}$, https://math.stackexchange.com/a/4087390 describes a process to find how many real roots it has, thereby checking if all the roots are real.

Thus it suffices to find, given an $n \times n$ matrix $A$, say over $F = \mathbb{R}$ or $\mathbb{C}$, its minimal polynomial $f(x)$, i.e. a monic polynomial $f(x)$ of minimal degree such that $f(A) = 0$. Such a polynomial of degree $k$ corresponds to a linear dependence between $1, A, A^2, \dots, A^{k} \in F^{n^2}$ with the coefficient of $A^{k}$ equal to $1$. For a given $k$, Gaussian elimination either shows that there is no linear dependence or produces one. Dividing by the coefficient of the highest power of $A$ that appears, any non-trivial dependence can be turned into a monic polynomial of degree $\le k$. In particular it suffices to check each $k = 1, 2, \dots$ in turn (and in fact some $k \le n$ must work, because of the characteristic polynomial).

This is likely not the most efficient algorithm; e.g. Gaussian elimination can likely be replaced by something faster, and it may be better to do a binary search for $k$ rather than check the values one by one.
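
Here is a rough sympy sketch of this procedure (the function names are mine, and no attempt is made at efficiency): the minimal polynomial is found via a linear dependence among the flattened powers of $A$, and diagonalizability is then read off from $\gcd(f, f')$ and, over $\mathbb{R}$, from the number of real roots.

```python
import sympy as sp
from sympy.abc import x

def minimal_poly(A):
    """Monic polynomial f of least degree with f(A) = 0, found as a linear
    dependence among the flattened powers I, A, A^2, ... in F^(n^2)."""
    n = A.rows
    powers = [sp.eye(n)]
    for k in range(1, n + 1):
        powers.append(powers[-1] * A)
        M = sp.Matrix.hstack(*[P.reshape(n * n, 1) for P in powers])
        null = M.nullspace()
        if null:                                 # the first dependence must involve A^k
            c = null[0] / null[0][k]             # normalize so the leading coefficient is 1
            return sp.Poly(sum(c[i] * x**i for i in range(k + 1)), x)
    # unreachable: by Cayley-Hamilton a dependence exists for some k <= n

def is_diagonalizable(A, over_reals=False):
    f = minimal_poly(A)
    if sp.gcd(f, f.diff(x)).degree() != 0:       # repeated root iff gcd(f, f') is nonconstant
        return False
    if over_reals:
        return len(f.real_roots()) == f.degree() # additionally, all roots must be real
    return True

R = sp.Matrix([[0, -1], [1, 0]])                 # rotation by 90 degrees
print(is_diagonalizable(R), is_diagonalizable(R, over_reals=True))   # True False
```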

ronno