
Let $A$ be a complex normal $n\times n$ matrix. My exercise is to show that there exists an $n\times n$ matrix $B$ for which $A=B^2$. The hint is to use the spectral theorem. Here is what I have been thinking:

Since $A$ is normal, there exists a unitary $U$ such that $D:=U^{-1}AU$ is a diagonal matrix. Writing $D=\operatorname{diag}(a_1,\dots, a_n)$, there exist $b_i\in\mathbb{C}$ such that $a_i=b_i^2$, so we can write $D=E^2$, where $E=\operatorname{diag}(b_1,\dots, b_n)$. Thus we get $A=UDU^{-1}=UEEU^{-1}$. Here I thought of setting $B=UE$ to complete the answer, but that is wrong, since in general $UE\neq EU^{-1}$. What would you do? Also, is it possible to prove this without using the spectral theorem?

Thank you

  • Let $B=UEU^{-1}$. Then $B^2=UEU^{-1}UEU^{-1}=UE^2U^{-1}=UDU^{-1}=A$. Cf. this – J. W. Tanner Jan 11 '24 at 01:19
  • @J.W.Tanner I see, thanks for showing the missing part; I would never have thought to insert $I=U^{-1}U$ like that. – Mr.MathDoctor Jan 11 '24 at 01:21
  • The statement in question is obviously wrong. Consider the 1-by-1 real matrix $A=-1$. It is normal and it has no square root over $\mathbb R$. The problem statement can be corrected by allowing a complex square root. – user1551 Jan 11 '24 at 02:41
  • @user1551 The matrix $A$ in the post was supposed to be complex, which I forgot to mention ... – Mr.MathDoctor Jan 11 '24 at 03:34

2 Answers


As others have noted, taking $B = UEU^{-1}$ works. This approach is in a sense motivated by the general fact that $A^n = UD^n U^{-1}$ for integers $n$, with $n = 1/2$ formally plugged in.
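For concreteness, here is a minimal numerical sketch of this construction in Python/NumPy. The random normal test matrix and the use of `np.sqrt` for entrywise principal square roots are my own choices for illustration, not part of the argument:

    import numpy as np

    # Build a random normal matrix A = U diag(d) U* from a random unitary U
    # and an arbitrary complex spectrum d.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    U, _ = np.linalg.qr(X)                                    # random unitary
    d = rng.standard_normal(4) + 1j * rng.standard_normal(4)  # complex eigenvalues
    A = U @ np.diag(d) @ U.conj().T                           # normal by construction

    E = np.diag(np.sqrt(d))          # entrywise square roots b_i with b_i^2 = a_i
    B = U @ E @ U.conj().T           # B = U E U^{-1}, since U^{-1} = U* for unitary U

    print(np.allclose(B @ B, A))     # True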

As an alternative to the spectral theorem, it suffices to use the fact that for normal matrices, $\ker(A) = \text{im}(A)^\perp$, where $\ker$ denotes the kernel (AKA nullspace) and im denotes the image (AKA range or column space). With that, we can deduce that for a suitable unitary matrix $U$, we have $$ UAU^{-1} = \pmatrix{A_0 & 0\\0 & 0}, $$ where $A_0$ is square and invertible. Then we can use the fact that there exists a matrix $B_0$ satisfying $B_0^2 = A_0$ (cf. this post, this post, or this post) and take $$ B = U^{-1}\pmatrix{B_0 &0\\0 & 0}U. $$
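The following NumPy sketch illustrates the block form. My conventions are assumptions: I take $U$ from an SVD and conjugate as $U^*AU$, which differs from the answer's $UAU^{-1}$ only in naming, since $U^{-1}=U^*$ for unitary $U$:

    import numpy as np

    # A singular normal test matrix: eigenvalues 3, i, 0, 0.
    rng = np.random.default_rng(2)
    Q, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
    A = Q @ np.diag([3, 1j, 0, 0]) @ Q.conj().T

    # The left singular vectors give an orthonormal basis of im(A) (first r
    # columns) followed by one of im(A)^perp = ker(A) (remaining columns).
    U, s, _ = np.linalg.svd(A)
    T = U.conj().T @ A @ U
    print(np.round(T, 10))   # block diagonal: invertible 2x2 block A_0, zeros elsewhere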


Proof that $\ker(A) = \text{im}(A)^\perp$: it helps to first use the fact that for all $x \in \Bbb C^n$, we have $\|Ax\| = \|A^*x\|$. Indeed, $$ \|Ax\|^2 = \langle Ax, Ax \rangle = \langle A^*Ax , x \rangle = \langle AA^*x,x \rangle = \langle A^* x, A^*x \rangle = \|A^*x\|^2. $$ With that, we deduce that $\ker(A) = \ker(A^*)$. So, we can use the general fact that $\ker(A^*) = \text{im}(A)^\perp$ to reach the desired result.
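A quick numerical sanity check of $\|Ax\| = \|A^*x\|$ on a singular normal matrix (the particular test matrix is my own choice):

    import numpy as np

    rng = np.random.default_rng(1)
    U, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
    A = U @ np.diag([2.0, -1.0, 0.0]) @ U.conj().T   # normal and singular

    x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    print(np.isclose(np.linalg.norm(A @ x), np.linalg.norm(A.conj().T @ x)))  # True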

Ben Grossmann

Below is a slightly more general result.

  • Proposition. Let $r$ be a positive integer. In any algebraically closed field $\mathbb F$ whose characteristic is either zero or greater than $r$, every square matrix $A$ such that $\operatorname{im}(A)\cap\ker(A)=0$ has a matrix $r$-th root.

In terms of eigenvalues, the condition that $\operatorname{im}(A)\cap\ker(A)=0$ means the zero eigenvalues of $A$ (if any) are semisimple. In other words, over the algebraic closure of $\mathbb F$, the Jordan form of $A$ has no nontrivial nilpotent Jordan block. This condition is crucial. E.g. $J=\pmatrix{0&1\\ 0&0}$ has no square root $R$: otherwise $R$ would be nilpotent (because $R^4=J^2=0$) and hence $R^2=0$ (because $R$ is $2\times2$), contradicting $R^2=J\neq0$.
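One can also let a computer algebra system confirm that $R^2=J$ has no solution; here is a small SymPy sketch (assuming `sympy` is available):

    import sympy as sp

    # Solve R^2 = J symbolically for R = [[a, b], [c, d]].
    a, b, c, d = sp.symbols('a b c d')
    R = sp.Matrix([[a, b], [c, d]])
    J = sp.Matrix([[0, 1], [0, 0]])
    eqs = list(R * R - J)                # four polynomial equations in a, b, c, d
    print(sp.solve(eqs, [a, b, c, d]))   # [] -- no square root exists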

Computationally, the zero intersection condition is equivalent to $\operatorname{rank}(A)=\operatorname{rank}(A^2)$ or $\operatorname{nullity}(A)=\operatorname{nullity}(A^2)$, but this is not our concern here.

As shown in Ben Grossmann's answer, for a normal matrix $A$, this zero intersection condition is satisfied because $\ker(A)=\operatorname{im}(A)^\perp$. Since $\mathbb C$ is an algebraically closed field of characteristic zero, $A$ has a complex square root by the proposition above.

To prove the aforementioned proposition, note that by the zero intersection condition, the whole vector space is the direct sum of $\operatorname{im}(A)$ and $\ker(A)$. Hence $A$ is similar to $$\pmatrix{B&0\\ 0&0}$$ for some non-singular matrix $B$. In turn, it suffices to prove the proposition in the special case that $A$ is non-singular. In this regard, let $f$ be the characteristic polynomial of $A$. Then $f(0)\ne0$. By a purely algebraic but very elementary argument (see Maxime Bôcher (1907), Introduction to Higher Algebra, pp. 297-299; Bôcher's argument was also outlined in my other answer), one may construct two polynomials $g,\gamma\in\mathbb F[x]$ such that $f(x)g(x)+x=\gamma(x)^r$. Since $f(A)=0$ by the Cayley-Hamilton theorem, we get $A=\gamma(A)^r$, i.e. $\gamma(A)$ is a matrix $r$-th root of $A$ over $\mathbb F$.
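Bôcher's construction is not reproduced here, but the conclusion that an $r$-th root can be taken to be a polynomial $\gamma$ evaluated at $A$ is easy to illustrate in the diagonalizable case: interpolate a square-root function on the spectrum. A SymPy sketch for $r=2$, with a test matrix of my own choosing:

    import sympy as sp

    x = sp.symbols('x')
    A = sp.Matrix([[5, 2], [2, 2]])            # invertible; eigenvalues 1 and 6
    lams = list(A.eigenvals())                 # [1, 6]
    gamma = sp.interpolate([(l, sp.sqrt(l)) for l in lams], x)

    # Evaluate gamma at the matrix A by Horner's scheme.
    G = sp.zeros(2, 2)
    for c in sp.Poly(gamma, x).all_coeffs():   # highest-degree coefficient first
        G = G * A + c * sp.eye(2)

    print(sp.simplify(G * G - A))              # Matrix([[0, 0], [0, 0]])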

(I always visit this site using the Tor Browser. Due to recent changes in code/infrastructure on this site, I am told that "Editing is currently forbidden". I cannot edit my answer to fix any possible error and I cannot respond to comments. I apologize for any inconvenience caused.)