
How to prove that if $B$ commutes with a positive semidefinite matrix $A$, then $B$ commutes with $\sqrt A$.

The logic of the notes I am referring to is as follows: first, if $A$ is positive semidefinite, then $$\sqrt A=\text{a polynomial in } A=P\begin{pmatrix}\sqrt{\lambda_1}&&\\&\sqrt{\lambda_2}&\\&&\ddots\end{pmatrix}P^T.$$ From this we can deduce that if $B$ commutes with a positive semidefinite matrix $A$, then $B$ commutes with $\sqrt A$, using the fact that if $B$ commutes with $A$, then $B$ commutes with every polynomial in $A$. But I don't quite understand what a polynomial of a matrix is, and how we can show that if $B$ commutes with $A$, then $B$ commutes with polynomials in $A$.

Arthur
Jerry

2 Answers


A polynomial of a matrix means "substituting" the matrix for the indeterminate $x$ in a polynomial in $x$. E.g. if $f(x)=c_0+c_1x+c_2x^2+\cdots+c_mx^m$ is a polynomial with complex coefficients, then $f(A)$ means the sum $c_0I+c_1A+c_2A^2+\cdots+c_mA^m$.
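For instance, here is a minimal NumPy sketch of this definition (the matrix and the coefficients below are made up purely for illustration):

```python
import numpy as np

# Made-up example: evaluate f(x) = 2 + 3x + x^2 at a matrix A.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
c = [2.0, 3.0, 1.0]  # coefficients c_0, c_1, c_2

# f(A) = c_0*I + c_1*A + c_2*A^2
fA = c[0] * np.eye(2) + c[1] * A + c[2] * (A @ A)
print(fA)
```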

Let $f$ be any polynomial such that $f(\lambda_i)=\sqrt{\lambda_i}$ for every $i$ (e.g. take $f$ as a Lagrange interpolation polynomial). Then, writing $A=P\Lambda P^{-1}$ and noting that $(P\Lambda P^{-1})^k=P\Lambda^kP^{-1}$ for every $k$, we get $f(A)=f(P\Lambda P^{-1})=Pf(\Lambda)P^{-1}=P\sqrt{\Lambda}P^{-1}=\sqrt{A}$.
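As a sanity check, one can build such an interpolating $f$ numerically; this is only a sketch (not part of the original answer), and the positive semidefinite $A$ below is an arbitrary example with a repeated eigenvalue:

```python
import numpy as np

# An arbitrary symmetric positive semidefinite matrix (eigenvalues 1, 1, 3).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])

eigvals = np.linalg.eigvalsh(A)
distinct = np.unique(np.round(eigvals, 12))  # interpolate at the distinct eigenvalues

# Interpolation polynomial f with f(lambda_i) = sqrt(lambda_i).
coeffs = np.polyfit(distinct, np.sqrt(distinct), deg=len(distinct) - 1)

# Evaluate f at the matrix A via Horner's scheme (coeffs: highest degree first).
fA = np.zeros_like(A)
for c in coeffs:
    fA = fA @ A + c * np.eye(len(A))

print(np.allclose(fA @ fA, A))  # True: f(A) is indeed a square root of A
```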

Now, if $B$ commutes with $A$, then $B$ also commutes with all nonnegative integer powers of $A$: $BA^k=(BA)A^{k-1}=ABA^{k-1}=A(BA)A^{k-2}=AABA^{k-2}=\cdots=A^kB$. Hence $B$ commutes with all linear combinations of nonnegative integer powers of $A$, i.e. $B$ commutes with all polynomials in $A$. Since $\sqrt{A}$ is a polynomial in $A$, the conclusion follows.
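Here is a numerical check of the whole chain of reasoning, again as a sketch: $A$ is the same kind of made-up example, and $B$ is chosen block diagonal so that it commutes with $A$.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])
B = np.diag([0.0, 0.0, 1.0])   # block diagonal like A, hence commutes with A

print(np.allclose(A @ B, B @ A))          # True: B commutes with A

# sqrt(A) via the spectral decomposition A = P diag(lam) P^T
lam, P = np.linalg.eigh(A)
sqrtA = P @ np.diag(np.sqrt(lam)) @ P.T

print(np.allclose(sqrtA @ sqrtA, A))      # True: this really is sqrt(A)
print(np.allclose(B @ sqrtA, sqrtA @ B))  # True: B commutes with sqrt(A)
```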

user1551
  • I am still confused: in the first paragraph, the input of $f$ is a matrix $A$, but in the first line of the second paragraph, the input is an eigenvalue. Could you elaborate on the difference between these two? – Jerry Nov 12 '20 at 11:34
  • 1
  • @Jerry Yes, we overload the symbol $f$ to mean a polynomial and also its evaluations at different objects (scalars and matrices in this case). E.g. suppose that $f(x)=ax+b$. Then $f(\Lambda)$ is defined as $a\Lambda+bI$. Hence $f(\Lambda):=a\Lambda+bI=a\begin{pmatrix}\lambda_1&\\&\lambda_2\end{pmatrix}+bI=\begin{pmatrix}a\lambda_1+b&\\&a\lambda_2+b\end{pmatrix}=\begin{pmatrix}f(\lambda_1)&\\&f(\lambda_2)\end{pmatrix}$. That is, for a diagonal matrix $\Lambda$, the value of $f(\Lambda)$ is just the diagonal matrix whose diagonal entries are the $f(\lambda_i)$s. – user1551 Nov 12 '20 at 11:51

If $\{v_1,\ldots,v_n\}$ is a basis of $\mathbb R^n$ consisting of eigenvectors of $A$, i.e., $Av_k=\lambda_kv_k$ with $\lambda_k\ge 0$, then $$ ABv_k=BAv_k=B(\lambda_kv_k)=\lambda_k Bv_k, \quad k=1,\ldots,n. $$ Let $V(\lambda_k)$ be the eigenspace of $\lambda_k$, i.e., $Aw=\lambda_kw$ for all $w\in V(\lambda_k)$.

Claim. $B[V(\lambda_k)]\subset V(\lambda_k).$

Proof of the Claim. If $\lambda_k=0$, then $Av_k=0$, so $BAv_k=0$, and hence $ABv_k=BAv_k=0$, i.e. $Bv_k\in V(\lambda_k)$. If $\lambda_k>0$, then $ABv_k=\lambda_kBv_k$ again implies that $Bv_k\in V(\lambda_k)$. By linearity, the same holds for every $w\in V(\lambda_k)$, since each such $w$ is a linear combination of the basis vectors $v_j$ with $\lambda_j=\lambda_k$.
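A quick numerical illustration of the Claim (a sketch; $A$ and $B$ are made-up examples, with $B$ block diagonal so that it commutes with $A$):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])
B = np.diag([0.0, 0.0, 1.0])   # commutes with A

lam, P = np.linalg.eigh(A)     # columns of P are eigenvectors v_k of A
for k in range(3):
    Bv = B @ P[:, k]
    # A(Bv) = lam_k * (Bv): B maps V(lam_k) into itself
    print(np.allclose(A @ Bv, lam[k] * Bv))
```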

Now define $C$ by $Cv_k=\lambda_k^{1/2}v_k$, $k=1,\ldots,n$; then $C^2=A$. Since $V(\lambda_k)$ is spanned by the basis vectors $v_j$ with $\lambda_j=\lambda_k$, $C$ acts on $V(\lambda_k)$ as multiplication by $\lambda_k^{1/2}$, so by virtue of the Claim, $CBv_k=\lambda^{1/2}_kBv_k$.

If $v\in\mathbb R^n$, then $v=c_1v_1+\cdots+c_nv_n$, for some $c_1,\ldots,c_n\in\mathbb R$, and $$ BCv=\sum_{k=1}^nc_kBCv_k=\sum_{k=1}^nc_k\lambda^{1/2}_kBv_k= \sum_{k=1}^n c_kCBv_k=CBv. $$
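Putting the construction together numerically (same kind of made-up example as above; $C$ is built from the eigenvectors exactly as in the answer):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])
B = np.diag([0.0, 0.0, 1.0])   # commutes with A

lam, P = np.linalg.eigh(A)     # columns of P are the eigenvectors v_k

# C is defined on the eigenbasis by C v_k = sqrt(lam_k) v_k
C = P @ np.diag(np.sqrt(lam)) @ P.T

print(np.allclose(C @ C, A))       # True: C^2 = A
print(np.allclose(B @ C, C @ B))   # True: BC = CB
```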