
Let $X(\Omega)$ be a positive-semidefinite matrix that is a function of a set of parameters $\Omega$. I am interested in both cases: where the matrix is real and where it is Hermitian.

What is the derivative of the square root of this matrix with respect to an individual parameter $\Omega_i$, i.e. $\partial_{\Omega_i}\sqrt{X(\Omega)}$? Can this derivative be reduced to a form in terms of $\partial_{\Omega_i}X(\Omega)$?

Cei328
  • Which kind of formula are you interested in? You may use perturbation theory (I assume your matrix is Hermitian; the convention is not always standard) – lcv Feb 05 '20 at 00:05
  • Otherwise you can get a formula via the resolvent, so to speak: an integral of some stuff over the complex plane along a curve that encircles the eigenvalues. – lcv Feb 05 '20 at 00:07
  • I would like an expression in terms of the non-square-root derivative, $\partial_{\Omega_i} X(\Omega)$. I am interested in results for both real and Hermitian matrices; I will edit my question to say this! – Cei328 Feb 05 '20 at 00:29

3 Answers


For typing convenience, define the matrices $$ S=\sqrt{X},\quad \dot S=\frac{dS}{d\Omega_i},\quad \dot X=\frac{dX}{d\Omega_i},\quad M=\left(I\otimes S+S^T\otimes I\right)^{-1} $$

Utilizing the vec operation, one can proceed as follows. $$\begin{align} SS &= X \\ S\dot S + \dot SS &= \dot X \\ (I\otimes S+S^T\otimes I)\operatorname{vec}(\dot S) &= \operatorname{vec}(\dot X) \\ \operatorname{vec}(\dot S) &= M\operatorname{vec}(\dot X) \\ \dot S &= \operatorname{reshape}\big(M\operatorname{vec}(\dot X),\ \operatorname{size}(S)\big) \end{align}$$ If the inverse defining $M$ does not exist, then there is no unique solution, but it might be possible to use the Moore–Penrose pseudoinverse to obtain a least-squares solution.
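A minimal NumPy sketch of this vec/Kronecker recipe (the matrix $X$ and its derivative $\dot X$ below are made up for illustration):

```python
import numpy as np
from scipy.linalg import sqrtm

# Made-up data: a positive-definite X and a symmetric "derivative" dX.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
X = A @ A.T + np.eye(4)
dX = rng.standard_normal((4, 4))
dX = dX + dX.T

S = sqrtm(X)
n = len(S)
I = np.eye(n)

# vec stacks columns, i.e. column-major ('F') order in NumPy.
Ksum = np.kron(I, S) + np.kron(S.T, I)     # M in the answer is the inverse of this
vec_dS = np.linalg.solve(Ksum, dX.flatten(order="F"))
dS = vec_dS.reshape((n, n), order="F")

print(np.allclose(S @ dS + dS @ S, dX))    # Sylvester check: True
```

If `Ksum` is singular (e.g. $X$ has a zero eigenvalue), `np.linalg.lstsq(Ksum, dX.flatten(order="F"), rcond=None)` gives the least-squares solution mentioned above.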

greg

You can use the Dunford-Taylor-Cauchy integral formula to define the square root of a matrix:

$$ \sqrt{X} = \frac{1}{2\pi i } \oint_\Gamma \sqrt{z} \frac{dz}{z-X} $$

where $\Gamma$ is a closed curve that encircles all the eigenvalues of $X$ in the anticlockwise direction. This curve can be taken far from the eigenvalues, so that it is unaffected by the perturbation (when computing the derivative).
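As a numerical sanity check of this definition, here is a sketch assuming a positive-definite $X$, so that the principal branch of $\sqrt{z}$ is analytic on and inside a circular $\Gamma$ in the right half plane:

```python
import numpy as np
from scipy.linalg import sqrtm

def contour_sqrtm(X, N=400):
    """sqrt(X) for positive-definite X via the Cauchy integral, on a circle
    enclosing the spectrum; the trapezoidal rule is spectrally accurate here."""
    lam = np.linalg.eigvalsh(X)
    c, r = (lam.min() + lam.max()) / 2, lam.max() / 2  # stays clear of the cut (-inf, 0]
    I = np.eye(len(X))
    acc = np.zeros_like(I, dtype=complex)
    for tk in np.linspace(0, 2 * np.pi, N, endpoint=False):
        z = c + r * np.exp(1j * tk)
        dz = 1j * r * np.exp(1j * tk)                  # dz/dt along the circle
        acc += np.sqrt(z) * np.linalg.inv(z * I - X) * dz
    return (acc * (2 * np.pi / N) / (2j * np.pi)).real

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
X = A @ A.T + np.eye(3)                                # positive definite
print(np.allclose(contour_sqrtm(X), sqrtm(X)))         # True
```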

Furthermore use

$$ \frac{d}{dt} \frac{1}{z-X} = \frac{1}{z-X} X' \frac{1}{z-X}, $$

(prime indicates differentiation with respect to $t$). All in all we get

$$ \frac{d}{dt} \sqrt{X} = \frac{1}{2\pi i } \oint_\Gamma \sqrt{z}\, \frac{1}{z-X}\, X'\, \frac{1}{z-X}\, dz. \tag{1} $$
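Eq. (1) can be evaluated with the same contour construction as in the previous sketch; the following self-contained sketch checks it against a central finite difference (test matrices again made up):

```python
import numpy as np
from scipy.linalg import sqrtm

def contour_dsqrtm(X, dX, N=400):
    """d/dt sqrt(X) via Eq. (1), for positive-definite X and X' = dX."""
    lam = np.linalg.eigvalsh(X)
    c, r = (lam.min() + lam.max()) / 2, lam.max() / 2
    I = np.eye(len(X))
    acc = np.zeros_like(I, dtype=complex)
    for tk in np.linspace(0, 2 * np.pi, N, endpoint=False):
        z = c + r * np.exp(1j * tk)
        dz = 1j * r * np.exp(1j * tk)
        R = np.linalg.inv(z * I - X)          # resolvent (z - X)^{-1}
        acc += np.sqrt(z) * (R @ dX @ R) * dz
    return (acc * (2 * np.pi / N) / (2j * np.pi)).real

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)); X = A @ A.T + np.eye(3)
B = rng.standard_normal((3, 3)); dX = B + B.T
h = 1e-6                                      # central-difference check
fd = (sqrtm(X + h * dX) - sqrtm(X - h * dX)) / (2 * h)
print(np.allclose(contour_dsqrtm(X, dX), fd, atol=1e-5))   # True
```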

A convenient expression can be obtained going to the spectral representation of $X$:

$$ X = \sum_n \lambda_n P_n \tag{2} $$

with $\lambda_n, P_n$ respectively the eigenvalues and eigenprojectors. Plugging this into (1) and evaluating the residues, we get

\begin{align} \frac{d}{dt} \sqrt{X} &= \sum_n \frac{1}{2\sqrt{\lambda_n}} P_n X' P_n \\ &\quad + \sum_{n\neq m} \frac{\sqrt{\lambda_n}-\sqrt{\lambda_m}}{\lambda_n - \lambda_m} P_n X' P_m \tag{3} \end{align}

Apparently Eq. (3) is not valid if one of the eigenvalues is zero, much as in @greg's answer. However, looking carefully at the residues, one realizes that if there is a term with $\lambda_{n'}=0$, that residue is zero. In other words, simply remove $n'$ from the first sum in (3).

With this tweak, Eq. (3) is valid in full generality.
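A sketch of Eq. (3) in NumPy, including the zero-eigenvalue tweak (the tolerance `tol` and the test matrices are my own choices for illustration):

```python
import numpy as np
from scipy.linalg import sqrtm

def d_sqrtm(X, dX, tol=1e-12):
    """Eq. (3): derivative of sqrt(X) for Hermitian PSD X, given X' = dX."""
    lam, V = np.linalg.eigh(X)
    H = V.conj().T @ dX @ V                  # X' sandwiched between eigenvectors
    K = np.zeros_like(H)
    for i in range(len(lam)):
        for j in range(len(lam)):
            if abs(lam[i] - lam[j]) <= tol:  # first sum (incl. degenerate pairs)
                if lam[i] > tol:             # drop the residue at lambda = 0
                    K[i, j] = H[i, j] / (2 * np.sqrt(lam[i]))
            else:                            # second sum
                K[i, j] = (np.sqrt(lam[i]) - np.sqrt(lam[j])) / (lam[i] - lam[j]) * H[i, j]
    return V @ K @ V.conj().T

# Finite-difference check on X(t) = X0 + t dX at t = 0.
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)); X0 = A @ A.T + np.eye(5)
B = rng.standard_normal((5, 5)); dX = B + B.T
h = 1e-6
fd = (sqrtm(X0 + h * dX) - sqrtm(X0 - h * dX)) / (2 * h)
print(np.allclose(d_sqrtm(X0, dX), fd, atol=1e-5))   # True
```

For distinct eigenvalues this reproduces greg's Kronecker solve entrywise, since $(\sqrt{\lambda_n}-\sqrt{\lambda_m})/(\lambda_n-\lambda_m)=1/(\sqrt{\lambda_n}+\sqrt{\lambda_m})$.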

lcv
  • That does not work at $t_0$ when $X(t_0)$ is not invertible. The simplest counter-example: take $X(t)=t^2$ (an analytic real function); then $\sqrt{X(t)}=|t|$ is not differentiable at $0$. –  Feb 06 '20 at 14:06
  • It's true that when I wrote the derivative of $\sqrt{X}$ I implicitly assumed said matrix to be differentiable. The same holds for the other answer. But you can be differentiable and still have a zero eigenvalue. – lcv Feb 06 '20 at 14:44
  • That is exactly the question. $\sqrt{X}$ is not differentiable in such a case. Anyway, do as you want... –  Feb 06 '20 at 14:50
  • @loupblanc I don't understand what you're saying, $X$ can have a zero eigenvalue and still $\sqrt{X}$ can be differentiable. Btw I didn't realize the question is a duplicate and already had a perfectly valid answer(s). – lcv Feb 06 '20 at 21:41
  • I only say that if $X(u)$ is differentiable and $X(u_0)$ is invertible, then $S(u)$ is differentiable at $u_0$; moreover $(*)$: $S'(u_0)$ depends only on $S(u_0)$ and $X'(u_0)$; $(*)$ is no longer true when $S(u)$ is differentiable at $u_0$ and $X(u_0)$ has several $0$-eigenvalues. –  Feb 07 '20 at 10:11

There are two explicit forms of the required derivative.

i) We use greg's method, which reduces to solving (in $S'$)

$SS'+S'S=X'$. There is $P\in O(n)$ such that $X=P\operatorname{diag}(\lambda_i)P^T$ and $S=P\operatorname{diag}(\sqrt{\lambda_i})P^T$; let $K=[k_{i,j}]=P^TS'P$ and $H=[h_{i,j}]=P^TX'P$.

We deduce the equation in $K$: $\operatorname{diag}(\sqrt{\lambda_i})\,K+K\,\operatorname{diag}(\sqrt{\lambda_i})=H$.

We easily obtain $k_{i,j}=\dfrac{h_{i,j}}{\sqrt{\lambda_i}+\sqrt{\lambda_j}}$ and $S'=PKP^T$.

ii) We use the real convergent integral $S'=\int_0^{\infty}e^{-tS}X'e^{-tS}\,dt$, which converges when $S$ is positive definite.
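Both recipes can be checked against each other numerically; a sketch (the test matrices are made up, and SciPy's `quad_vec` handles the semi-infinite interval):

```python
import numpy as np
from scipy.linalg import expm, sqrtm
from scipy.integrate import quad_vec

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)); X = A @ A.T + np.eye(4)   # positive definite
B = rng.standard_normal((4, 4)); dX = B + B.T

# (i) eigendecomposition: k_ij = h_ij / (sqrt(l_i) + sqrt(l_j))
lam, P = np.linalg.eigh(X)
H = P.T @ dX @ P
K = H / (np.sqrt(lam)[:, None] + np.sqrt(lam)[None, :])
dS_i = P @ K @ P.T

# (ii) S' = int_0^inf exp(-tS) X' exp(-tS) dt, convergent since S is positive definite
S = sqrtm(X)
dS_ii, _ = quad_vec(lambda t: expm(-t * S) @ dX @ expm(-t * S), 0, np.inf)

print(np.allclose(dS_i, dS_ii))   # True
```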

For the details, see my post in

Derivative (or differential) of symmetric square root of a matrix