16

Let the square matrix $A$ be symmetric and positive definite. Let $S$ be its symmetric square root found via singular value decomposition (SVD). Let $\operatorname{vech}$ be the half-vectorization operator. Is there a convenient expression for the derivative (or differential) of $\operatorname{vech} (S)$ with respect to $\operatorname{vech} (A)$?

I know the expression for the derivative of an inverse, which is sort of like a matrix version of the power rule. Would that approach work for a symmetric square root as well (i.e., is the derivative $\tfrac{1}{2}A^{-1/2}$)?

Scott
  • You can easily get an expression for $dS$ in terms of $A$ and $dA$ using the product rule applied to $d(S^2)$. This will involve some matrix multiplication; I'm not sure what the final expression in terms of "half-vectorization" would look like. – Anthony Carapetis Oct 26 '13 at 14:40
  • Ah, thanks. Yes, it amounts to expanding $\mathrm{d}A = \mathrm{d}(S^2)$, then isolating the $\mathrm{d}S$ terms. – Scott Oct 26 '13 at 16:06

3 Answers

19

What follows is an extension of the previous comments, deriving an explicit expression in terms of the Kronecker sum. Applying the differential $\mathrm{d}(\cdot)$ to both sides of $\sqrt{A}\sqrt{A} = A$ yields a special case of the Sylvester equation, $$(\mathrm{d}\sqrt{A}) \sqrt{A} \: + \: \sqrt{A} (\mathrm{d}\sqrt{A}) = \mathrm{d}A, $$ which can be solved for the differential matrix $\mathrm{d}\sqrt{A}$ as $$ \text{vec}(\mathrm{d}\sqrt{A}) = \left(\sqrt{A}^{\top} \oplus \sqrt{A}\right)^{-1} \text{vec}(\mathrm{d}A). $$ Since $A$ is positive definite, $\sqrt{A}$ is unique and positive definite, and hence the Kronecker sum is positive definite (thus nonsingular).

Further, since the differential and the vec operator can be interchanged on the left-hand side of the equation above, the Jacobian identification rule (p. 198 in Magnus and Neudecker, *Matrix Differential Calculus with Applications in Statistics and Econometrics*, 3rd ed., chapter 9, section 5) gives $$ \mathrm{D}\sqrt{A} = \left(\sqrt{A}^{\top} \oplus \sqrt{A}\right)^{-1}, $$ where the transpose can be dropped if $A$, in addition to being positive definite, is also symmetric, as the OP asked.

Notice that for a generic matrix function $F: \mathbb{R}^{p\times q} \mapsto \mathbb{R}^{m\times n}$, the Jacobian is defined as $\mathrm{D}F(X) \triangleq \displaystyle\frac{\partial \, \text{vec}(F(X))}{\partial \, (\text{vec}(X))^{\top}}$ and is of size $mn \times pq$.
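
As a quick numerical sanity check of this Jacobian (a minimal Python/NumPy sketch; the random test matrix and step size are purely illustrative), one can compare the action of $\left(\sqrt{A}^{\top}\oplus\sqrt{A}\right)^{-1}$ against a central finite difference of $\text{vec}(\sqrt{A})$:

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)        # random symmetric positive definite A

S = sqrtm(A).real                  # unique SPD square root
I = np.eye(n)
J = np.linalg.inv(np.kron(S, I) + np.kron(I, S))   # (sqrt(A)^T ⊕ sqrt(A))^{-1}, S symmetric

H = rng.standard_normal((n, n)); H = H + H.T       # symmetric perturbation direction
eps = 1e-6
fd = (sqrtm(A + eps * H).real - sqrtm(A - eps * H).real) / (2 * eps)
pred = (J @ H.reshape(-1, order="F")).reshape(n, n, order="F")
print(np.max(np.abs(fd - pred)))   # small (~1e-9): the Jacobian matches
```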

  • Can you explain how you got $\text{vec}(\mathrm{d}\sqrt{A}) = \left(\sqrt{A}^{\top} \oplus \sqrt{A}\right)^{-1} \text{vec}(\mathrm{d}A)$? – XYZABC Sep 18 '19 at 14:05
  • The step before is a special case of the Sylvester equation $PX + XQ = R$, whose solution is known to be expressible in vectorized form: $(I \otimes P + Q^{\top} \otimes I)\,\text{vec}(X) = \text{vec}(R)$. For us, $P = Q = \sqrt{A}$, $X = \mathrm{d}\sqrt{A}$, $R = \mathrm{d}A$. Then use the definition of the Kronecker sum. – Abhishek Halder Sep 24 '19 at 07:41
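
A minimal check of that vectorization identity (NumPy sketch with generic random $P$, $Q$, $X$; purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
P, Q, X = (rng.standard_normal((n, n)) for _ in range(3))
R = P @ X + X @ Q                  # Sylvester equation PX + XQ = R

# vec(PX + XQ) = (I ⊗ P + Q^T ⊗ I) vec(X), with column-major vec
lhs = (np.kron(np.eye(n), P) + np.kron(Q.T, np.eye(n))) @ X.reshape(-1, order="F")
print(np.allclose(lhs, R.reshape(-1, order="F")))  # True
```
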
11

The function $A\mapsto S=\sqrt{A}$ is defined and differentiable on the set of SPD matrices. Let $K=DS_A(H)$ be the derivative of $S$ at $A$ applied to a variable symmetric matrix $H$. Then $SS=A$ implies $KS+SK=H$, a Sylvester equation in the unknown $K$. Since the problem is equivariant under orthogonal conjugation, we may assume $A=\operatorname{diag}((\lambda_i)_i)$ with $\lambda_i>0$; then $S=\operatorname{diag}((\sqrt{\lambda_i})_i)$. If $H=[h_{i,j}]$ and $K=[k_{i,j}]$, then, by an easy entrywise identification, we obtain $k_{i,j}=\dfrac{h_{i,j}}{\sqrt{\lambda_i}+\sqrt{\lambda_j}}$. Of course, if $n=1$, we recover the usual derivative $h_{1,1}\mapsto \dfrac{h_{1,1}}{2\sqrt{\lambda_1}}$.
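
For a general SPD $A$ (not already diagonal), the same formula applies in the eigenbasis: diagonalize $A=U\Lambda U^{\top}$, transform $H$ into that basis, divide entrywise, and transform back. A minimal NumPy sketch of this recipe, checked against a finite difference (test data is illustrative):

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(2)
n = 4
B = rng.standard_normal((n, n)); A = B @ B.T + n * np.eye(n)
H = rng.standard_normal((n, n)); H = H + H.T       # symmetric direction

lam, U = np.linalg.eigh(A)                         # A = U diag(lam) U^T
Ht = U.T @ H @ U                                   # H in the eigenbasis
K = U @ (Ht / np.add.outer(np.sqrt(lam), np.sqrt(lam))) @ U.T

eps = 1e-6
fd = (sqrtm(A + eps * H).real - sqrtm(A - eps * H).real) / (2 * eps)
print(np.max(np.abs(K - fd)))                      # small (~1e-9)
```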

EDIT 1. Regarding the half-vectorization operator: we need only store half of the matrix $K$, because it is symmetric, like $H$.

EDIT 2. Another form of $K$ is $K=\int_0^{\infty}e^{-tS}He^{-tS}\,dt$. It follows that if $H$ is a small symmetric perturbation, then $\sqrt{A+H}\approx \sqrt{A}+\int_0^{\infty}e^{-t\sqrt{A}}He^{-t\sqrt{A}}\,dt$.

EDIT 3. Proof of the above result. The integral converges (easy), and it suffices to prove that $KS+SK=H$ (the solution in $K$ of this equation is unique). One has $$KS+SK=\int_0^{+\infty}\left(e^{-tS}He^{-tS}S+Se^{-tS}He^{-tS}\right)dt=-\int_0^{+\infty}\left(e^{-tS}He^{-tS}\right)'\,dt=H,$$ since $e^{-tS}He^{-tS}$ equals $H$ at $t=0$ and tends to $0$ as $t\rightarrow+\infty$.
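
A numerical check of this integral representation (SciPy sketch; the finite upper limit stands in for $\infty$ since $e^{-tS}$ decays rapidly here, and all test data is illustrative):

```python
import numpy as np
from scipy.linalg import expm, solve_sylvester, sqrtm
from scipy.integrate import quad_vec

rng = np.random.default_rng(3)
n = 3
B = rng.standard_normal((n, n)); A = B @ B.T + n * np.eye(n)
S = sqrtm(A).real
H = rng.standard_normal((n, n)); H = H + H.T

# integral form of K; e^{-tS} is negligible well before t = 40 for this S
K_int, _ = quad_vec(lambda t: expm(-t * S) @ H @ expm(-t * S), 0, 40.0)
K_syl = solve_sylvester(S, S, H)       # unique solution of K S + S K = H
print(np.max(np.abs(K_int - K_syl)))   # tiny: the two forms agree
```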

6

To summarize Abhishek's post: $$\eqalign{ A &= S^2,\quad a={\rm vec}(A),\quad s={\rm vec}(S),\quad M=S^T\oplus S \cr ds &= M^{-1}\,da \cr }$$ However, to answer the original question, one must introduce the duplication and elimination matrices $D_n$ and $L_n$: $$\eqalign{ \alpha &= {\rm vech}(A),\quad \alpha=L_na,\quad a=D_n\alpha \cr \sigma &= {\rm vech}(S),\quad \sigma=L_ns,\quad s=D_n\sigma \cr L_n\,ds &= L_nM^{-1}(D_nL_n\,da) \cr d\sigma &= L_nM^{-1}D_n\,\,d\alpha \cr \frac{\partial\sigma}{\partial\alpha} &= L_nM^{-1}D_n \cr }$$ (Replacing $da$ by $D_nL_n\,da$ in the third line is valid because $\mathrm{d}A$ is symmetric.)
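
A minimal NumPy sketch of this chain (the `dup_elim` constructor below is illustrative, using column-major ${\rm vec}$/${\rm vech}$ over the lower triangle):

```python
import numpy as np
from scipy.linalg import sqrtm

def dup_elim(n):
    # D_n: vec(A) = D_n vech(A) for symmetric A; L_n: vech(A) = L_n vec(A)
    m = n * (n + 1) // 2
    D = np.zeros((n * n, m))
    L = np.zeros((m, n * n))
    k = 0
    for j in range(n):
        for i in range(j, n):          # lower triangle, column by column
            L[k, j * n + i] = 1
            D[j * n + i, k] = 1
            D[i * n + j, k] = 1
            k += 1
    return D, L

rng = np.random.default_rng(4)
n = 3
B = rng.standard_normal((n, n)); A = B @ B.T + n * np.eye(n)
S = sqrtm(A).real
I = np.eye(n)

D, L = dup_elim(n)
M = np.kron(S, I) + np.kron(I, S)      # S^T ⊕ S, with S symmetric
J_vech = L @ np.linalg.inv(M) @ D      # d sigma / d alpha = L M^{-1} D

# finite-difference check in a symmetric direction H
H = rng.standard_normal((n, n)); H = H + H.T
eps = 1e-6
fd = (sqrtm(A + eps * H).real - sqrtm(A - eps * H).real) / (2 * eps)
vech = lambda X: L @ X.reshape(-1, order="F")
print(np.max(np.abs(J_vech @ vech(H) - vech(fd))))   # small (~1e-9)
```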

greg