
Let matrices $A$ and $B$ be symmetric and positive semidefinite (PSD). I'm trying to calculate the derivative of the Bures distance

$$d(A,B) := \mbox{tr} \left( A + B - 2 \left( A^{1/2} B A^{1/2} \right)^{1/2} \right)$$

with respect to $A$. I'm having trouble with the last term, $2(A^{1/2}BA^{1/2})^{1/2}$, which involves matrix square roots. Does anyone know a nice way to compute this?

user550103
Yannik

1 Answer


As pointed out in the comments, for PSD matrices a drastic simplification is possible. When $A$ is invertible, $A^{1/2}BA^{1/2}$ and $BA$ are similar and therefore share the same eigenvalues, so $$\eqalign{ {\rm Tr}((A^{1/2}BA^{1/2})^{1/2}) &= {\rm Tr}((BA)^{1/2}) \\ }$$ In addition, there is a general result for the differential of the trace of any matrix function $$\eqalign{ d\,{\rm Tr}\big(f(X)\big) &= f'(X^T):dX \\ }$$ where $f'$ is the ordinary derivative of the scalar function $f;\,$ both $f$ and $f'$ are evaluated at their respective matrix arguments, and the colon denotes the Frobenius inner product, $X:Y = {\rm Tr}(X^TY)$.
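The trace identity above can be spot-checked numerically. The following is a minimal sketch using NumPy only; the helpers `sym_sqrt` and `random_spd` are hypothetical names introduced here, and the test matrices are assumed to be positive definite (not merely PSD) so that everything is well conditioned.

```python
import numpy as np

def sym_sqrt(S):
    """Principal square root of a symmetric PSD matrix via eigendecomposition."""
    w, V = np.linalg.eigh((S + S.T) / 2)   # symmetrize to suppress round-off
    w = np.clip(w, 0.0, None)              # guard against tiny negative eigenvalues
    return (V * np.sqrt(w)) @ V.T

def random_spd(rng, n):
    """Random symmetric positive definite test matrix (illustrative only)."""
    X = rng.standard_normal((n, n))
    return X @ X.T + np.eye(n)

rng = np.random.default_rng(0)
A = random_spd(rng, 4)
B = random_spd(rng, 4)

# Left-hand side: Tr((A^(1/2) B A^(1/2))^(1/2))
rA = sym_sqrt(A)
lhs = np.trace(sym_sqrt(rA @ B @ rA))

# Right-hand side: Tr((BA)^(1/2)) is the sum of square roots of the eigenvalues
# of BA, which are real and nonnegative because BA is similar to a PSD matrix.
eig_BA = np.linalg.eigvals(B @ A)
rhs = np.sum(np.sqrt(np.clip(eig_BA.real, 0.0, None)))

print(abs(lhs - rhs))  # expected to be at round-off level
```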

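Combining these yields a straightforward solution for the problematic term $$\eqalign{ \phi &= {\rm Tr}\Big((BA)^{1/2}\Big) \\ d\phi &= \tfrac 12\big((BA)^T\big)^{-1/2}:d(BA) \\ &= \tfrac 12(AB)^{-1/2}:B\,dA \\ &= \tfrac 12 B(AB)^{-1/2}:dA \\ \frac{\partial\phi}{\partial A} &= \tfrac 12 B(AB)^{-1/2} \;=\; \tfrac 12 (BA)^{-1/2}B \\ }$$ Here $B$ is held constant, so $d(BA) = B\,dA$, and the symmetry of $B$ lets it move across the colon. The inverse square roots require $A$ and $B$ to be invertible, i.e. positive definite rather than merely PSD. The final equality is a theorem due to Higham, which holds for any function $f$ for which both sides are defined: $$B\cdot f(AB) = f(BA)\cdot B$$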
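Higham's identity with $f(x) = x^{-1/2}$ can also be checked numerically. This is a sketch under the assumption that $A$ and $B$ are symmetric positive definite; `sym_power`, `inv_sqrt_of_product` and `random_spd` are hypothetical helper names. The two sides are computed independently, each through its own similarity transform to a symmetric matrix.

```python
import numpy as np

def sym_power(S, p):
    """S**p for a symmetric positive definite matrix S, via eigendecomposition."""
    w, V = np.linalg.eigh((S + S.T) / 2)   # symmetrize to suppress round-off
    return (V * w**p) @ V.T

def inv_sqrt_of_product(P, Q):
    """(PQ)^(-1/2) for symmetric positive definite P and Q.

    Uses the similarity PQ = P^(1/2) (P^(1/2) Q P^(1/2)) P^(-1/2).
    """
    rP, irP = sym_power(P, 0.5), sym_power(P, -0.5)
    return rP @ sym_power(rP @ Q @ rP, -0.5) @ irP

def random_spd(rng, n):
    """Random symmetric positive definite test matrix (illustrative only)."""
    X = rng.standard_normal((n, n))
    return X @ X.T + np.eye(n)

rng = np.random.default_rng(1)
A = random_spd(rng, 4)
B = random_spd(rng, 4)

lhs = B @ inv_sqrt_of_product(A, B)   # B (AB)^(-1/2)
rhs = inv_sqrt_of_product(B, A) @ B   # (BA)^(-1/2) B
higham_gap = np.max(np.abs(lhs - rhs))

print(higham_gap)  # expected to be at round-off level
```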

Therefore the gradient of the Bures distance is $$\eqalign{ \beta(A,B) &= {\rm Tr}\Big(A+B - 2(BA)^{1/2} \Big) \\ d\beta &= \Big(I - B(AB)^{-1/2}\Big):dA \\ \frac{\partial\beta}{\partial A} &= I - B(AB)^{-1/2} \;=\; I - (BA)^{-1/2}B \\ &= I - A^{-1}(AB)^{1/2} \;=\; I - (BA)^{1/2}A^{-1} \\ }$$ All four gradient expressions are equivalent, and although it's not immediately obvious, the gradient is a symmetric matrix: the first two expressions are transposes of each other, and Higham's identity shows that they are equal.
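A central finite difference gives a quick sanity check of this gradient. The sketch below assumes $A$ and $B$ are symmetric positive definite and uses the rewriting $A^{-1}(AB)^{1/2} = A^{-1/2}(A^{1/2}BA^{1/2})^{1/2}A^{-1/2}$, which follows from $(AB)^{1/2} = A^{1/2}(A^{1/2}BA^{1/2})^{1/2}A^{-1/2}$ and makes the symmetry explicit. The function names `bures`, `bures_grad_A`, `sym_power` and `random_spd` are hypothetical.

```python
import numpy as np

def sym_power(S, p):
    """S**p for a symmetric positive definite matrix S, via eigendecomposition."""
    w, V = np.linalg.eigh((S + S.T) / 2)   # symmetrize to suppress round-off
    return (V * w**p) @ V.T

def bures(A, B):
    """Tr(A + B - 2 (A^(1/2) B A^(1/2))^(1/2)) for symmetric positive definite A, B."""
    rA = sym_power(A, 0.5)
    return np.trace(A) + np.trace(B) - 2.0 * np.trace(sym_power(rA @ B @ rA, 0.5))

def bures_grad_A(A, B):
    """I - A^(-1) (AB)^(1/2), written as I - A^(-1/2) (A^(1/2) B A^(1/2))^(1/2) A^(-1/2)."""
    rA, irA = sym_power(A, 0.5), sym_power(A, -0.5)
    S = sym_power(rA @ B @ rA, 0.5)
    return np.eye(len(A)) - irA @ S @ irA

def random_spd(rng, n):
    """Random symmetric positive definite test matrix (illustrative only)."""
    X = rng.standard_normal((n, n))
    return X @ X.T + np.eye(n)

rng = np.random.default_rng(2)
A = random_spd(rng, 4)
B = random_spd(rng, 4)

# Random symmetric perturbation direction, so A +/- h*E stays symmetric.
Z = rng.standard_normal((4, 4))
E = (Z + Z.T) / 2

h = 1e-6
fd = (bures(A + h * E, B) - bures(A - h * E, B)) / (2 * h)  # directional derivative
G = bures_grad_A(A, B)
analytic = np.sum(G * E)                                    # Frobenius product G : E

print(fd, analytic)  # the two values should agree to several digits
```

Perturbing only along symmetric directions keeps the eigendecomposition-based square root valid; since the gradient is itself symmetric, such directions are enough to test it.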

The gradient wrt $B$ can be derived in an analogous manner. $$\eqalign{ \frac{\partial\beta}{\partial B} &= I - A(BA)^{-1/2} \;\;=\; I - (AB)^{-1/2}A \\ &= I - B^{-1}(BA)^{1/2} \;=\; I - (AB)^{1/2}B^{-1} \\ }$$
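As a sanity check on both gradients, consider the scalar ($1\times 1$) case with $a, b > 0$, where every matrix expression collapses to ordinary calculus: $$\eqalign{ \beta(a,b) &= a + b - 2\sqrt{ab} \;=\; \big(\sqrt a - \sqrt b\big)^2 \\ \frac{\partial\beta}{\partial a} &= 1 - \sqrt{\frac ba} \;=\; 1 - b\,(ab)^{-1/2} \\ \frac{\partial\beta}{\partial b} &= 1 - \sqrt{\frac ab} \;=\; 1 - a\,(ba)^{-1/2} \\ }$$ These agree with the matrix formulas $I - B(AB)^{-1/2}$ and $I - A(BA)^{-1/2}$, and both derivatives vanish when $a = b$, as expected at the minimum of the distance.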

greg