$$ \left\|Y-XX^T \right\|_{\text{F}}^2$$

where $X$ and $Y$ are matrices. Taking the derivative w.r.t. $X$ yields

$$-2(Y-XX^T)X$$

Why is this so?

Parcly Taxel
  • 105,904
Ben
  • 11
  • Did you try to search within MSE? There are several questions like this. For instance, see https://math.stackexchange.com/questions/2128462/derivative-of-squared-frobenius-norm-of-a-matrix?rq=1. I hope that helps. – user550103 Oct 20 '20 at 09:35
  • Yes, but here the variable occurs twice, so things are different. – Ben Oct 21 '20 at 02:38

1 Answer

Some notation:

  • Trace and Frobenius product relation $$\left\langle A, B C\right\rangle={\rm tr}(A^TBC) := A : B C$$
  • Cyclic properties of Trace/Frobenius product \begin{align} A : B C &= BC : A \\ &= A C^T : B \\ &= A^T: C^TB^T \\ &= {\text{etc.}} \cr \end{align}
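These identities are easy to spot-check numerically. The snippet below is only a sketch: the helper name `frob`, the matrix shapes, and the random seed are illustrative assumptions, not part of the original answer. The last entry, $B^TA : C$, is one of the rearrangements covered by "etc." above.

```python
import numpy as np

def frob(A, B):
    """Frobenius inner product A : B = tr(A^T B)."""
    return np.trace(A.T @ B)

# Arbitrary compatible shapes: A is 3x4, B is 3x5, C is 5x4, so BC is 3x4.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 5))
C = rng.standard_normal((5, 4))

lhs = frob(A, B @ C)  # A : BC

# Each rearrangement should give the same scalar as A : BC.
checks = {
    "BC : A": frob(B @ C, A),
    "A C^T : B": frob(A @ C.T, B),
    "A^T : C^T B^T": frob(A.T, C.T @ B.T),
    "B^T A : C": frob(B.T @ A, C),
}

for name, value in checks.items():
    print(f"{name:>14} matches A : BC -> {np.isclose(lhs, value)}")
```

A random test like this does not prove the identities, but a mismatch would immediately flag a misplaced transpose.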

Let $f := \left\|Y - XX^T\right\|_F^2 \equiv \left(Y - XX^T\right):\left(Y - XX^T\right)$.

Obtain the differential followed by the gradient (aka Jacobian). \begin{align} df &= d\left[\left(Y - XX^T\right):\left(Y - XX^T\right)\right] \\ &= \left[\left(-dX\,X^T - X\,dX^T \right):\left(Y - XX^T\right) \right] + \left[\left(Y - XX^T\right) : \left(-dX\,X^T - X\,dX^T \right)\right] \\ &= -2 \left(Y - XX^T\right) : \left(dX\,X^T + X\,dX^T \right) \\ &= \left[-2 \left(Y - XX^T\right) : dX\,X^T \right] + \left[-2 \left(Y - XX^T\right) : X\,dX^T \right]\\ &= \left[-2 \left(Y - XX^T\right)X : dX \right] + \left[-2 X^T\left(Y - XX^T\right) : dX^T \right]\\ &= \left[-2 \left(Y - XX^T\right)X : dX \right] + \left[-2 \left(Y^T - XX^T\right)X : dX \right] \end{align}

Thus, the gradient is \begin{align} \frac{\partial f}{\partial X} = -2\left[\left(Y + Y^T \right) - 2XX^T\right]X. \end{align}
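As a sanity check, the closed-form gradient can be compared against central finite differences. This is a minimal sketch, not part of the original derivation: the function names (`f`, `grad_f`, `numerical_grad`), the sizes `n` and `k`, the step `eps`, and the random data are all assumptions chosen for illustration.

```python
import numpy as np

def f(X, Y):
    """Objective ||Y - X X^T||_F^2."""
    R = Y - X @ X.T
    return np.sum(R * R)

def grad_f(X, Y):
    """Closed-form gradient -2[(Y + Y^T) - 2 X X^T] X."""
    return -2.0 * ((Y + Y.T) - 2.0 * X @ X.T) @ X

def numerical_grad(X, Y, eps=1e-6):
    """Central finite differences, perturbing one entry of X at a time."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (f(X + E, Y) - f(X - E, Y)) / (2.0 * eps)
    return G

rng = np.random.default_rng(0)
n, k = 4, 2                        # illustrative sizes: X is n x k, Y is n x n
X = rng.standard_normal((n, k))
Y = rng.standard_normal((n, n))    # deliberately not symmetric

max_err = np.max(np.abs(grad_f(X, Y) - numerical_grad(X, Y)))
print("max abs difference, general Y:", max_err)

# Special case: for symmetric Y the formula collapses to -4 (Y - X X^T) X.
Y_sym = 0.5 * (Y + Y.T)
sym_gap = np.max(np.abs(grad_f(X, Y_sym) - (-4.0 * (Y_sym - X @ X.T) @ X)))
print("max abs difference, symmetric-Y shortcut:", sym_gap)
```

When $Y$ is symmetric, $Y + Y^T = 2Y$ and the gradient reduces to $-4\left(Y - XX^T\right)X$. That has the same form as the expression quoted in the question but differs from it by a factor of 2, because $X$ appears twice in $XX^T$ and each occurrence contributes one term to the differential.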

user550103
  • 2,773