$$ \left\|Y-XX^T \right\|_{\text{F}}^2$$
where $X$ and $Y$ are matrices. Taking the derivative with respect to $X$ yields
$$-2(Y-XX^T)X$$
Why is this so?
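Some notation: let $A : B := \operatorname{tr}\left(A^T B\right) = \sum_{i,j} A_{ij} B_{ij}$ denote the Frobenius inner product, so that $A : A = \left\|A\right\|_F^2$.
Let $f := \left\|Y - XX^T\right\|_F^2 = \left(Y - XX^T\right) : \left(Y - XX^T\right)$.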
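The colon notation can be sanity-checked numerically. Reading $A : B$ as the Frobenius inner product $\operatorname{tr}(A^T B)$, the NumPy sketch below checks that $A : A$ equals the squared Frobenius norm and that factors can be moved across the colon, $A : BC = B^T A : C = AC^T : B$, which is the rule the derivation relies on. The helper name `frob_inner` and the matrix sizes are illustrative assumptions.

```python
import numpy as np


def frob_inner(A, B):
    """Frobenius inner product A : B = sum_ij A_ij * B_ij = tr(A^T B)."""
    return float(np.sum(A * B))


rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 3))
C = rng.standard_normal((3, 4))

# A : A equals the squared Frobenius norm of A.
norm_sq = np.linalg.norm(A, "fro") ** 2

# Moving a factor across the colon: A : BC = B^T A : C = A C^T : B.
lhs = frob_inner(A, B @ C)
mid = frob_inner(B.T @ A, C)
rhs = frob_inner(A @ C.T, B)

print(np.isclose(frob_inner(A, A), norm_sq))
print(np.isclose(lhs, mid), np.isclose(lhs, rhs))
```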
Obtain the differential first, then read off the gradient. The steps use $A : BC = AC^T : B = B^T A : C$ and $A : B = A^T : B^T$.
\begin{align}
df &= d\left[\left(Y - XX^T\right) : \left(Y - XX^T\right)\right] \\
&= \left[\left(-dX\,X^T - X\,dX^T\right) : \left(Y - XX^T\right)\right] + \left[\left(Y - XX^T\right) : \left(-dX\,X^T - X\,dX^T\right)\right] \\
&= -2\left(Y - XX^T\right) : \left(dX\,X^T + X\,dX^T\right) \\
&= \left[-2\left(Y - XX^T\right) : dX\,X^T\right] + \left[-2\left(Y - XX^T\right) : X\,dX^T\right] \\
&= \left[-2\left(Y - XX^T\right)X : dX\right] + \left[-2X^T\left(Y - XX^T\right) : dX^T\right] \\
&= \left[-2\left(Y - XX^T\right)X : dX\right] + \left[-2\left(Y^T - XX^T\right)X : dX\right]
\end{align}
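Combining the two terms gives $df = -2\left[\left(Y + Y^T\right) - 2XX^T\right]X : dX$, so the gradient is
\begin{align}
\frac{\partial f}{\partial X} = -2\left[\left(Y + Y^T\right) - 2XX^T\right]X.
\end{align}
If $Y$ is symmetric, this reduces to $-4\left(Y - XX^T\right)X$, which is twice the expression stated in the question.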
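The closed form can be compared against central finite differences as a check. The sketch below is a minimal NumPy version; the function names, the sizes `n` and `k`, and the step `eps` are assumptions chosen for illustration. $Y$ is deliberately not symmetric, so that the $Y + Y^T$ term is exercised.

```python
import numpy as np


def f(X, Y):
    """Objective ||Y - X X^T||_F^2."""
    R = Y - X @ X.T
    return float(np.sum(R * R))


def grad_f(X, Y):
    """Closed-form gradient -2 [(Y + Y^T) - 2 X X^T] X."""
    return -2.0 * ((Y + Y.T) - 2.0 * X @ X.T) @ X


def numerical_grad(X, Y, eps=1e-6):
    """Central finite differences, perturbing one entry of X at a time."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (f(X + E, Y) - f(X - E, Y)) / (2.0 * eps)
    return G


rng = np.random.default_rng(0)
n, k = 5, 3  # hypothetical sizes
X = rng.standard_normal((n, k))
Y = rng.standard_normal((n, n))  # generic, non-symmetric Y

# Largest entrywise gap between the closed form and the numerical estimate.
max_err = np.max(np.abs(grad_f(X, Y) - numerical_grad(X, Y)))
print(max_err)

# Symmetric case: the gradient collapses to -4 (Y - X X^T) X.
Y_sym = Y + Y.T
print(np.allclose(grad_f(X, Y_sym), -4.0 * (Y_sym - X @ X.T) @ X))
```

The printed gap should be small relative to the size of the gradient entries, limited only by finite-difference error.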