Let $A \in \mathbb{R}^{n \times n}$ and $X, Y \in \mathbb{R}^{n \times r}$. Consider the function \begin{equation} H \left( X , Y \right) := \dfrac{1}{2} \left\lVert A - XY^{T} \right\rVert _{F}^{2} , \end{equation} where $\left\lVert \cdot \right\rVert _{F}$ denotes the Frobenius norm. I would like to compute the two gradients $\nabla_{X} H \left( X , Y \right)$ and $\nabla_{Y} H \left( X , Y \right)$.
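The first one seems to be easy. I used the chain rule to get \begin{equation} \nabla_{X} H \left( X , Y \right) = - \left( A - XY^{T} \right) Y , \end{equation} which is $n \times r$, like $X$. (Multiplying by $Y^{T}$ on the right instead would not even be dimensionally consistent, since $A - XY^{T}$ is $n \times n$ and $Y^{T}$ is $r \times n$.)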
For the second one, as $\left\lVert M \right\rVert _{F} = \left\lVert M^{T} \right\rVert _{F}$ and $\left( XY^{T} \right) ^{T} = YX^{T}$, we have $H \left( X , Y \right) = \dfrac{1}{2} \left\lVert A^{T} - YX^{T} \right\rVert _{F}^{2}$, and the chain rule gives \begin{align} \nabla_{Y} H \left( X , Y \right) & = - \left( A^{T} - YX^{T} \right) X \\ & = - \left( A - XY^{T} \right) ^{T} X . \end{align}
Is my $\nabla_{Y} H \left( X , Y \right)$ formula correct?
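One way to sanity-check such formulas is to compare them against central finite differences. The following is a minimal sketch in NumPy, not a proof: it tests the dimensionally consistent candidates $- \left( A - XY^{T} \right) Y$ and $- \left( A - XY^{T} \right) ^{T} X$ on random data. The helper names `H` and `numerical_grad`, the sizes `n = 5`, `r = 2`, and the step `eps` are assumptions made only for this illustration.

```python
import numpy as np

def H(A, X, Y):
    # H(X, Y) = 0.5 * ||A - X Y^T||_F^2
    return 0.5 * np.linalg.norm(A - X @ Y.T, "fro") ** 2

def numerical_grad(f, M, eps=1e-6):
    # Central finite differences, one matrix entry at a time.
    G = np.zeros_like(M)
    for idx in np.ndindex(M.shape):
        E = np.zeros_like(M)
        E[idx] = eps
        G[idx] = (f(M + E) - f(M - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
n, r = 5, 2  # illustrative sizes only
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, r))
Y = rng.standard_normal((n, r))

R = A - X @ Y.T        # residual, n x n
grad_X = -R @ Y        # candidate for the gradient in X, n x r
grad_Y = -R.T @ X      # candidate for the gradient in Y, n x r

num_X = numerical_grad(lambda M: H(A, M, Y), X)
num_Y = numerical_grad(lambda M: H(A, X, M), Y)

# Largest entrywise disagreement between formula and finite differences.
err_X = np.max(np.abs(grad_X - num_X))
err_Y = np.max(np.abs(grad_Y - num_Y))
print(err_X, err_Y)
```

Both errors should be at the level of floating-point noise if the candidates are right. A candidate with the wrong transpose either fails with a shape error or shows an error of the same order as the gradient itself.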
Also, are there other approaches to computing the gradient? I guess we can expand $H \left( X + \delta X , Y \right)$ and then read off the gradient from the part of the difference $H \left( X + \delta X , Y \right) - H \left( X , Y \right)$ that is linear in $\delta X$.
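A sketch of how this perturbation approach could go, using two notations introduced here only for convenience: the residual $R := A - XY^{T}$ and the Frobenius inner product $\left\langle M , N \right\rangle := \operatorname{tr} \left( M^{T} N \right)$, so that $\left\lVert M \right\rVert _{F}^{2} = \left\langle M , M \right\rangle$. Expanding the square gives \begin{align} H \left( X + \delta X , Y \right) - H \left( X , Y \right) & = \dfrac{1}{2} \left\lVert R - \delta X \, Y^{T} \right\rVert _{F}^{2} - \dfrac{1}{2} \left\lVert R \right\rVert _{F}^{2} \\ & = - \left\langle R , \delta X \, Y^{T} \right\rangle + \dfrac{1}{2} \left\lVert \delta X \, Y^{T} \right\rVert _{F}^{2} \\ & = \left\langle - R Y , \delta X \right\rangle + O \left( \left\lVert \delta X \right\rVert _{F}^{2} \right) , \end{align} where the last step uses the cyclic property of the trace: $\operatorname{tr} \left( R^{T} \, \delta X \, Y^{T} \right) = \operatorname{tr} \left( \left( R Y \right) ^{T} \delta X \right)$. The matrix paired with $\delta X$ in the linear term is the gradient, which would give $\nabla_{X} H \left( X , Y \right) = - R Y$. Perturbing $Y$ in the same way, \begin{align} H \left( X , Y + \delta Y \right) - H \left( X , Y \right) & = - \left\langle R , X \, \delta Y^{T} \right\rangle + \dfrac{1}{2} \left\lVert X \, \delta Y^{T} \right\rVert _{F}^{2} \\ & = \left\langle - R^{T} X , \delta Y \right\rangle + O \left( \left\lVert \delta Y \right\rVert _{F}^{2} \right) , \end{align} since $\operatorname{tr} \left( R^{T} X \, \delta Y^{T} \right) = \operatorname{tr} \left( \left( R^{T} X \right) ^{T} \delta Y \right)$, which would give $\nabla_{Y} H \left( X , Y \right) = - R^{T} X$.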