Assume $A \in \mathbb{R}^{m \times n}$ and $X \in \mathbb{R}^{n \times r}$. We want to find the derivative of $f(A(X \circ X)) \in \mathbb{R}$ with respect to $X$.

I think $\nabla_{X}f(A(X \circ X)) \in \mathbb{R}^{n \times r}$, but by the chain rule $\nabla_X f(A(X \circ X)) = \nabla_{A(X \circ X)}f(A(X \circ X)) \frac{d A(X \circ X)}{d X}$, where $\nabla_{A(X \circ X)}f(A(X \circ X)) \in \mathbb{R}^{m \times r}$ and $\circ$ is the Hadamard (element-wise) product, so $(X \circ X)_{ij} = X_{ij}^2$.

I am confused about how to calculate $\frac{d A(X \circ X)}{d X}$ and what its dimension must be so that $\nabla_{X}f(A(X \circ X)) \in \mathbb{R}^{n \times r}$.

According to the generic rules of matrix differentiation for the Hadamard (element-wise) product, and writing $:$ for the Frobenius inner product, it seems I can get $$\begin{aligned} d f(A(X\circ X)) & = \nabla_{A(X\circ X)}f: d(A(X\circ X)) \\ & = A^T (\nabla_{A(X\circ X)}f):d(X\circ X) \\ & = A^T (\nabla_{A(X\circ X)}f):(2X \circ dX) \\ & = A^T (\nabla_{A(X\circ X)}f) \circ 2X dX \end{aligned}$$ I am not sure whether the last equality holds, since I do not know how to deal with $2X \circ dX$.

    Let $G=\nabla f\in{\mathbb R}^{m\times r}$, then so far you have $$df = (A^TG):(2X\circ dX)$$ To finish it off, simply move the Hadamard product to the LHS and leave $dX$ on the RHS. $$\eqalign{ df &= \Big(2X\circ(A^TG)\Big):dX \cr \frac{\partial f}{\partial X} &= \Big(2X\circ(A^TG)\Big) \cr }$$ – greg Oct 19 '20 at 01:01
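The step in the comment above works because the Frobenius and Hadamard products can be regrouped freely: $P:(Q\circ R) = \sum_{ij} P_{ij}Q_{ij}R_{ij} = (P\circ Q):R$. As a sanity check, here is a minimal pure-Python sketch that compares the closed form $2X\circ(A^TG)$ against central finite differences. The test function $f(Y)=\sum_{ij}\sin(Y_{ij})$ (so that $G=\cos(Y)$ element-wise), the matrix sizes, and all helper names are assumptions chosen for illustration; none of them come from the question.

```python
import math
import random

def matmul(P, Q):
    """Plain-Python matrix product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def hadamard(P, Q):
    """Element-wise (Hadamard) product of two same-shaped matrices."""
    return [[p * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

# Assumed test function (not from the question): f(Y) = sum_ij sin(Y_ij).
def f(Y):
    return sum(math.sin(y) for row in Y for y in row)

def grad_f(Y):
    """G = df/dY for the assumed f, i.e. cos applied element-wise."""
    return [[math.cos(y) for y in row] for row in Y]

def objective(A, X):
    return f(matmul(A, hadamard(X, X)))

def analytic_grad(A, X):
    """Closed form from the comment: 2 X o (A^T G), an n x r matrix."""
    G = grad_f(matmul(A, hadamard(X, X)))   # m x r
    AtG = matmul(transpose(A), G)           # n x r
    return [[2.0 * x * g for x, g in zip(rx, rg)] for rx, rg in zip(X, AtG)]

def numeric_grad(A, X, h=1e-6):
    """Central finite differences, perturbing one entry of X at a time."""
    grad = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            old = X[i][j]
            X[i][j] = old + h
            f_plus = objective(A, X)
            X[i][j] = old - h
            f_minus = objective(A, X)
            X[i][j] = old                   # restore the entry
            grad[i][j] = (f_plus - f_minus) / (2 * h)
    return grad

random.seed(0)
m, n, r = 4, 3, 2
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
X = [[random.uniform(-1, 1) for _ in range(r)] for _ in range(n)]

ga = analytic_grad(A, X)
gn = numeric_grad(A, X)
max_err = max(abs(a - b) for ra, rb in zip(ga, gn) for a, b in zip(ra, rb))
print("gradient shape:", len(ga), "x", len(ga[0]))
print("max abs difference:", max_err)
```

The analytic gradient has the same $n \times r$ shape as $X$, which is the dimension the question expected, and no explicit fourth-order object $\frac{d A(X \circ X)}{d X}$ ever needs to be formed.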
