Assume $A \in \mathbb{R}^{m \times n}$ and $X \in \mathbb{R}^{n \times r}$. We want to find the derivative of $f(A(X \circ X)) \in \mathbb{R}$ with respect to $X$.
I think $\nabla_{X}f(A(X \circ X)) \in \mathbb{R}^{n \times r}$, but by the chain rule $\nabla_X f(A(X \circ X)) = \nabla_{A(X \circ X)}f(A(X \circ X)) \frac{d A(X \circ X)}{d X}$, where $\nabla_{A(X \circ X)}f(A(X \circ X)) \in \mathbb{R}^{m \times r}$
and $(X \circ X)_{ij} = X_{ij}^2$ is the element-wise (Hadamard) square.
I am confused about how to calculate $\frac{d A(X \circ X)}{d X}$, and about what its dimension must be so that $\nabla_{X}f(A(X \circ X)) \in \mathbb{R}^{n \times r}$.
According to the generic rules of matrix differentiation (Hadamard product, element-wise), it seems I can get
$$\begin{aligned} d f(A(X\circ X)) & = \nabla_{A(X\circ X)}f: d(A(X\circ X)) \\ & = A^T (\nabla_{A(X\circ X)}f):d(X\circ X) \\ & = A^T (\nabla_{A(X\circ X)}f):(2X \circ dX) \\ & = A^T (\nabla_{A(X\circ X)}f) \circ 2X dX \\ \end{aligned}$$
I am not sure whether the last equality holds, since I do not know how to handle $2X \circ dX$.
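
One way to sanity-check the last step is numerically. Reading $:$ as the Frobenius inner product, the identity $M : (N \circ P) = (M \circ N) : P$ applied to the third line would suggest the candidate gradient $\nabla_X f = 2X \circ \big(A^T \nabla_{A(X \circ X)} f\big) \in \mathbb{R}^{n \times r}$. The sketch below compares that candidate against central finite differences. The test function `f(Y) = sum(sin(Y))`, the sizes `m, n, r`, and all helper names are assumptions made only for illustration, not part of the question.

```python
import numpy as np

def f(Y):
    # Hypothetical smooth scalar test function (an assumption, not from the question).
    return np.sum(np.sin(Y))

def grad_f(Y):
    # Gradient of the test function with respect to its argument Y = A (X ∘ X).
    return np.cos(Y)

def candidate_gradient(A, X):
    # Candidate read off the differential: 2 X ∘ (A^T G), with G = ∇_Y f.
    G = grad_f(A @ (X * X))        # shape (m, r)
    return 2.0 * X * (A.T @ G)     # shape (n, r), same as X

def finite_difference_gradient(A, X, eps=1e-6):
    # Central differences, perturbing one entry of X at a time.
    num = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            f_plus = f(A @ ((X + E) * (X + E)))
            f_minus = f(A @ ((X - E) * (X - E)))
            num[i, j] = (f_plus - f_minus) / (2.0 * eps)
    return num

rng = np.random.default_rng(0)
m, n, r = 4, 3, 2                  # arbitrary small sizes for the check
A = rng.standard_normal((m, n))
X = rng.standard_normal((n, r))

analytic = candidate_gradient(A, X)
numeric = finite_difference_gradient(A, X)
max_abs_diff = np.max(np.abs(analytic - numeric))
print(analytic.shape, max_abs_diff)
```

If the printed difference is at the level of finite-difference noise, that supports writing the last line as $\big(2X \circ A^T \nabla_{A(X \circ X)} f\big) : dX$ rather than as a matrix product with $dX$.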