Which function on $\mathbf{X}$ has gradient $\mathbf{A} \mathbf{A}^\top \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1} $?

Question

I learned from The Matrix Cookbook that the gradient of the $\log \det$ function is given by

\begin{equation} \nabla \log \text{det}(\mathbf{X}^\top \mathbf{X})=2\mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1}, \end{equation}

where $\mathbf{X}\in\mathbb{R}^{n\times r}$. I wonder which function will give the gradient

\begin{equation} 2\mathbf{A} \mathbf{A}^\top \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1}, \end{equation}

for some matrix $\mathbf{A}\in \mathbb{R}^{n\times r}$.
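The $\log\det$ identity quoted above can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy is available and arranging the gradient as an $n\times r$ matrix of partial derivatives; the helper names `logdet_gram`, `claimed_gradient`, and `finite_difference_gradient` are hypothetical and chosen only for illustration.

```python
import numpy as np

def logdet_gram(X):
    """f(X) = log det(X^T X), for X with full column rank."""
    _sign, logabsdet = np.linalg.slogdet(X.T @ X)
    return logabsdet

def claimed_gradient(X):
    """Closed form 2 X (X^T X)^{-1} from the identity above."""
    return 2.0 * X @ np.linalg.inv(X.T @ X)

def finite_difference_gradient(f, X, h=1e-6):
    """Entrywise central differences, arranged in the same shape as X."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (f(X + E) - f(X - E)) / (2.0 * h)
    return G

# Arbitrary test matrix with n = 5, r = 3 (an assumption for the demo).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))

numeric = finite_difference_gradient(logdet_gram, X)
max_error = np.max(np.abs(numeric - claimed_gradient(X)))
print(max_error)
```

If the identity holds, `max_error` should be at the level of finite-difference noise.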

The gradient of a function is a vector, not a matrix. Your first equation does not make sense to me. — Dog_69, Aug 04 '18 at 21:29
Hi, I'm arranging it in a matrix form. It is similar to the fact that the gradient of $\frac{1}{2}\|\mathbf{X}\|_{\mathrm{F}}^2$ is $\mathbf{X}$. — Wuchen, Aug 04 '18 at 22:18
In this case the $r$ of $\mathbb R^{n\times r}$ should be $1$. — Dog_69, Aug 05 '18 at 05:04
Because vectors belong to $\mathbb R^n$. You can arrange an $n$-tuple as a column (or row) matrix, but then you get an $n\times 1$ ($1\times n$) matrix. — Dog_69, Aug 06 '18 at 15:13
@Dog_69 See https://math.stackexchange.com/questions/2807864/derivative-of-the-trace-of-the-product-of-a-matrix-and-its-transpose/2809102#2809102 — Jean-Claude Arbaut, Aug 07 '18 at 06:03

score 1 · Accepted Answer · 2018-08-09T09:32:44.617

In general, there is no solution. What follows is a counterexample to the existence of $g$ s.t. $\nabla(g)=2BX(X^TX)^{-1}$ for every $X$ s.t. $\operatorname{rank}(X)=r$ (where $B\in M_n$).

Let $n=2,r=1, B=\begin{pmatrix}a&b\\c&d\end{pmatrix},X=[x,y]^T\not= [0,0]^T$.

$\dfrac{\partial g}{\partial x}=\dfrac{2}{x^2+y^2}(ax+by),\dfrac{\partial g}{\partial y}=\dfrac{2}{x^2+y^2}(cx+dy)$. Thus

$\dfrac{\partial^2 g}{\partial x \partial y}=\dfrac{2}{(x^2+y^2)^2}(b(x^2-y^2)-2axy),\dfrac{\partial^2 g}{\partial y \partial x}=\dfrac{2}{(x^2+y^2)^2}(c(y^2-x^2)-2dxy)$.

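Thus $g$ exists locally iff $a=d,b=-c$, that is $B=\begin{pmatrix}a&b\\-b&a\end{pmatrix}$. In particular, $B$ is never of the form $AA^T$ except when $B=0$: $AA^T$ is symmetric, which forces $b=0$ and $B=aI_2$, while $AA^T$ has rank at most $r=1$, which forces $a=0$.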
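The mixed-partial condition can be illustrated numerically. This is a minimal sketch using only the Python standard library; the function names `field` and `mixed_partial_gap`, the test point, and the two sample matrices are assumptions chosen for the demonstration. It differentiates the two components of $2BX/(x^2+y^2)$ by central differences and compares the cross derivatives.

```python
def field(B, x, y):
    """Candidate gradient 2 B X / (x^2 + y^2) for X = (x, y)^T."""
    (a, b), (c, d) = B
    s = x * x + y * y
    return 2.0 * (a * x + b * y) / s, 2.0 * (c * x + d * y) / s

def mixed_partial_gap(B, x, y, h=1e-5):
    """d/dy of the first component minus d/dx of the second component."""
    dP_dy = (field(B, x, y + h)[0] - field(B, x, y - h)[0]) / (2.0 * h)
    dQ_dx = (field(B, x + h, y)[1] - field(B, x - h, y)[1]) / (2.0 * h)
    return dP_dy - dQ_dx

point = (0.7, -1.3)  # any point other than the origin would do

# B with a = d and b = -c: the cross derivatives should agree.
gap_rotation_scaling = mixed_partial_gap(((2.0, 3.0), (-3.0, 2.0)), *point)

# B = A A^T with A = (1, 2)^T: symmetric, rank one, cross derivatives differ.
gap_rank_one = mixed_partial_gap(((1.0, 2.0), (2.0, 4.0)), *point)

print(gap_rotation_scaling, gap_rank_one)
```

A gap near zero is consistent with a local potential; a gap far from zero rules one out at that point.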

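EDIT. The general solution of the above equation, in polar coordinates $(r,\theta)$, is: $g=2a\log(r)-2b\theta+$ constant. Here $r=\sqrt{x^2+y^2}$ is the radius (not the number of columns of $X$), and $\theta$ is only defined locally: for $b\neq 0$, $g$ is multivalued on $\mathbb{R}^2\setminus\{0\}$.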
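The polar-coordinate solution can be checked in the same spirit. This is a minimal sketch with the standard library only, assuming the branch $\theta=\operatorname{atan2}(y,x)$ (one choice among many); the names `potential`, `target_field`, and `numeric_gradient` and the sample values are hypothetical.

```python
import math

def potential(a, b, x, y):
    """g = 2 a log(r) - 2 b theta, with theta = atan2(y, x)."""
    return 2.0 * a * math.log(math.hypot(x, y)) - 2.0 * b * math.atan2(y, x)

def target_field(a, b, x, y):
    """2 B X / (x^2 + y^2) with B = [[a, b], [-b, a]]."""
    s = x * x + y * y
    return 2.0 * (a * x + b * y) / s, 2.0 * (-b * x + a * y) / s

def numeric_gradient(a, b, x, y, h=1e-6):
    """Central-difference gradient of the potential."""
    gx = (potential(a, b, x + h, y) - potential(a, b, x - h, y)) / (2.0 * h)
    gy = (potential(a, b, x, y + h) - potential(a, b, x, y - h)) / (2.0 * h)
    return gx, gy

a, b = 2.0, 3.0
x, y = 0.7, -1.3  # away from the branch cut on the negative x-axis

gx, gy = numeric_gradient(a, b, x, y)
tx, ty = target_field(a, b, x, y)
gradient_error = max(abs(gx - tx), abs(gy - ty))

# Crossing the negative x-axis, atan2 jumps by 2*pi, so g jumps by 4*pi*b.
jump = potential(a, b, -1.0, -1e-9) - potential(a, b, -1.0, 1e-9)

print(gradient_error, jump)
```

The jump across the cut shows why the formula gives a single-valued function on the whole punctured plane only when $b=0$.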

Thank you! Sorry for the late acceptance. — Wuchen, Oct 23 '18 at 19:02
