I learned from The Matrix Cookbook that the gradient of the $\log \det$ function is given by

\begin{equation} \nabla \log \text{det}(\mathbf{X}^\top \mathbf{X})=2\mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1}, \end{equation}

where $\mathbf{X}\in\mathbb{R}^{n\times r}$. I would like to know which function has the gradient

\begin{equation} 2\mathbf{A} \mathbf{A}^\top \mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1}, \end{equation}

for some matrix $\mathbf{A}\in \mathbb{R}^{n\times r}$.
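As a sanity check on the first identity, here is a minimal finite-difference sketch in plain Python. It assumes $r=2$ so that the determinant and inverse of $\mathbf{X}^\top \mathbf{X}$ can be written out by hand; the test matrix and the helper names (`gram`, `logdet_gram_2col`, `claimed_gradient`, `numeric_gradient`) are illustrative choices, not part of the original question.

```python
import math

def gram(X):
    """Return the r x r Gram matrix X^T X for X given as a list of n rows."""
    r = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(r)] for i in range(r)]

def logdet_gram_2col(X):
    """log det(X^T X) for an n x 2 matrix X (2 x 2 determinant written out)."""
    G = gram(X)
    return math.log(G[0][0] * G[1][1] - G[0][1] * G[1][0])

def claimed_gradient(X):
    """The claimed gradient 2 X (X^T X)^{-1} for an n x 2 matrix X."""
    G = gram(X)
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    Ginv = [[G[1][1] / det, -G[0][1] / det],
            [-G[1][0] / det, G[0][0] / det]]
    return [[2 * sum(row[k] * Ginv[k][j] for k in range(2)) for j in range(2)]
            for row in X]

def numeric_gradient(f, X, h=1e-6):
    """Central finite differences of f with respect to each entry of X."""
    grad = []
    for i in range(len(X)):
        grad_row = []
        for j in range(len(X[0])):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            grad_row.append((f(Xp) - f(Xm)) / (2 * h))
        grad.append(grad_row)
    return grad

# Arbitrary full-rank 4 x 2 test matrix (an assumption made for illustration).
X = [[1.0, 0.5], [-0.3, 2.0], [0.7, -1.1], [0.2, 0.4]]
numeric = numeric_gradient(logdet_gram_2col, X)
claimed = claimed_gradient(X)
error = max(abs(p - q) for rp, rq in zip(numeric, claimed) for p, q in zip(rp, rq))
print(error)
```

The printed value is the largest entrywise gap between the numerical gradient and $2\mathbf{X}(\mathbf{X}^\top \mathbf{X})^{-1}$; it should be at the level of finite-difference noise.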

Wuchen
  • The gradient of a function is a vector, not a matrix. Your first equation does not make sense to me. – Dog_69 Aug 04 '18 at 21:29
  • Hi, I'm arranging it in a matrix form. It is similar to the fact that the gradient of $\frac{1}{2}\|\mathbf{X}\|_{\mathrm{F}}^2$ is $\mathbf{X}$. – Wuchen Aug 04 '18 at 22:18
  • In this case the $r$ of $\mathbb R^{n\times r}$ should be $1$. – Dog_69 Aug 05 '18 at 05:04
  • Where does this constraint come from? – Wuchen Aug 06 '18 at 14:49
  • Because vectors belong to $\mathbb R^n$. You can arrange an $n$-tuple as a column (or row) matrix, but then you get an $n\times 1$ ($1\times n$) matrix. – Dog_69 Aug 06 '18 at 15:13
  • @Dog_69 See https://math.stackexchange.com/questions/2807864/derivative-of-the-trace-of-the-product-of-a-matrix-and-its-transpose/2809102#2809102 – Jean-Claude Arbaut Aug 07 '18 at 06:03
  • @Jean-Claude I hadn't seen that before. Thanks. – Dog_69 Aug 07 '18 at 11:22

1 Answer

In general, there is no solution. What follows is a counterexample to the existence of a function $g$ such that $\nabla g=2BX(X^TX)^{-1}$ for every $X$ with $\operatorname{rank}(X)=r$ (where $B\in M_n$).

Let $n=2,r=1, B=\begin{pmatrix}a&b\\c&d\end{pmatrix},X=[x,y]^T\not= [0,0]^T$.

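Written componentwise, the condition $\nabla g=2BX(X^TX)^{-1}$ reads $\dfrac{\partial g}{\partial x}=\dfrac{2}{x^2+y^2}(ax+by),\ \dfrac{\partial g}{\partial y}=\dfrac{2}{x^2+y^2}(cx+dy)$. Differentiating the first with respect to $y$ and the second with respect to $x$ gives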

$\dfrac{\partial}{\partial y}\dfrac{\partial g}{\partial x}=\dfrac{2}{(x^2+y^2)^2}(b(x^2-y^2)-2axy),\ \dfrac{\partial}{\partial x}\dfrac{\partial g}{\partial y}=\dfrac{2}{(x^2+y^2)^2}(c(y^2-x^2)-2dxy)$.

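Since the two mixed partial derivatives must agree for all $(x,y)\neq(0,0)$, $g$ exists (locally) iff $a=d$, $b=-c$, that is, $B=\begin{pmatrix}a&b\\-b&a\end{pmatrix}$. In particular, $B$ never has the form $AA^T$ unless $B=0$: $AA^T$ is symmetric, which forces $b=0$ and $B=aI$, and $AA^T$ has rank at most $r=1$, which forces $a=0$.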
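The compatibility condition can also be checked numerically. The following is a minimal sketch in plain Python, assuming $n=2$, $r=1$ as in the counterexample; the function names, the sample point, and the two test matrices are illustrative choices. It compares the two mixed partial derivatives by central differences, once for a matrix of the form $\begin{pmatrix}a&b\\-b&a\end{pmatrix}$ and once for an outer product $AA^T$.

```python
def field(B, x, y):
    """Candidate gradient 2 B X / (X^T X) at X = (x, y); B is a 2 x 2 nested list."""
    s = x * x + y * y
    p = 2 * (B[0][0] * x + B[0][1] * y) / s  # would be dg/dx
    q = 2 * (B[1][0] * x + B[1][1] * y) / s  # would be dg/dy
    return p, q

def mixed_partial_gap(B, x, y, h=1e-5):
    """d/dy of the first component minus d/dx of the second, by central differences."""
    dp_dy = (field(B, x, y + h)[0] - field(B, x, y - h)[0]) / (2 * h)
    dq_dx = (field(B, x + h, y)[1] - field(B, x - h, y)[1]) / (2 * h)
    return dp_dy - dq_dx

def outer(a1, a2):
    """A A^T for the column vector A = (a1, a2)."""
    return [[a1 * a1, a1 * a2], [a2 * a1, a2 * a2]]

point = (0.8, -0.6)  # any point other than the origin would do

# B = [[a, b], [-b, a]] with a = 1.5, b = 0.7: the mixed partials should agree.
gap_rotation_like = mixed_partial_gap([[1.5, 0.7], [-0.7, 1.5]], *point)

# B = A A^T with A = (1, 2)^T: the mixed partials should disagree.
gap_outer_product = mixed_partial_gap(outer(1.0, 2.0), *point)

print(gap_rotation_like, gap_outer_product)
```

A gap near zero in the first case and a clearly nonzero gap in the second is consistent with the condition $a=d$, $b=-c$. A check at a single point can only refute existence, not prove it.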

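EDIT. In polar coordinates $(r,\theta)$, the general solution of the above equation when $a=d$, $b=-c$ is $g=2a\log(r)-2b\theta+\text{constant}$. Here $r=\sqrt{x^2+y^2}$ denotes the radius, not the number of columns of $X$. Since $\theta$ is multivalued, this $g$ is single-valued on the whole punctured plane only when $b=0$.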
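To confirm the formula, one can differentiate it directly. The short derivation below assumes $r=\sqrt{x^2+y^2}$ and a local branch of the polar angle $\theta$, for which the standard partial derivatives are

\begin{equation}
\frac{\partial \log r}{\partial x}=\frac{x}{r^2},\qquad
\frac{\partial \log r}{\partial y}=\frac{y}{r^2},\qquad
\frac{\partial \theta}{\partial x}=-\frac{y}{r^2},\qquad
\frac{\partial \theta}{\partial y}=\frac{x}{r^2}.
\end{equation}

Applying these to $g=2a\log(r)-2b\theta$ gives

\begin{equation}
\begin{aligned}
\frac{\partial g}{\partial x}&=\frac{2ax}{r^2}+\frac{2by}{r^2}=\frac{2}{x^2+y^2}(ax+by),\\
\frac{\partial g}{\partial y}&=\frac{2ay}{r^2}-\frac{2bx}{r^2}=\frac{2}{x^2+y^2}(-bx+ay),
\end{aligned}
\end{equation}

which matches the required partial derivatives with $c=-b$ and $d=a$.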