
Suppose $x \in \mathbb R^d$, $W \in \mathbb R^{d\times c}$, and $L \in \mathbb R^{c \times c}$ is diagonal. How do I differentiate $x^TWLW^Tx$ with respect to $W$?

EKM
    As usual: compute $f(W+tH)$ and check the limit of $(f(W+tH)-f(W))/t$ when $t\to0$. – Did Oct 08 '16 at 11:39
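
Spelling out that suggestion, as a sketch: write $f(W) = x^TWLW^Tx$, let $H \in \mathbb R^{d\times c}$ be an arbitrary direction, and assume the inner product $\langle A, B \rangle = \operatorname{tr}(A^TB)$ (the symbols $f$, $H$ and this inner product are not fixed by the question). Expanding in $t$ gives

$$f(W+tH) = f(W) + t\,\bigl(x^THLW^Tx + x^TWLH^Tx\bigr) + t^2\, x^THLH^Tx$$

so that

$$\lim_{t\to0}\frac{f(W+tH)-f(W)}{t} = x^THLW^Tx + x^TWLH^Tx = 2\,x^THLW^Tx = \langle H,\, 2xx^TWL \rangle$$

The middle equality uses $L = L^T$: the scalar $x^TWLH^Tx$ equals its own transpose $x^THL^TW^Tx$. Reading off the matrix paired with $H$ suggests the gradient $2xx^TWL$, which has the same shape as $W$.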

2 Answers


Let $f : \mathbb R^{m \times n} \to \mathbb R$ be defined by

$$f (\mathrm X) := \mathrm a^\top \mathrm X \mathrm B \mathrm X^\top \mathrm a$$

where the vector $\mathrm a \in \mathbb R^m$ and the symmetric matrix $\mathrm B \in \mathbb R^{n \times n}$ are given. Taking the differential,

$$\begin{array}{rl} \mathrm d f &= \mathrm a^\top (\mathrm d \mathrm X) \mathrm B \mathrm X^\top \mathrm a + \mathrm a^\top \mathrm X \mathrm B (\mathrm d \mathrm X)^\top \mathrm a\\ &= \mbox{tr} ( \mathrm a^\top (\mathrm d \mathrm X) \mathrm B \mathrm X^\top \mathrm a) + \mbox{tr} ( \mathrm a^\top \mathrm X \mathrm B (\mathrm d \mathrm X)^\top \mathrm a)\\ &= \mbox{tr} ( \mathrm B \mathrm X^\top \mathrm a\mathrm a^\top (\mathrm d \mathrm X) ) + \mbox{tr} ( (\mathrm d \mathrm X)^\top \mathrm a \mathrm a^\top \mathrm X \mathrm B)\\ &= \langle \mathrm a \mathrm a^\top \mathrm X \mathrm B^\top, \mathrm d \mathrm X \rangle + \langle \mathrm d \mathrm X, \mathrm a \mathrm a^\top \mathrm X \mathrm B \rangle\\ &= \langle \mathrm d \mathrm X, \mathrm a \mathrm a^\top \mathrm X (\mathrm B + \mathrm B^\top) \rangle\\ &= \langle \mathrm d \mathrm X, \color{blue}{2 \mathrm a \mathrm a^\top \mathrm X \mathrm B} \rangle\end{array}$$

and, thus, the gradient of $f$ is

$$\nabla f (\mathrm X) = 2 \mathrm a \mathrm a^\top \mathrm X \mathrm B$$
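
A formula like this can be sanity-checked numerically. The following is only a sketch, assuming NumPy is available; the helper names `f`, `grad_f` and `numerical_grad`, the sizes, and the random test data are made up for illustration. It compares the closed form $2 \mathrm a \mathrm a^\top \mathrm X \mathrm B$ with entrywise central differences, using a diagonal $\mathrm B$ as in the question.

```python
import numpy as np

def f(X, a, B):
    """Quadratic form a^T X B X^T a."""
    return a @ X @ B @ X.T @ a

def grad_f(X, a, B):
    """Closed-form gradient 2 a a^T X B (assumes B is symmetric)."""
    return 2.0 * np.outer(a, a) @ X @ B

def numerical_grad(func, X, eps=1e-6):
    """Central-difference gradient, perturbing one entry of X at a time."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (func(X + E) - func(X - E)) / (2 * eps)
    return G

# Hypothetical test data: sizes and seed are arbitrary.
rng = np.random.default_rng(0)
m, n = 4, 3
a = rng.standard_normal(m)
X = rng.standard_normal((m, n))
B = np.diag(rng.standard_normal(n))  # diagonal, hence symmetric

analytic = grad_f(X, a, B)
numeric = numerical_grad(lambda Z: f(Z, a, B), X)
max_abs_err = np.max(np.abs(analytic - numeric))
print(max_abs_err)  # expected to be tiny (rounding error only)
```

Because $f$ is quadratic in $\mathrm X$, central differences have no truncation error here, so any remaining discrepancy should come from floating-point rounding alone.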



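Using the identity $\frac{\partial}{\partial X} \operatorname{tr}(A X B X^T C) = B X^T C A + B^T X^T A^T C^T$ with $A = x^T$, $X = W$, $B = L$ and $C = x$, and noting that a scalar equals its own trace, $$\frac{\partial}{\partial W} x^T W L W^T x = \frac{\partial}{\partial W} \operatorname{tr}(x^T W L W^T x) = L W^T x x^T + L^T W^T x x^T = 2 L W^T x x^T,$$ where the last step uses that $L$ is diagonal, so $L^T = L$. Note that this identity is stated in the layout where the derivative has the shape of $X^T$: the result is $c \times d$, and its transpose $2 x x^T W L$ is the gradient with the same shape as $W$.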
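
The layout question can be made concrete with a small numerical sketch, assuming NumPy; the dimensions, the seed, and the variable names `from_identity` and `gradient` are illustrative choices, not part of the answer. It evaluates the expression obtained from the trace identity and compares its transpose with $2 x x^T W L$.

```python
import numpy as np

# Hypothetical test data matching the question's shapes.
rng = np.random.default_rng(1)
d, c = 5, 3
x = rng.standard_normal(d)
W = rng.standard_normal((d, c))
L = np.diag(rng.standard_normal(c))  # diagonal c x c matrix

xxT = np.outer(x, x)

# Expression produced by the trace identity; its shape is c x d.
from_identity = L @ W.T @ xxT + L.T @ W.T @ xxT

# Gradient written with the same shape as W (d x c).
gradient = 2.0 * xxT @ W @ L

shapes = (from_identity.shape, gradient.shape)
is_transpose = np.allclose(from_identity.T, gradient)
print(shapes, is_transpose)
```

The two expressions carry the same information; which one is called "the derivative" depends on the layout convention being used.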

Ahmad Bazzi