I need to find the derivative with respect to $X$ of:

$$ f({X}) = \operatorname{tr}( B^{T}X^{T}A^{T}CXD )$$

To make this simpler, I let $E = A^{T}C$, so that:

$$ f({X}) = \operatorname{tr}( B^{T}X^{T}EXD ) = \operatorname{tr}( DB^{T}X^{T}EX )$$

We can rewrite this using the Frobenius inner product:

$$ f({X}) = XBD^{T}:EX$$

So we can calculate the differential easily now:

$$ df = (dX)BD^{T}:EdX$$

Unfortunately this is where I am stuck.


Perhaps we can rewrite this as:

$$ df = BD^{T}:(dX)^{T} E dX$$

But how do we isolate the dX value?


Edits:

I tried: $(dX)^{T} E dX = \operatorname{tr}( dXdX^{T}E) = \operatorname{tr}( EdXdX^{T}) $ but that isn't getting me anywhere either...

1 Answer

You get two terms, one for each factor involving $X$, and you can manipulate the trace for each one separately. Apparently you know that the derivative of $\operatorname{tr}(Y^\top X)$ with respect to $X$ is $Y$. For the first term, hold the factor $X^\top$ fixed and apply this to $\operatorname{tr}(DB^\top X^\top EX)$ with $Y^\top=DB^\top X^\top E$ to obtain $E^\top X BD^\top$. For the second term, use the invariance of the trace under cyclic permutation and under transposition to write $\operatorname{tr}(DB^\top X^\top EX)=\operatorname{tr}(X^\top EXDB^\top)=\operatorname{tr}(BD^\top X^\top E^\top X)$; holding the factor $X^\top$ in this last form fixed, so that $Y^\top=BD^\top X^\top E^\top$, gives the second term $EXDB^\top$. Thus the derivative is $E^\top X BD^\top+EXDB^\top$.
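As a sanity check on the result above, here is a minimal NumPy sketch that compares the closed-form gradient $E^\top XBD^\top + EXDB^\top$ with a central finite-difference approximation. The matrix sizes, the random seed, and the helper names `f`, `grad_f` and `numerical_grad` are illustrative assumptions, not part of the original question.

```python
import numpy as np

# Hypothetical sizes: X is m x n, A and C are k x m, B and D are n x q.
rng = np.random.default_rng(0)
m, n, k, q = 4, 3, 5, 2
A = rng.standard_normal((k, m))
C = rng.standard_normal((k, m))
B = rng.standard_normal((n, q))
D = rng.standard_normal((n, q))
X = rng.standard_normal((m, n))


def f(X, A, B, C, D):
    """f(X) = tr(B^T X^T A^T C X D)."""
    return np.trace(B.T @ X.T @ A.T @ C @ X @ D)


def grad_f(X, A, B, C, D):
    """Closed-form gradient E^T X B D^T + E X D B^T with E = A^T C."""
    E = A.T @ C
    return E.T @ X @ B @ D.T + E @ X @ D @ B.T


def numerical_grad(func, X, h=1e-5):
    """Entry-by-entry central finite differences of a scalar function."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            step = np.zeros_like(X)
            step[i, j] = h
            G[i, j] = (func(X + step) - func(X - step)) / (2 * h)
    return G


analytic = grad_f(X, A, B, C, D)
numeric = numerical_grad(lambda Z: f(Z, A, B, C, D), X)
print("max abs difference:", np.max(np.abs(analytic - numeric)))
```

Since $f$ is quadratic in $X$, the central difference has no truncation error, so any remaining discrepancy should come from floating-point rounding only.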

joriki
  • Ah okay, so I missed the two-term part... I am not quite sure how to get that out yet, though, i.e. where the two terms come from. – the_src_dude Dec 13 '19 at 16:01
  • @the_src_dude: It's basically the product rule. If you take the derivative of $d\cdot b\cdot x\cdot e\cdot x$ with respect to $x$ for scalars, you likewise get $d\cdot b\cdot e\cdot x+d\cdot b\cdot x\cdot e$. The only difference is that you need to manipulate the trace in two different ways to see what exactly the resulting terms should be. – joriki Dec 13 '19 at 18:20
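
The product rule mentioned in this comment can also be written out in the Frobenius-product notation used in the question. The following is a sketch of that route, with $E = A^{T}C$ as in the question; it relies only on the standard identities $P:QR = Q^{T}P:R = PR^{T}:Q$ for the Frobenius product. The differential of $f(X) = XBD^{T}:EX$ picks up one term per occurrence of $X$:

$$ df = (dX)BD^{T}:EX + XBD^{T}:E\,dX $$

Moving the constant factors to the other side of each product isolates $dX$:

$$ df = EXDB^{T}:dX + E^{T}XBD^{T}:dX = \left(EXDB^{T} + E^{T}XBD^{T}\right):dX $$

Reading off the factor paired with $dX$ gives

$$ \frac{\partial f}{\partial X} = E^{T}XBD^{T} + EXDB^{T} $$

which matches the result in the answer.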