
I just learned about derivatives whose results can be expressed as matrices.

*(image: simple matrix derivatives)*

Those types of derivatives are useful; I can find plenty of examples in machine learning and optimization theory. But I have also heard there are other types of matrix derivatives, such as the matrix-by-matrix derivative, like this one. Basically, to compute this type of derivative, we take the differential first and then vectorize it. It can be viewed as a collection of element-wise scalar derivatives, and it can seemingly be interpreted at a higher level using tensor algebra.

What I'm curious about is whether there are specific, meaningful applications of this technique. I tried to search online but found nothing beyond basic examples, which exist only to illustrate how the derivatives are computed. For example, the derivative of $F=AX$, where $A$ and $X$ are matrices, can be worked out as

$$ \begin{gather*} dF = d(AX) = Ad(X)\\ \mathrm{vec}(dF) = \mathrm{vec}(Ad(X)) = (I_n\otimes A)\mathrm{vec}(dX) \end{gather*} $$
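The identity above is easy to check numerically. Here is a minimal NumPy sketch (the matrix sizes are arbitrary choices for illustration; `vec` is the column-stacking vectorization, matching the convention in the formula):

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, n = 3, 4, 2                        # A is m×p, X (and dX) is p×n
A = rng.standard_normal((m, p))
dX = rng.standard_normal((p, n))         # an arbitrary perturbation of X

def vec(M):
    # column-stacking vectorization, as used in the identity
    return M.reshape(-1, order="F")

lhs = vec(A @ dX)                        # vec(dF) = vec(A dX)
rhs = np.kron(np.eye(n), A) @ vec(dX)    # (I_n ⊗ A) vec(dX)
print(np.allclose(lhs, rhs))             # True
```

Since $dF = A\,dX$ is linear in $dX$, the matrix $I_n \otimes A$ is exactly the Jacobian of $\mathrm{vec}(F)$ with respect to $\mathrm{vec}(X)$.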

But this example doesn't seem to have a realistic interpretation, which is what I want to find. Could you please give me some meaningful examples that illustrate the use and significance of the matrix-by-matrix derivative?

  • Hi, could you please elaborate on what you mean by "matrix-by-matrix" derivative? – NDewolf Nov 30 '24 at 09:24
  • @NDewolf You can refer to this post. Basically, it takes the scalar-by-scalar derivative of each element and assembles them in a certain way. But actually I don't know the formal mathematical definition. – silverxz Nov 30 '24 at 09:36
  • Actually, the problem is usually circumvented by working 1° component-wise, 2° within the vectorization formalism or 3° with matrix differentials. – Abezhiko Nov 30 '24 at 09:56
  • @Abezhiko Yeah, working component-wise is crucial. But if I want to take derivatives component-wise rather than element-wise, I'll need to learn some advanced topics beyond basic linear algebra, such as the Fréchet derivative and possibly tensor products. Actually, I don't know anything about tensors at all. I'm just seeking particular examples or deeper insights to decide whether to learn this area. – silverxz Nov 30 '24 at 10:26
  • @silverxz You do not really need the Fréchet derivative in this case. Even tensor products are not really necessary. Componentwise derivatives of matrices are just derivatives from multivariable calculus. – NDewolf Dec 02 '24 at 22:11

0 Answers