
I have this simple function $Z=f(Q)=AQ$, where $A$ is a $1\times n$ matrix and $Q$ is an $n\times m$ matrix of variables. I want to calculate the derivative of $Z$ with respect to $Q$. (In the reference that I read, the answer is $A^T$.)

This equation is part of the neural network backpropagation algorithm, and I really want to understand where this answer comes from.

Perhaps my question is not formulated well, because mathematics is not my main interest.

user3658307

1 Answer


Your question is somewhat ill-defined, because the derivative of a vector with respect to a matrix is (often) considered to be a 3D array. See here or here.

But it can be made sense of in the following way. Let $Z=AQ$ where $A\in \mathbb{R}^{1\times n}$, $Q\in \mathbb{R}^{n\times m}$, and so $Z\in \mathbb{R}^{1\times m}$. Writing $A_\ell$ and $Z_k$ for the entries of the row vectors $A$ and $Z$, we have $Z_k=\sum_\ell A_\ell Q_{\ell k}$, so the components of the derivative look like

$$ \Lambda^k_{ij} = \frac{\partial Z_k}{\partial Q_{ij}} = \frac{\partial}{\partial Q_{ij}} \sum_\ell A_\ell Q_{\ell k} = \sum_\ell A_\ell\delta_{i\ell}\delta_{jk} = A_i\delta_{jk} $$

Now notice that all the components are zero except when $j=k$. Notice also that, once we set $j=k$, the value $A_i$ no longer depends on $j$ or $k$. So let's ignore the zero components and consider, for any fixed $k$ (no summation over $k$ is implied),

$$ \Psi_i= \Lambda^k_{ik} = \frac{\partial Z_k}{\partial Q_{ik}} = \sum_\ell A_\ell \delta_{i\ell} =A_i $$

If we collect these into a column vector $\Psi\in\mathbb{R}^{n\times 1}$, then we can define

$$ \Psi := \frac{\partial Z}{\partial Q} =A^T $$
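As a sanity check, here is a minimal NumPy sketch of the derivation above. It builds the full 3D array $\Lambda^k_{ij}$ by finite differences, compares it with $A_i\delta_{jk}$, and confirms that every $j=k$ slice equals $A$. The sizes `n`, `m`, the random entries, and the upstream gradient `g` (standing in for $\partial L/\partial Z$ of some scalar loss $L$) are assumptions made purely for illustration; they are not part of the original question.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 4                        # illustrative sizes (assumption)
A = rng.standard_normal((1, n))    # fixed 1 x n row vector
Q = rng.standard_normal((n, m))    # n x m matrix of variables


def f(Q):
    """Z = A Q, a 1 x m row vector."""
    return A @ Q


# Finite-difference Jacobian: Lam[k, i, j] approximates dZ_k / dQ_ij.
eps = 1e-6
Lam = np.zeros((m, n, m))
for i in range(n):
    for j in range(m):
        dQ = np.zeros_like(Q)
        dQ[i, j] = eps
        Lam[:, i, j] = (f(Q + dQ) - f(Q - dQ))[0] / (2 * eps)

# Closed form from the derivation: Lam[k, i, j] = A_i * delta_jk.
Lam_exact = np.einsum("i,jk->kij", A[0], np.eye(m))
jacobian_matches = np.allclose(Lam, Lam_exact, atol=1e-6)

# Every j = k slice is the same vector, namely A (i.e. Psi = A^T as a column).
psi_matches = all(np.allclose(Lam[k, :, k], A[0], atol=1e-6) for k in range(m))

# Chain rule with a hypothetical upstream gradient g = dL/dZ (1 x m):
# dL/dQ_ij = sum_k g_k * Lam[k, i, j], which collapses to (A^T g)_ij.
g = rng.standard_normal((1, m))
grad_Q = np.einsum("k,kij->ij", g[0], Lam_exact)
backprop_matches = np.allclose(grad_Q, A.T @ g)

print(jacobian_matches, psi_matches, backprop_matches)
```

The last check is probably why the reference states the answer as $A^T$: contracting the 3D array with an upstream gradient gives $\partial L/\partial Q = A^T\,\partial L/\partial Z$, an $n\times m$ matrix with the same shape as $Q$.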

user3658307