The matrix-by-matrix derivative is defined componentwise. For a matrix differentiated with respect to itself, it is:
$$ \frac{\partial X_{kl}}{\partial X_{ij}}=\delta_{ik}\delta_{lj} $$
If the matrices are $n\times n$, then the resulting matrix will be $n^2 \times n^2$.
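To make the size claim concrete, here is a minimal NumPy sketch of the definition above. It assumes the four indices are stored as a tensor `D[k, l, i, j]` and then flattened row-major, with $(k,l)$ as the row index and $(i,j)$ as the column index. The function name `self_derivative` and this flattening order are my own choices for illustration, not part of any standard convention.

```python
import numpy as np

def self_derivative(n):
    """Return D with D[k, l, i, j] = delta_ik * delta_lj, i.e. dX_kl / dX_ij."""
    eye = np.eye(n)
    # eye[k, i] supplies delta_ik and eye[l, j] supplies delta_lj
    return np.einsum("ki,lj->klij", eye, eye)

n = 3
D = self_derivative(n)            # four-index object of shape (n, n, n, n)
D_flat = D.reshape(n * n, n * n)  # rows indexed by (k, l), columns by (i, j)
```

Under this particular flattening, `D_flat` is the $n^2 \times n^2$ identity matrix. A different ordering of the index pairs would give a permuted version of it.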
Is the following identity valid for the matrix-by-matrix derivative?
$$ \frac{\partial}{\partial A} AB = \frac{\partial A}{\partial A} B + A\frac{\partial B}{\partial A} $$
If so, I do not understand how an $n^2 \times n^2$ matrix can be multiplied by an $n \times n$ matrix:
$$ \frac{\partial}{\partial A} AB = \underbrace{\frac{\partial A}{\partial A}}_{n^2\times n^2} \overbrace{B}^{n\times n} + \overbrace{A}^{n\times n} \underbrace{\frac{\partial B}{\partial A}}_{n^2 \times n^2} $$
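One way to probe the dimension mismatch numerically is sketched below. It rests on two assumptions: $B$ does not depend on $A$, so the second term vanishes, and the first term is read as a contraction of index objects, $\partial (AB)_{kl} / \partial A_{ij} = \delta_{ik} B_{jl}$, rather than as a literal product of an $n^2 \times n^2$ matrix with an $n \times n$ matrix. The function names are hypothetical, and the finite-difference routine is only a sanity check, not a definition.

```python
import numpy as np

def product_derivative_constant_B(B):
    """T[k, l, i, j] = d(AB)_kl / dA_ij = delta_ik * B_jl, assuming B is independent of A."""
    n = B.shape[0]
    return np.einsum("ki,jl->klij", np.eye(n), B)

def finite_difference_derivative(A, B, eps=1e-6):
    """Approximate d(AB)_kl / dA_ij by perturbing one entry of A at a time."""
    n = A.shape[0]
    T = np.zeros((n, n, n, n))
    for i in range(n):
        for j in range(n):
            dA = np.zeros_like(A)
            dA[i, j] = eps
            # A -> AB is linear in A, so the forward difference is exact up to rounding
            T[:, :, i, j] = ((A + dA) @ B - A @ B) / eps
    return T

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
agree = np.allclose(product_derivative_constant_B(B),
                    finite_difference_derivative(A, B), atol=1e-5)
```

If `agree` is true, the componentwise reading is consistent for this case. It does not by itself settle which $n^2 \times n^2$ matrix convention, if any, turns the identity into an ordinary matrix product.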