I am following the book *Mathematics for Machine Learning* to study the math necessary to understand machine learning papers. On page 158, the authors list these results about gradients of scalar fields with respect to matrices and vectors:
$$ \frac{\partial \mathbf{a}^T\mathbf{x}}{\partial \mathbf{x}} = \mathbf{a}^T \tag{1} $$
$$ \frac{\partial \mathbf{a}^T\mathbf{X}\mathbf{b}}{\partial \mathbf{X}} = \mathbf{a}\mathbf{b}^T \tag{2} $$
The book follows numerator layout.
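To show my work: writing equation (2) entrywise, the value of each partial derivative is clear to me,

$$ \frac{\partial}{\partial X_{ij}}\,\mathbf{a}^T\mathbf{X}\mathbf{b} = \frac{\partial}{\partial X_{ij}} \sum_{k=1}^{n}\sum_{l=1}^{m} a_k X_{kl} b_l = a_i b_j, $$

so collecting these partials into an array indexed by $(i, j)$ naturally gives something $n \times m$, namely $\mathbf{a}\mathbf{b}^T$.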
What confuses me is the dimensions of the two results taken together. Assume $\mathbf{a} \in \Bbb R^n$, $\mathbf{x} \in \Bbb R^n$, $\mathbf{X} \in \Bbb R^{n\times m}$, and $\mathbf{b} \in \Bbb R^m$. In equation (1) the gradient $\mathbf{a}^T$ is $1 \times n$, the transpose of the shape of $\mathbf{x}$, which is exactly what I expect from numerator layout. Yet in equation (2) the gradient $\mathbf{a}\mathbf{b}^T$ is $n \times m$, the same shape as $\mathbf{X}$ itself rather than the transposed $m \times n$ shape. Why is that?
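To double-check the values numerically, here is a small sketch (assuming JAX; the sizes `n`, `m` and the random seed are just for illustration). Note that `jax.grad` returns an array with the same shape as its input, so it cannot distinguish the row vector $\mathbf{a}^T$ from the column vector $\mathbf{a}$ in equation (1), but it does confirm the values in both identities and the $n \times m$ shape in equation (2):

```python
import jax
import jax.numpy as jnp

n, m = 3, 2
k1, k2, k3, k4 = jax.random.split(jax.random.PRNGKey(0), 4)
a = jax.random.normal(k1, (n,))
x = jax.random.normal(k2, (n,))
X = jax.random.normal(k3, (n, m))
b = jax.random.normal(k4, (m,))

# Equation (1): gradient of the scalar a^T x with respect to x.
# jax.grad returns an array shaped like x, i.e. (n,); numerator
# layout would write it as the 1 x n row vector a^T.
g1 = jax.grad(lambda x: a @ x)(x)
print(g1.shape, jnp.allclose(g1, a))                  # (3,) True

# Equation (2): gradient of the scalar a^T X b with respect to X.
# The result has the same shape as X, i.e. n x m, and equals a b^T.
g2 = jax.grad(lambda X: a @ X @ b)(X)
print(g2.shape, jnp.allclose(g2, jnp.outer(a, b)))    # (3, 2) True
```

Both identities check out numerically, so my question is purely about the layout bookkeeping.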