How do you prove the following fact about the transpose of a product of matrices? Also can you give some intuition as to why it is so.
$(AB)^T = B^TA^T$
Here's an alternative argument. The main importance of the transpose (and this in fact defines it) is the formula $$Ax\cdot y = x\cdot A^\top y.$$ (If $A$ is $m\times n$, then $x\in \Bbb R^n$, $y\in\Bbb R^m$, the left dot product is in $\Bbb R^m$ and the right dot product is in $\Bbb R^n$.)
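Now note that $$(AB)x\cdot y = A(Bx)\cdot y = Bx\cdot A^\top y = x\cdot B^\top(A^\top y) = x\cdot (B^\top A^\top)y.$$ Since this holds for all $x$ and $y$, and the transpose of $AB$ is the unique matrix $C$ satisfying $(AB)x\cdot y = x\cdot Cy$, we conclude that $(AB)^\top = B^\top A^\top$.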
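As a numerical sanity check of the chain of equalities above, here is a minimal sketch in plain Python. The helper names (`transpose`, `matvec`, `matmul`, `dot`) and the sample matrices are illustrative assumptions, not part of the argument; integer entries are used so the comparisons are exact.

```python
# Illustrative helpers; matrices are stored as lists of rows.

def transpose(M):
    """Return the transpose of M."""
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    """Multiply the matrix M by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(M, N):
    """Multiply the matrices M and N."""
    return [[sum(m * n for m, n in zip(row, col)) for col in zip(*N)]
            for row in M]

def dot(u, v):
    """Ordinary dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

# Sample data (an assumption for illustration): A is 2x3, B is 3x2.
A = [[1, 2, 0], [3, -1, 4]]
B = [[2, 1], [0, 5], [-3, 2]]
x = [1, -2]   # B acts on x
y = [4, 3]    # (AB)x lives in the same space as y

lhs = dot(matvec(matmul(A, B), x), y)                          # (AB)x . y
mid = dot(matvec(B, x), matvec(transpose(A), y))               # Bx . A^T y
rhs = dot(x, matvec(matmul(transpose(B), transpose(A)), y))    # x . (B^T A^T) y

print(lhs, mid, rhs)  # the three values agree
```

A check on one pair $(x, y)$ is of course not a proof; it only illustrates how $A$ and then $B$ move across the dot product one at a time, which is why the order reverses.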
When you multiply $A$ and $B$, you are taking the dot product of each ROW of $A$ and each COLUMN of $B$.
The resulting dimension is $A_{\#row}\times B_{\#col}$, and after transposing, you have $B_{\#col}\times A_{\#row}$.
When you multiply $B^T$ and $A^T$, you take the dot product of each row of $B^T$ (a column of $B$) with each column of $A^T$ (a row of $A$).
Your resulting dimension is $B^T_{\#row}\times A^T_{\#col}$, which is just $B_{\#col}\times A_{\#row}$.
This formula ensures that each entry is correct, and that the dimensions are identical.
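To see the dimension bookkeeping concretely, here is a small sketch in plain Python with a $2\times 3$ matrix $A$ and a $3\times 4$ matrix $B$. The helper names and the sample entries are assumptions made for illustration only.

```python
# Illustrative helpers; matrices are stored as lists of rows.

def shape(M):
    """Return (number of rows, number of columns) of M."""
    return (len(M), len(M[0]))

def transpose(M):
    """Return the transpose of M."""
    return [list(col) for col in zip(*M)]

def matmul(M, N):
    """Multiply M and N, refusing incompatible shapes."""
    if shape(M)[1] != shape(N)[0]:
        raise ValueError("inner dimensions do not match")
    return [[sum(m * n for m, n in zip(row, col)) for col in zip(*N)]
            for row in M]

A = [[1, 2, 3], [4, 5, 6]]                      # 2 x 3
B = [[1, 0, 2, 1], [0, 1, 1, 0], [3, 1, 0, 2]]  # 3 x 4

AB_T = transpose(matmul(A, B))                  # transpose of a 2 x 4 matrix
BT_AT = matmul(transpose(B), transpose(A))      # (4 x 3) times (3 x 2)

print(shape(AB_T), shape(BT_AT))                # both are 4 x 2
print(AB_T == BT_AT)                            # the entries agree as well

# The "wrong" order does not even type-check here: A^T is 3 x 2 and
# B^T is 4 x 3, so the inner dimensions 2 and 4 do not match.
try:
    matmul(transpose(A), transpose(B))
    wrong_order_defined = True
except ValueError:
    wrong_order_defined = False
```

For non-square matrices like these, $B^TA^T$ is the only order in which the product of the transposes is defined at all, which is a quick way to remember the rule.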
I marked this as community wiki since it is so close to Saketh Malyala's answer.
For the intuition/background, please read this site answer.
We will now prove the assertion.
For any matrix $C$ let $\text{Row}(C,i)$ denote the $i^\text{th}$ row of $C$, represented in the natural way as a vector.
For any matrix $C$ let $\text{Col}(C,j)$ denote the $j^\text{th}$ column of $C$, represented in the natural way as a vector.
The $(i,j)^\text{th}$ entry of $AB$ is equal to $\langle \text{Row}(A,i), \text{Col}(B,j)\rangle$
The $(j,i)^\text{th}$ entry of $B^tA^t$ is equal to $\langle \text{Row}(B^t,j), \text{Col}(A^t,i)\rangle$
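But $\text{Row}(B^t,j) = \text{Col}(B,j)$ and $\text{Col}(A^t,i) = \text{Row}(A,i)$, so the $(j,i)^\text{th}$ entry of $B^tA^t$ equals the $(i,j)^\text{th}$ entry of $AB$, which is by definition the $(j,i)^\text{th}$ entry of ${(AB)}^t$. So indeed,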
$$ {(AB)}^t = B^t A^t$$
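The same argument can be written with explicit sums. The following is only a restatement in index notation, assuming $A$ is $m\times n$ and $B$ is $n\times p$, with $k$ running over the shared index $1,\dots,n$:

$$\big((AB)^t\big)_{ji} = (AB)_{ij} = \sum_{k=1}^{n} A_{ik}B_{kj} = \sum_{k=1}^{n} (B^t)_{jk}(A^t)_{ki} = (B^tA^t)_{ji}.$$

The middle step uses $A_{ik} = (A^t)_{ki}$ and $B_{kj} = (B^t)_{jk}$ together with commutativity of multiplication of the scalar entries.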
If you know about dual spaces and maps, a conceptual proof can be obtained by observing that $A^T$ corresponds to the dual map of $A$ and that taking the dual is contravariant with respect to composition. That is, $(T \circ S)^* = S^* \circ T^*$.
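The contravariance can be checked directly from the definition of the dual map. As a sketch, assume $S : U \to V$ and $T : V \to W$ are linear (the name $U$ is introduced here for illustration) and recall that $T^*(\varphi) = \varphi \circ T$ for $\varphi \in W^*$. Then

$$(T\circ S)^*(\varphi) = \varphi\circ(T\circ S) = (\varphi\circ T)\circ S = S^*(\varphi\circ T) = S^*\big(T^*(\varphi)\big) = (S^*\circ T^*)(\varphi).$$

Precomposing with $T \circ S$ means applying $S$ first, so $S^*$ is the last map to act on $\varphi$, and the order reverses.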
Let $T : V \rightarrow W$ be a linear map and let $(v_i)$ and $(w_i)$ be bases for $V$ and $W$ respectively. Let $A$ be the matrix for $T$ and $A'$ be the matrix for $T^*$ (with respect to the dual bases). It is enough to show that $A_{ij} = A'_{ji}$.
Well, $A_{ij} = w_i^*(T(v_j))$ and similarly $A'_{ji} = v_j^{**}(w_i^* \circ T)$, where $v_j^{**}$ denotes evaluation at $v_j$, so it is enough to show that $v_j^{**}(w_i^* \circ T) = w_i^*(T(v_j))$. But this calculation is very simple.
– Aniruddh Agarwal Jun 02 '19 at 20:32