
Here is the question I have difficulty with.

So not only do we have $AB\neq BA$ in general; the matrix products $AB$ and $BA$ can even be of different sizes.

Example 7: Let $C= \begin{pmatrix} 1 & 2 & 3 \\ 1 & 0 & 1 \\ -1 & 0 & -2 \end{pmatrix}$ and $D= \begin{pmatrix} -1\\-2\\-3 \end{pmatrix}$. Find (if possible) $CD$ and $DC$.

Our lecturer just flew past this question and didn't explain much about why we can multiply $CD$ but not $DC$.

What is the exact reason why I can't multiply $DC$? Thanks!

    For matrix multiplication to work, the dimensions need to be compatible. See, for instance, this answer. – Blue Jul 29 '22 at 05:38
    So long as the number of columns of the first matrix is the same as the number of rows of the second you can multiply them. If they are not you can not. – fleablood Jul 29 '22 at 05:48
  • The way the product works: for each row of the first matrix and each column of the second, you multiply the $k$th terms together and keep a tally. That's only possible if each row of the first matrix and each column of the second have the same number of items; in other words, if the number of columns of the first matrix is the same as the number of rows of the second. – fleablood Jul 29 '22 at 05:52
  • If $A$ is an $m\times n$ matrix, then for $AB$ to exist we need $B$ to be an $n\times p$ matrix. $BA$ exists as well if and only if $m=p$. – Peter Jul 29 '22 at 07:14

2 Answers


The number of columns of the first matrix needs to be equal to the number of rows of the second matrix. In this way:

  1. The inner product between each row of the first matrix and each column of the second matrix is well defined.
  2. The outer product between each column of the first matrix and the corresponding row of the second matrix is also well defined.

There's more to this than meets the eye.

Consider a product $QR$ with $Q = [2]$ and $R = \begin{bmatrix} 1 & -1 \\ 2 & -3 \end{bmatrix}$ so we are trying to evaluate:

$$[2] * \begin{bmatrix} 1 & -1 \\ 2 & -3 \end{bmatrix} $$

This product is just as nonsensical as your original attempt to multiply a column vector by a matrix, and yet it is well defined:

$$[2] * \begin{bmatrix} 1 & -1 \\ 2 & -3 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 2\end{bmatrix} \begin{bmatrix} 1 & -1 \\ 2 & -3 \end{bmatrix} = \begin{bmatrix} 2 & -2 \\ 4 & -6 \end{bmatrix} $$

That is, in order to multiply by a constant $c$ (which we consider a $1\times 1$ matrix), we replace it by $c\otimes I$, where $I$ is the identity matrix and $\otimes$ is a tensor (Kronecker) product rather than a matrix multiplication, and then we can multiply. For those who aren't aware of the tensor product, the following is pretty instructive. Given a $p \times q$ matrix $A$ and the $n \times n$ identity matrix $I_n$, the tensor product $T = A \otimes I_n$ is a $pn \times qn$ matrix in which each entry of $I_n$ is replaced by that entry times an entire copy of $A$. Spelled out below:

$$ \underbrace{A}_{p \ \text{rows},\ q \ \text{columns}} \otimes \underbrace{I_n}_{n \ \text{rows},\ n \ \text{columns}} = A \otimes \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1\end{bmatrix} = \underbrace{ \begin{bmatrix} A & 0 & \cdots & 0 \\ 0 & A & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A\end{bmatrix}}_{pn \ \text{rows},\ qn \ \text{columns}}$$

This scheme naturally suggests the following.

  1. Given an $n_A \times m_A$ matrix $A$ and an $n_B \times m_B$ matrix $B$, we can compute the least common multiple $\text{LCM}(m_A, n_B)$.

  2. We consider $A \otimes I_{\frac{\text{LCM}(m_A, n_B)}{m_A}}$, which is an $n_A \frac{\text{LCM}(m_A, n_B)}{m_A} \times \text{LCM}(m_A, n_B)$ matrix, and $B \otimes I_{\frac{\text{LCM}(m_A, n_B)}{n_B}}$, which is a $\text{LCM}(m_A, n_B) \times m_B \frac{\text{LCM}(m_A, n_B)}{n_B}$ matrix.

  3. Now the matrices are compatible: the number of columns of $A \otimes I_{\frac{\text{LCM}(m_A, n_B)}{m_A}}$ equals the number of rows of $B \otimes I_{\frac{\text{LCM}(m_A, n_B)}{n_B}}$ (both are $\text{LCM}(m_A, n_B)$), and therefore the matrix product

$$ \left( A \otimes I_{\frac{\text{LCM}(m_A, n_B)}{m_A}} \right) \left( B \otimes I_{\frac{\text{LCM}(m_A, n_B)}{n_B}}\right) $$

is well defined, in a way that perfectly generalizes multiplying matrices by constants.

With this definition, the 'matrix product' of an $n_A \times m_A$ matrix $A$ and an $n_B \times m_B$ matrix $B$ is the $n_A \frac{\text{LCM}(m_A, n_B)}{m_A} \times m_B \frac{\text{LCM}(m_A, n_B)}{n_B}$ matrix $AB$ defined as above, and so we can now make some kind of sense out of multiplying your vector by a matrix. When $m_A = n_B$ this reduces to the usual matrix multiplication, AND it supports the abuse of notation "constants are $1\times 1$ matrices", so this feels like the correct viewpoint on these matters.