I was recently reading a proof of the Kullback–Leibler divergence between two multivariate normal distributions, and there are some steps that raised concerns I would like to clear up.
The author of the proof writes $P \sim \mathcal N(\mu_p, \Sigma_p)$ and $Q \sim \mathcal N(\mu_q, \Sigma_q)$, both $k$-dimensional. Based on this I have two questions (which are more related to linear algebra than to probability):
$(1)$ The author states that $(x - \mu_p)^T\Sigma_p^{-1}(x - \mu_p) \in \mathbb R$, but is this actually true? To me it looks like this:
$(x - \mu_p)^T$ has dimension $n \times k$ (number of observations by dimension),
$\Sigma_p^{-1}$ has dimension $k \times k$ (dimension by dimension),
$(x - \mu_p)$ has dimension $k \times n$ (dimension by number of observations).
Multiplying these, we end up with an $n \times n$ matrix, which is not $1 \times 1$ (see the quick check below). What am I missing?
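To make my bookkeeping concrete, here is a small NumPy sketch of the product as I read it (the toy sizes, random matrices, and variable names are my own, not the author's):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 3                          # n observations, k dimensions (toy values)

B = rng.normal(size=(k, n))          # plays the role of (x - mu_p): k x n, as I describe above
A = rng.normal(size=(k, k))
Sigma_p = A @ A.T + k * np.eye(k)    # a stand-in symmetric positive definite covariance

M = B.T @ np.linalg.inv(Sigma_p) @ B # (n x k) @ (k x k) @ (k x n)
print(M.shape)                       # (5, 5): an n x n matrix, not a scalar
```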
$(2)$ The author says that because $\operatorname{tr}(ABC) = \operatorname{tr}(CAB)$ we have that:
$$\operatorname{tr}\left((x - \mu_p)^T\Sigma_p^{-1}(x - \mu_p)\right) = \operatorname{tr}\left((x - \mu_p)(x - \mu_p)^T\Sigma_p^{-1}\right)$$
However, in my opinion it is not that easy. In general $\operatorname{tr}(ABC) \neq \operatorname{tr}(CAB)$, but it does hold when each of the factors is symmetric. Indeed, if $A, B, C$ are symmetric then any permutation inside the trace is valid. However, in our example $(x - \mu_p)^T$ is of course not symmetric, because it is not even a square matrix. What am I missing in this case? (A numerical check follows below.)
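For what it's worth, here is a quick numerical check of the identity with the shapes I have in mind (again with toy sizes and random matrices of my own choosing); the two traces do seem to agree even though none of the factors is symmetric, which is exactly what confuses me:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 5, 3

B = rng.normal(size=(k, n))                        # plays the role of (x - mu_p); not symmetric
A = rng.normal(size=(k, k))
Sigma_inv = np.linalg.inv(A @ A.T + k * np.eye(k)) # inverse of a stand-in SPD covariance

lhs = np.trace(B.T @ Sigma_inv @ B)   # tr((x - mu_p)^T Sigma_p^{-1} (x - mu_p)), trace of an n x n matrix
rhs = np.trace(B @ B.T @ Sigma_inv)   # tr((x - mu_p)(x - mu_p)^T Sigma_p^{-1}), trace of a k x k matrix
print(lhs, rhs, np.isclose(lhs, rhs)) # the two values agree in my runs
```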
Could you please explain these two points to me?