
I've shown something similar for the 1-dimensional case: $\sum_n x_n$ is the sufficient statistic for the Gaussian mean, and $\sum_n x_n$, $\sum_n x_n^2$ are the sufficient statistics for the Gaussian variance.

However, I'm stuck on the multivariate example. How do we show that the log-likelihood depends on the data only through $\sum_n x_n$ and $\sum_n x_n x_n^T$?

The multivariate Gaussian likelihood is stated to be:

$$p(\mathbf{X}\mid\mu,\Sigma)=\prod_{n=1}^N\frac{1}{(2\pi)^{D/2}|\Sigma|^{1/2}}\exp\left\{-\tfrac{1}{2}(x_n-\mu)^T\Sigma^{-1}(x_n-\mu)\right\}$$

I suppose we only need to look at the exponent terms, and I expanded the exponent into:

$$-\frac{1}{2}\sum_{n=1}^N(x_n-\mu)^T\Sigma^{-1}(x_n-\mu)=-\frac{1}{2}\sum_{n=1}^N\left(x_n^T\Sigma^{-1}x_n-2\mu^T\Sigma^{-1}x_n+\mu^T\Sigma^{-1}\mu\right)$$

How do I bring $x_n^T$ and $x_n$ together when the inverse covariance matrix is in between them?

Qwertford

1 Answer


Yes, you look at the exponent. Expanding the quadratic form $(X_n-\mu)^T\Sigma^{-1}(X_n-\mu)$ and summing over $n$ gives

$$\displaystyle\sum_{n=1}^N(X_n^T\Sigma^{-1}X_n-2\mu^T\Sigma^{-1}X_n+\mu^T\Sigma^{-1}\mu)=N\mu^T\Sigma^{-1}\mu-2\mu^T\Sigma^{-1}\sum_{n=1}^NX_n+\sum_{n=1}^NX_n^T\Sigma^{-1}X_n.$$
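As a quick numerical sanity check (not part of the original argument), the identity can be verified with NumPy on arbitrary made-up data; all names and dimensions below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 5, 3
X = rng.normal(size=(N, d))          # rows are the observations X_n
mu = rng.normal(size=d)
A = rng.normal(size=(d, d))
Sigma = A @ A.T + d * np.eye(d)      # a random positive-definite covariance
P = np.linalg.inv(Sigma)             # the precision matrix Sigma^{-1}

# Left side: sum_n of the expanded quadratic form
lhs = sum(x @ P @ x - 2 * (mu @ P @ x) + mu @ P @ mu for x in X)

# Right side of the displayed identity, grouped around sum_n X_n
rhs = N * (mu @ P @ mu) - 2 * (mu @ P @ X.sum(axis=0)) + sum(x @ P @ x for x in X)

print(np.isclose(lhs, rhs))          # True
```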

You need to deal with the quantity $X_n^T\Sigma^{-1}X_n,$ which is a scalar. We can write $X_n^T\Sigma^{-1}X_n=\text{Trace}(X_n^T\Sigma^{-1}X_n)=\text{Trace}(\Sigma^{-1}X_nX_n^T),$ where we use the cyclic property of the trace, $\text{Trace}(AB)=\text{Trace}(BA).$
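A minimal numerical illustration of this cyclic-trace step, assuming NumPy and an arbitrary positive-definite $\Sigma$:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
x = rng.normal(size=d)
A = rng.normal(size=(d, d))
Sigma = A @ A.T + d * np.eye(d)            # positive-definite covariance
P = np.linalg.inv(Sigma)

quad = x @ P @ x                           # the scalar x^T Sigma^{-1} x
trace_form = np.trace(P @ np.outer(x, x))  # Trace(Sigma^{-1} x x^T)
print(np.isclose(quad, trace_form))        # True
```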

Moreover, the trace is linear: $\text{Trace}(A+B)=\text{Trace}(A)+\text{Trace}(B)$ (a standard property of the trace).

Therefore, $\displaystyle\sum_{n=1}^NX_n^T\Sigma^{-1}X_n=\sum_{n=1}^N\text{Trace}(\Sigma^{-1}X_nX_n^T)=\text{Trace}\left(\sum_{n=1}^N\Sigma^{-1}X_nX_n^T\right)=\text{Trace}\left(\Sigma^{-1}\sum_{n=1}^NX_nX_n^T\right),$ which depends on the data only through $\displaystyle\sum_{n=1}^NX_nX_n^T.$ Combined with the $\displaystyle\sum_{n=1}^NX_n$ term above, the exponent therefore depends on the data only through these two quantities.
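Putting the pieces together, here is a short sketch (again assuming NumPy and made-up data) showing that the summed exponent computed from the raw observations matches the value computed from the two statistics $\sum_n X_n$ and $\sum_n X_nX_n^T$ alone:

```python
import numpy as np

rng = np.random.default_rng(2)
N, d = 6, 3
X = rng.normal(size=(N, d))           # rows are the observations X_n
mu = rng.normal(size=d)
A = rng.normal(size=(d, d))
Sigma = A @ A.T + d * np.eye(d)
P = np.linalg.inv(Sigma)              # Sigma^{-1}

# Exponent computed directly from the raw data
direct = sum((x - mu) @ P @ (x - mu) for x in X)

# The two sufficient statistics
s1 = X.sum(axis=0)                    # sum_n X_n
s2 = X.T @ X                          # sum_n X_n X_n^T

# Same exponent computed from the statistics alone
via_stats = np.trace(P @ s2) - 2 * (mu @ P @ s1) + N * (mu @ P @ mu)

print(np.isclose(direct, via_stats))  # True
```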

Arnab Auddy