
Suppose $x$ is a random vector in $\mathbb{R}^n$ which is distributed according to $D$.

Assume $x_1, x_2, \ldots, x_N$ are samples drawn from $D$.

What is $\sum_{i=1}^N x_ix_i^T$?

How can I relate this to the covariance $C$ of the data?

Is $\sum_{i=1}^N x_ix_i^T = \alpha C +\beta$ for some $\alpha$ and $\beta$?

Given $(x_1,x_2,\cdots,x_N)$, how can one find $\mathbb{E}[(x-\mu)(x-\mu)^T]$?

My attempt at answering the above is as follows:

When we have access to the sequence of data we can build

$$ X'_N= \begin{bmatrix} x_1 & x_2 & \cdots & x_N \end{bmatrix} $$

The sample average is $\mu_N = \frac{1}{N}\sum_{i=1}^N x_i$, so the centered data matrix is

$$ X_N= \begin{bmatrix} x_1 -\mu_N & x_2- \mu_N & \cdots & x_N-\mu_N \end{bmatrix} $$

Therefore, $C_N$ is the sample covariance matrix

$$ C_N=X_NX_N^T= \begin{bmatrix} x_1 -\mu_N & x_2- \mu_N & \cdots & x_N-\mu_N \end{bmatrix} \begin{bmatrix} (x_1 -\mu_N)^T \\ (x_2- \mu_N)^T \\ \vdots \\ (x_N-\mu_N)^T \end{bmatrix} $$

$$ C_N= (x_1 -\mu_N)(x_1 -\mu_N)^T + (x_2- \mu_N)(x_2- \mu_N)^T + \cdots + (x_N-\mu_N)(x_N-\mu_N)^T $$

$$ C_N= \sum_{i=1}^N x_ix_i^T - (\sum_{i=1}^N x_i)\mu_N^T -\mu_N(\sum_{i=1}^N x_i)^T +N \mu_N\mu_N^T $$

Please answer my four questions separately.

  • If the $x_i$ are iid, then the easiest thing to do is to replace $\mu_N$ by $\mu$ and evaluate the variance of $C_N-C$ using the fact that the $(x_i-\mu)(x_i-\mu)^\top$ are iid. With $\mu_N$, the $(x_i-\mu_N)(x_i-\mu_N)^\top$ are no longer iid, and not even unbiased, which is what leads to the unbiased estimator. – reuns Dec 26 '18 at 00:10

1 Answer


The sample covariance $C_N$ is $$C_N=\frac1N\sum (x_i-\mu_N) (x_i-\mu_N)^T.$$

From there, you can do as you did to prove $$C_N=\frac1N\sum x_i x_i^T -\frac1N \left(\sum x_i\right)\mu_N^T -\frac1N \mu_N \left(\sum x_i\right)^T+\mu_N \mu_N^T$$ and, since $\sum x_i = N\mu_N$, this simplifies to $$C_N=\frac1N\sum x_i x_i^T-\mu_N \mu_N^T.$$

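So you can read off $\alpha$ and $\beta$ from here: $\sum x_i x_i^T = N C_N + N\mu_N \mu_N^T$, i.e. $\alpha = N$ and $\beta = N\mu_N\mu_N^T$ (both depend on the sample: $\alpha$ through its size and $\beta$ through its mean).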
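As a quick numerical sanity check, the identity $\sum x_i x_i^T = N C_N + N\mu_N\mu_N^T$ can be verified on random data. The sketch below is only an illustration: the helper names (`outer`, `scatter`, `sample_cov`, and so on), the Gaussian test data, and the sizes `n = 3`, `N = 50` are assumptions chosen for the example, not part of the question.

```python
import random

def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    return [[c * a for a in row] for row in A]

def sample_mean(xs):
    N = len(xs)
    return [sum(x[k] for x in xs) / N for k in range(len(xs[0]))]

def scatter(xs):
    """Uncentered sum of outer products, sum_i x_i x_i^T."""
    n = len(xs[0])
    S = [[0.0] * n for _ in range(n)]
    for x in xs:
        S = mat_add(S, outer(x, x))
    return S

def sample_cov(xs):
    """C_N = (1/N) sum_i (x_i - mu_N)(x_i - mu_N)^T (the 1/N convention)."""
    N = len(xs)
    mu = sample_mean(xs)
    centered = [[xk - mk for xk, mk in zip(x, mu)] for x in xs]
    return mat_scale(1.0 / N, scatter(centered))

# Hypothetical test data: N samples in R^n with a nonzero mean.
rng = random.Random(0)
n, N = 3, 50
xs = [[rng.gauss(1.0, 2.0) for _ in range(n)] for _ in range(N)]

mu_N = sample_mean(xs)
lhs = scatter(xs)                                   # sum_i x_i x_i^T
rhs = mat_add(mat_scale(N, sample_cov(xs)),         # N * C_N
              mat_scale(N, outer(mu_N, mu_N)))      # + N * mu_N mu_N^T

# Largest entrywise difference; should be at floating-point rounding level.
max_gap = max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))
print(max_gap)
```

Note that the $\beta$ term vanishes only when the sample mean is zero, which is why the test data is deliberately not centered.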

Finally, you can't "find" $E[(x-\mu)(x-\mu)^T]$ (the actual covariance matrix of $x$) from the sample, since it is a parameter of the distribution of $x$. It can be shown, however (through the law of large numbers), that under fairly general conditions $C_N$ is a consistent estimator of it.
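To make the consistency claim concrete, here is a minimal sketch that draws samples from a distribution whose covariance is known and reports how far $C_N$ is from it as $N$ grows. Everything specific here is an assumption made for illustration: two independent Gaussian coordinates with standard deviations 1 and 2 (so the true covariance is $\operatorname{diag}(1, 4)$), the sample sizes, and the helper names.

```python
import random

def sample_cov(xs):
    """C_N = (1/N) sum_i (x_i - mu_N)(x_i - mu_N)^T, computed entrywise."""
    N, n = len(xs), len(xs[0])
    mu = [sum(x[k] for x in xs) / N for k in range(n)]
    return [[sum((x[j] - mu[j]) * (x[k] - mu[k]) for x in xs) / N
             for k in range(n)] for j in range(n)]

# Assumed ground truth: independent coordinates with variances 1 and 4.
true_cov = [[1.0, 0.0], [0.0, 4.0]]
rng = random.Random(1)

def max_error(N):
    """Largest entrywise gap between C_N and the true covariance."""
    xs = [[rng.gauss(0.0, 1.0), rng.gauss(5.0, 2.0)] for _ in range(N)]
    C = sample_cov(xs)
    return max(abs(C[j][k] - true_cov[j][k])
               for j in range(2) for k in range(2))

errors = {N: max_error(N) for N in (10, 1000, 20000)}
for N, err in errors.items():
    print(N, err)
```

The gap shrinks only in a probabilistic sense, so an individual run need not decrease monotonically. The $\frac{1}{N}$ and $\frac{1}{N-1}$ normalizations differ by the factor $\frac{N}{N-1} \to 1$, so both give consistent estimators.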

  • In this post the factor is $\frac{1}{N-1}$, to make the estimator unbiased. Do I need that? Can you explain it? –  Dec 26 '18 at 05:06