
Consider $Y=X\beta+\varepsilon$, where $X$ is $n\times p$, $\beta$ is $p\times 1$, and $\varepsilon$ is $n\times 1$ with $\operatorname{Var}(\varepsilon)=\sigma^2 I$.

Give expressions for the regression and error sums of squares, find their expected values, and show that they are independent.

My work: One has $SSE=Y^{T}Y-\hat{\beta}^{T}X^{T}Y=Y^{T}(I-X(X^{T}X)^{-1}X^{T})Y$ and $SSR=Y^{T}(X(X^{T}X)^{-1}X^{T}-\frac{1}{n}J)Y$. The distribution of $SSE$ is easy to obtain, but I am having trouble with the distribution of $SSR$: I was trying to prove that $X(X^{T}X)^{-1}X^{T}-\frac{1}{n}J$ is idempotent, but it does not seem easy.

I am also having difficulty proving that $(I-X(X^{T}X)^{-1}X^{T})(X(X^{T}X)^{-1}X^{T}-\frac{1}{n}J)=0$. Can someone help me here?

81235
Your question has a flaw: you did not say what distribution $\varepsilon$ follows, yet you are trying to find its law. – Saty Sep 09 '15 at 03:56

3 Answers


Anyway, here is a series of results that will help you understand what is going on (I am assuming the Euclidean inner product, i.e. $\langle u,v\rangle=u'v$).

Definition: Let $V\subset \mathbb{R}^n$ be a subspace. $P_{n\times n}$ is a projection onto $V$ if

1) $\forall x\in V$ we have $Px=x$, and

2) $\forall x\in V^{\perp}$ we have $Px=0$.

Theorem 1: $P_{n\times n}$ is the projection onto $V\subset \mathbb{R}^n$ $\iff$ $P$ is idempotent, symmetric, and $\mathscr{C}(P)=V$.

Theorem 2: Let $o_1,\dots,o_r$ be any orthonormal basis of $V$ (with $\dim V=r\leq n$). Then the projection matrix onto $V$ is $P=OO'$, where $O=[o_1,\dots,o_r]_{n\times r}$.

Theorem 3: Suppose you are given an $n\times p$ matrix $X$ with rank $p$. Then $X(X'X)^{-1}X'$ (call it $P_X$) is the projection onto $\mathscr{C}(P_X)=\mathscr{C}(X)$. Also, by the prior theorem, if you find an orthonormal basis of $\mathscr{C}(X)$ with $p$ elements (since the columns of $X$ are independent, you can always do Gram–Schmidt), then $P_X$ will also equal $OO'$ (since the projection onto a given space is unique).
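A quick numerical sanity check of Theorems 2 and 3 (a minimal numpy sketch added for illustration; the dimensions and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 3
# design matrix with an intercept column and two random covariates
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])

# hat matrix P_X = X (X'X)^{-1} X'
P_X = X @ np.linalg.solve(X.T @ X, X.T)

# orthonormal basis of C(X) via QR (a numerical stand-in for Gram-Schmidt)
O, _ = np.linalg.qr(X)        # O is n x p with orthonormal columns
P_from_O = O @ O.T

print(np.allclose(P_X, P_X.T))       # symmetric
print(np.allclose(P_X @ P_X, P_X))   # idempotent
print(np.allclose(P_X, P_from_O))    # equals OO', as Theorem 3 claims
```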

Theorem 4: Let $Y\sim N_n(\mu,I)$ and let $P$ be any projection matrix. Then $Y'PY \sim \chi^2_{\operatorname{rank}(P)}\!\left(\frac{1}{2}\mu'P\mu\right)$, i.e. with $\text{d.f.}=\operatorname{rank}(P)$ and non-centrality parameter $\frac{1}{2}\mu'P\mu$.

Proof: It's not hard at all. Since $P$ is symmetric it can be written in spectral-decomposition form, i.e. $P=\Gamma D \Gamma'$, which gives $Y'PY=Y'\Gamma D \Gamma'Y=Z'DZ=\sum_i d_i Z_i^2$ with $Z=\Gamma'Y$.
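To spell out the step the proof leaves implicit (a standard argument, not part of the original answer): $Z=\Gamma'Y\sim N_n(\Gamma'\mu, I)$, and because $P$ is idempotent each eigenvalue $d_i$ is $0$ or $1$, with exactly $r=\operatorname{rank}(P)$ of them equal to $1$. Hence
$$Y'PY=\sum_{i:\,d_i=1} Z_i^2 \sim \chi^2_{r}\!\left(\tfrac{1}{2}\mu'P\mu\right),$$
a sum of $r$ independent squared unit-variance normals, whose means satisfy $\sum_{i:\,d_i=1}(\Gamma'\mu)_i^2=\mu'P\mu$.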

Fact: Under joint normality, zero correlation $\iff$ independence.

Assume $\epsilon \sim N_n(0,\sigma^2I)$. Then $Y \sim N_n(X\beta,\sigma^2I)$. In this set-up $\mu=X\beta\in\mathscr{C}(X)$. Hence $SSE/\sigma^2=Y'(I-P_X)Y/\sigma^2 \sim \chi^2_{n-p}(0)$.

[Important: $I-P_X$ is the orthogonal projection onto $\mathscr{C}(X)^{\perp}$, i.e. if you take a vector $v$ from $\mathscr{C}(X)$ (which is of the form $Xu$) then $(I-P_X)v=0$. So the non-centrality parameter $\frac{1}{2}\mu'(I-P_X)\mu=\frac{1}{2}(X\beta)'(I-P_X)X\beta=0$.]

For SSR the matrix involved, $P_X-\frac{J}{n}$, is not necessarily a projection matrix! But if you have an intercept term in your regression then it is, i.e. the model looks like $y_i=\beta_1+\sum_{j\geq 2} x_{ij}\beta_j$, or in other words $X$ has $\mathbf{1}$ as one of its columns, i.e. $\mathbf{1}\in \mathscr{C}(X)$. If so, $$P_X\mathbf{1}=\mathbf{1}\implies \left(P_X-\frac{J}{n}\right)\left(P_X-\frac{J}{n}\right)=\cdots=P_X-\frac{J}{n},$$ i.e. it is idempotent! Thus we can apply Theorem 4.
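Filling in the omitted algebra (not written out in the answer, but it uses only $P_XJ=JP_X=J$ when $\mathbf{1}\in\mathscr{C}(X)$, and $J^2=nJ$):
$$\left(P_X-\frac{J}{n}\right)^2=P_X^2-\frac{P_XJ}{n}-\frac{JP_X}{n}+\frac{J^2}{n^2}=P_X-\frac{J}{n}-\frac{J}{n}+\frac{J}{n}=P_X-\frac{J}{n}.$$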

For independence of SSE and SSR, use the fact that $(I-P_X)\left(P_X-\frac{J}{n}\right)=0$.
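Explicitly (again just expanding what the answer states):
$$(I-P_X)\left(P_X-\frac{J}{n}\right)=P_X-\frac{J}{n}-P_X^2+\frac{P_XJ}{n}=P_X-\frac{J}{n}-P_X+\frac{J}{n}=0,$$
so $\operatorname{Cov}\!\left((I-P_X)Y,\ \left(P_X-\frac{J}{n}\right)Y\right)=\sigma^2(I-P_X)\left(P_X-\frac{J}{n}\right)'=0$. By the Fact above, these two jointly normal vectors are independent, and since $SSE=\|(I-P_X)Y\|^2$ and $SSR=\|(P_X-\frac{J}{n})Y\|^2$ are functions of them, $SSE$ and $SSR$ are independent.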

Saty

I have two proofs. Both use the fact that $SSE$ is independent of $\hat{\beta}=\begin{bmatrix}\hat{\beta}_0\\\hat{\beta}_1\end{bmatrix}$. (The proof of this is rather technical, and it uses the fact that for a multivariate normal distribution, zero covariance implies independence.)

  • First proof: $SSR=\hat{\beta}_1S_{xY}$, where $S_{xY}=\sum_i(x_i-\bar{x})(Y_i-\bar{Y})$.
  • Second proof: $SSR=\sum_i(\hat{Y}_i-\bar{Y})^2$, where $\hat{Y}_i=\hat{\beta}_0+\hat{\beta}_1x_i$ and $\bar{Y}=\hat{\beta}_0+\hat{\beta}_1\bar{x}$; both forms reduce to a function of $\hat{\beta}_1$ alone, as shown below.
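To make the independence explicit (not in the original answer, but it follows directly from the formulas above): since $\hat{\beta}_1=S_{xY}/S_{xx}$ with $S_{xx}=\sum_i(x_i-\bar{x})^2$, and $\hat{Y}_i-\bar{Y}=\hat{\beta}_1(x_i-\bar{x})$,
$$SSR=\sum_i(\hat{Y}_i-\bar{Y})^2=\hat{\beta}_1^2 S_{xx}=\hat{\beta}_1 S_{xY},$$
so $SSR$ is a function of $\hat{\beta}_1$ alone, and the independence of $SSE$ from $(\hat{\beta}_0,\hat{\beta}_1)$ yields the independence of $SSE$ and $SSR$.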
ashpool

To prove $(I-X(X^{T}X)^{-1}X^{T})(X(X^{T}X)^{-1}X^{T}-\frac{1}{n}J)=0$, we only need to prove:

$$\mathbf{1}_n\mathbf{1}_n^{T} = X(X^{T}X)^{-1}X^{T}\,\mathbf{1}_n\mathbf{1}_n^{T}.$$

Here $\mathbf{1}_n$ is the $n\times 1$ vector whose elements are all $1$.

Notice that $X = \begin{bmatrix} \mathbf{1}_n& X_1 \end{bmatrix}$, that is, the elements of the first column are all $1$.

Consider a generalized inverse $X^-$ of $X$, which satisfies:

$$XX^-X=X.$$

Because the first column of $X$ is $\mathbf{1}_n$, we can write $\mathbf{1}_n=Xe_1$, where $e_1$ is the first standard basis vector, and therefore

$$XX^-\mathbf{1}_n=XX^-Xe_1=Xe_1=\mathbf{1}_n,$$

which is very important!

Now replace $\mathbf{1}_n$ in $X(X^{T}X)^{-1}X^{T}\mathbf{1}_n\mathbf{1}_n^{T}$ with $XX^-\mathbf{1}_n$; then we have:

$$\begin{aligned} X(X^{T}X)^{-1}X^{T}\mathbf{1}_n\mathbf{1}_n^{T}&=X(X^{T}X)^{-1}X^{T}XX^-\mathbf{1}_n\mathbf{1}_n^{T}\\ &= XX^{-}\mathbf{1}_n\mathbf{1}_n^{T}\\ &=\mathbf{1}_n\mathbf{1}_n^{T}. \end{aligned}$$

This completes the proof.

Actually, we can also compute $(X^{T}X)^{-1}$ explicitly using the block form $X = \begin{bmatrix} \mathbf{1}_n& X_1 \end{bmatrix}$ and arrive at the same conclusion.
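Purely as an illustrative check (not part of the proof), here is a small numpy verification of the identity $X(X^{T}X)^{-1}X^{T}\mathbf{1}_n\mathbf{1}_n^{T}=\mathbf{1}_n\mathbf{1}_n^{T}$ and of the resulting orthogonality, with an arbitrary design matrix whose first column is $\mathbf{1}_n$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
X = np.column_stack([np.ones(n), rng.standard_normal((n, 2))])  # first column is 1_n

P_X = X @ np.linalg.solve(X.T @ X, X.T)   # X (X'X)^{-1} X'
J = np.ones((n, n))                       # J = 1_n 1_n'

print(np.allclose(P_X @ J, J))                             # P_X 1_n 1_n' = 1_n 1_n'
print(np.allclose((np.eye(n) - P_X) @ (P_X - J / n), 0))   # hence the product is 0
```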

Gang men