
The normal equations of the least squares regression problem $$X \beta = Y$$ yield the solution $\beta = (X^T X)^{-1} X^T Y$.

For simple (1-dimensional) regression, the least squares solution to $\beta x_i + \alpha = y_i$ is given by $\beta = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)}$.

Is there a way to derive the second formula from the first? When $\alpha = 0$ and the means of $X$ and $Y$ are both $0$, this is obvious, but I don't know whether there is a more general derivation.
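As a quick numerical sanity check (a minimal NumPy sketch with made-up data; the variable names are illustrative), the two formulas can be compared directly:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.5 * x + 1.0 + rng.normal(scale=0.3, size=50)

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])

# Full normal-equations solution: (X^T X)^{-1} X^T Y.
alpha, beta = np.linalg.solve(X.T @ X, X.T @ y)

# Simple-regression formula: beta = Cov(X, Y) / Var(X).
# bias=True gives the population covariance, matching np.var's default.
beta_cov = np.cov(x, y, bias=True)[0, 1] / np.var(x)

print(np.isclose(beta, beta_cov))  # True
```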

1 Answer


The equations $\beta x_i + \alpha = y_i$ for $i = 1,\ldots,n$ can be put into matrix form as follows:

$$\underbrace{\begin{bmatrix}1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}}_{X}\underbrace{\begin{bmatrix}\alpha \\ \beta \end{bmatrix}}_{\vec{\beta}} = \underbrace{\begin{bmatrix}y_1 \\ y_2 \\ \vdots \\ y_n\end{bmatrix}}_{Y}$$

So, the normal equations are:

$$\underbrace{\begin{bmatrix}1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix}}_{X^T}\underbrace{\begin{bmatrix}1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}}_{X}\underbrace{\begin{bmatrix}\alpha \\ \beta \end{bmatrix}}_{\vec{\beta}} = \underbrace{\begin{bmatrix}1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix}}_{X^T}\underbrace{\begin{bmatrix}y_1 \\ y_2 \\ \vdots \\ y_n\end{bmatrix}}_{Y}$$

which can be simplified as follows:

$$\underbrace{\begin{bmatrix}n & \sum_{i}x_i \\ \sum_{i}x_i & \sum_{i}x_i^2\end{bmatrix}}_{X^TX}\underbrace{\begin{bmatrix}\alpha \\ \beta \end{bmatrix}}_{\vec{\beta}} = \underbrace{\begin{bmatrix}\sum_{i}y_i \\ \sum_{i}x_iy_i \end{bmatrix}}_{X^TY}$$
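A small numerical check of this simplification (a sketch with arbitrary data) confirms that the sum-based entries agree with the matrix products:

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])
y = np.array([2.0, 3.0, 5.0, 11.0])
n = len(x)

X = np.column_stack([np.ones(n), x])

# Entrywise form of X^T X and X^T Y built from the sums above.
XtX = np.array([[n, x.sum()], [x.sum(), (x**2).sum()]])
XtY = np.array([y.sum(), (x * y).sum()])

print(np.allclose(XtX, X.T @ X), np.allclose(XtY, X.T @ y))  # True True
```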

Now, you have a simple $2 \times 2$ linear system. The solution is:

$$\begin{align}\begin{bmatrix}\alpha \\ \beta \end{bmatrix} &= \begin{bmatrix}n & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{bmatrix}^{-1}\begin{bmatrix}\sum_{i}y_i \\ \sum_{i}x_iy_i \end{bmatrix} \\ &= \dfrac{1}{n\sum_i x_i^2 - \left(\sum_i x_i\right)^2}\begin{bmatrix}\sum_i x_i^2 & -\sum_i x_i \\ -\sum_i x_i & n \end{bmatrix}\begin{bmatrix}\sum_{i}y_i \\ \sum_{i}x_iy_i \end{bmatrix} \\ &= \dfrac{1}{n\sum_i x_i^2 - \left(\sum_i x_i\right)^2}\begin{bmatrix}\left(\sum_i x_i^2\right)\left(\sum_i y_i\right) - \left(\sum_i x_i\right)\left(\sum_i x_iy_i\right) \\ -\left(\sum_i x_i\right)\left(\sum_i y_i\right)+n\left(\sum_i x_iy_i\right)\end{bmatrix} \\ &= \dfrac{1}{\frac{1}{n}\sum_i x_i^2 - \left(\frac{1}{n}\sum_i x_i\right)^2}\begin{bmatrix}\left(\frac{1}{n}\sum_i x_i^2\right)\left(\frac{1}{n}\sum_i y_i\right) - \left(\frac{1}{n}\sum_i x_i\right)\left(\frac{1}{n}\sum_i x_iy_i\right) \\ -\left(\frac{1}{n}\sum_i x_i\right)\left(\frac{1}{n}\sum_i y_i\right)+\left(\frac{1}{n}\sum_i x_iy_i\right)\end{bmatrix}\end{align}$$

Simplify this as needed. In particular, the second component gives $$\beta = \dfrac{\frac{1}{n}\sum_i x_iy_i - \left(\frac{1}{n}\sum_i x_i\right)\left(\frac{1}{n}\sum_i y_i\right)}{\frac{1}{n}\sum_i x_i^2 - \left(\frac{1}{n}\sum_i x_i\right)^2} = \dfrac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(X)},$$ while the first component reduces to $\alpha = \bar{y} - \beta\bar{x}$.
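To verify the closed-form result (a sketch under the same assumptions as the earlier snippets; `np.polyfit` serves only as an independent reference fit):

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])
y = np.array([2.0, 3.0, 5.0, 11.0])

# Closed-form slope and intercept from the derivation above.
beta = np.cov(x, y, bias=True)[0, 1] / np.var(x)
alpha = y.mean() - beta * x.mean()

# Compare against NumPy's degree-1 least-squares fit (returns slope first).
beta_ref, alpha_ref = np.polyfit(x, y, 1)
print(np.isclose(alpha, alpha_ref), np.isclose(beta, beta_ref))  # True True
```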

JimmyK4542