Prove that the system $$A^T A x = A^T b$$ always has a solution. The matrices and vectors are all real. The matrix $A$ is $m \times n$.
I think it makes sense intuitively but I can't prove it formally.
The matrix $A^TA$ need not be invertible. So what you need to prove is that $A^T b$ lies in the image $V= {\rm im\;} A^T A$.
Now, $A^T b\in V$ is equivalent to $V^\perp \subseteq (A^T b)^\perp$, so it suffices to show that every vector orthogonal to ${\rm im}\, A^T A$ is also orthogonal to $A^T b$.
So suppose that $z^T A^T A=0$ (i.e. the vector $z$ is orthogonal to the image). Then also $\|z^T A^T\|^2 = z^T A^T A z=0$, so $z^T A^T=0$. But then $z^T A^Tb=0$, as we had to show.
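Here is a quick numerical sanity check of this argument (a NumPy sketch, not part of the proof): build a rank-deficient $A$ so that $A^TA$ is singular, exhibit a $z$ with $z^TA^TA = 0$, and verify that $Az = 0$ and $z^TA^Tb = 0$, and that the normal equations are nonetheless consistent.

```python
import numpy as np

rng = np.random.default_rng(0)
# Rank-deficient A: the last column repeats the first, so A^T A is singular.
A = rng.standard_normal((5, 3))
A[:, 2] = A[:, 0]
b = rng.standard_normal(5)

# A vector z in the null space of A^T A: z = (1, 0, -1) works here,
# since columns 0 and 2 of A coincide.
z = np.array([1.0, 0.0, -1.0])
assert np.allclose(A.T @ A @ z, 0)

# As the argument shows, A z = 0 as well, hence z^T A^T b = 0.
assert np.allclose(A @ z, 0)
assert np.isclose(z @ (A.T @ b), 0.0)

# Consequently the normal equations are consistent: lstsq finds an x with
# A^T A x = A^T b even though A^T A is not invertible.
x, *_ = np.linalg.lstsq(A.T @ A, A.T @ b, rcond=None)
assert np.allclose(A.T @ A @ x, A.T @ b)
```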
Let the SVD of $\mathrm A \in \mathbb R^{m \times n}$ be
$$\mathrm A = \mathrm U \Sigma \mathrm V^{\top} = \begin{bmatrix} \mathrm U_1 & \mathrm U_2\end{bmatrix} \begin{bmatrix} \hat\Sigma & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm V_1^{\top}\\ \mathrm V_2^{\top}\end{bmatrix}$$
where the zero matrices may be empty. The eigendecomposition of $\mathrm A^{\top} \mathrm A$ is, thus,
$$\mathrm A^{\top} \mathrm A = \mathrm V \Sigma^{\top} \mathrm U^{\top} \mathrm U \Sigma \mathrm V^{\top} = \mathrm V \Sigma^{\top} \Sigma \mathrm V^{\top} = \begin{bmatrix} \mathrm V_1 & \mathrm V_2\end{bmatrix} \begin{bmatrix} \hat\Sigma^2 & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm V_1^{\top}\\ \mathrm V_2^{\top}\end{bmatrix}$$
Hence, the normal equations $\mathrm A^{\top} \mathrm A \, \mathrm x = \mathrm A^{\top} \mathrm b$ can be written as follows
$$\mathrm V \begin{bmatrix} \hat\Sigma^2 & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \mathrm V^{\top} \mathrm x = \mathrm V \begin{bmatrix} \hat\Sigma & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \mathrm U^{\top} \mathrm b$$
Let $\mathrm y := \mathrm V^{\top} \mathrm x$. Left-multiplying by $\mathrm V^{\top}$,
$$\begin{bmatrix} \hat\Sigma^2 & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm y_1\\ \mathrm y_2\end{bmatrix} = \begin{bmatrix} \hat\Sigma & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm U_1^{\top} \mathrm b\\ \mathrm U_2^{\top} \mathrm b\end{bmatrix}$$
and, thus,
$$\begin{bmatrix} \mathrm I_r & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm y_1\\ \mathrm y_2\end{bmatrix} = \begin{bmatrix} \hat\Sigma^{-1} & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm U_1^{\top} \mathrm b\\ \mathrm U_2^{\top} \mathrm b\end{bmatrix} = \begin{bmatrix} \hat\Sigma^{-1}\mathrm U_1^{\top} \mathrm b\\ \mathrm O\end{bmatrix}$$
which always has a solution, since the block equation $\mathrm O \, \mathrm y_2 = \mathrm 0$ is satisfied by every $\mathrm y_2$. The affine solution space is parametrized by
$$\mathrm x = \mathrm V_1 \hat\Sigma^{-1} \mathrm U_1^{\top} \mathrm b + \mathrm V_2 \eta$$
where $\{ \mathrm V_2 \eta \mid \eta \in \mathbb R^{n-r} \}$ is the null space of $\mathrm A$. If $\mathrm A$ has full column rank, then $r = n$, the null space of $\mathrm A$ is $\{0_n\}$, and $\mathrm V_1 \hat\Sigma^{-1} \mathrm U_1^{\top} \mathrm b$ is the only solution to the normal equations.
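The parametrization above can be checked numerically (a NumPy sketch under the same rank-deficient setup, not part of the derivation): compute $\mathrm V_1 \hat\Sigma^{-1} \mathrm U_1^{\top} \mathrm b$ from a thin SVD and verify that it, and any null-space shift $\mathrm V_2 \eta$ of it, solves the normal equations.

```python
import numpy as np

rng = np.random.default_rng(1)
# A rank-deficient A (r = 2 < n = 3), so the normal equations are singular.
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 3))
b = rng.standard_normal(5)

# Thin SVD restricted to the r nonzero singular values: U1, Sigma_hat, V1.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > 1e-10 * s[0]))
U1, s1, V1 = U[:, :r], s[:r], Vt[:r, :].T

# Particular solution x = V1 Sigma_hat^{-1} U1^T b.
x = V1 @ ((U1.T @ b) / s1)
assert np.allclose(A.T @ A @ x, A.T @ b)

# Any eta in R^{n-r} shifts x by a null-space vector V2 eta; still a solution.
V2 = np.linalg.svd(A)[2][r:, :].T
eta = rng.standard_normal(V2.shape[1])
assert np.allclose(A.T @ A @ (x + V2 @ eta), A.T @ b)
```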
Addendum
Using the SVD of $\mathrm A$, the original system $\mathrm A \mathrm x = \mathrm b$ can be written as
$$\begin{bmatrix} \hat\Sigma & \mathrm O\\ \mathrm O & \mathrm O\end{bmatrix} \begin{bmatrix} \mathrm y_1\\ \mathrm y_2\end{bmatrix} = \begin{bmatrix} \mathrm U_1^{\top} \mathrm b\\ \mathrm U_2^{\top} \mathrm b\end{bmatrix}$$
which only has a solution if $\mathrm U_2^{\top} \mathrm b = \mathrm 0$, i.e., if $\mathrm b$ is orthogonal to the left null space of $\mathrm A$, i.e., if $\mathrm b$ is in the column space of $\mathrm A$.
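This solvability condition is also easy to check numerically (a NumPy sketch, with names of my own choosing): for a generic $\mathrm b$ and a rank-deficient $\mathrm A$, $\mathrm U_2^{\top}\mathrm b \neq \mathrm 0$ and $\mathrm A \mathrm x = \mathrm b$ has no exact solution, but projecting $\mathrm b$ onto the column space of $\mathrm A$ makes it solvable.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 3))  # rank 2 < m = 5
b = rng.standard_normal(5)

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10 * s[0]))
U2 = U[:, r:]                          # basis of the left null space of A

# For a generic b, U2^T b != 0, so A x = b has no exact solution ...
assert not np.allclose(U2.T @ b, 0)

# ... but removing the U2 component projects b into col(A), making it solvable.
b_proj = b - U2 @ (U2.T @ b)
x, *_ = np.linalg.lstsq(A, b_proj, rcond=None)
assert np.allclose(A @ x, b_proj)
```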
Prop 1: $ker(A^TA)=ker(A)$.
proof: suppose $x\in ker(A^TA)$. Then $A^TAx=0\implies x^TA^TAx=0\implies (Ax)^TAx=0\implies Ax=0\implies x\in ker(A)$.
Conversely, if $x\in ker(A)$, then $Ax=0\implies A^TAx=0\implies x\in ker(A^TA) \square$.
Prop 2: $A^TA=(A^TA)^T$
Prop 3: $col(A^T)+ker(A)=\mathbb{R}^n$, since they are orthogonal complements of each other in $\mathbb{R}^n$.
Lastly, to show that $A^TAx=A^Tb$ has at least one solution for every $b$, it suffices to show that $col(A^T)=col(A^TA)$.
$$\text{Prop 3: }col(A^T)+ker(A)=\mathbb{R}^n$$ $$\text{Prop 1: } col(A^T)+ker(A^TA)=\mathbb{R}^n$$ $$\text{Prop 3 applied to $A^TA$: }col((A^TA)^T)+ker(A^TA)=\mathbb{R}^n$$ $$\text{Prop 2: }col(A^TA)+ker(A^TA)=\mathbb{R}^n$$
Both $col(A^T)$ and $col(A^TA)$ are the orthogonal complement of $ker(A^TA)$, hence $col(A^TA)=col(A^T)\square$.
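These propositions can be confirmed numerically (a NumPy sketch, not part of the proof): for a rank-deficient $A$, the ranks of $A$ and $A^TA$ agree (Prop 1), and every column of $A^T$ lies in $col(A^TA)$, so appending $A^T$ to $A^TA$ does not increase the rank.

```python
import numpy as np

rng = np.random.default_rng(2)
# Rank-deficient A: column 3 is a combination of columns 0 and 1.
A = rng.standard_normal((6, 4))
A[:, 3] = A[:, 0] + A[:, 1]

# Prop 1 numerically: ker(A^T A) = ker(A), so in particular the ranks agree.
assert np.linalg.matrix_rank(A.T @ A) == np.linalg.matrix_rank(A)

# col(A^T A) = col(A^T): appending the columns of A^T to A^T A does not
# enlarge the column space, so the rank is unchanged.
M = A.T @ A
assert np.linalg.matrix_rank(np.hstack([M, A.T])) == np.linalg.matrix_rank(M)
```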
Sorry for bumping, but I had a solution which I'd like to share.
Using the fundamental theorem of linear algebra, we can decompose the vector $b$ as $b = b_1 + b_2$, where $b_1$ is in the column space of $A$ and $b_2$ is in the null space of $A^T$.
Now let $x^*$ be a solution to $Ax^* = b_1$ (one always exists, since $b_1$ is in the column space of $A$). We claim that $x^*$ is also a solution to $A^TAx = A^Tb$:
$$A^T(Ax^* - b) = A^T(b_1 - (b_1 + b_2)) = A^T(-b_2) = - A^T b_2 = 0$$
The last equality follows from $b_2$ lying in the null space of $A^T$.
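A quick numerical illustration of this construction (a NumPy sketch with my own variable names): project $b$ onto $col(A)$ to obtain $b_1$, check the residual $b_2$ lies in the null space of $A^T$, and verify that a solution of $Ax^* = b_1$ solves the normal equations.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))
A[:, 2] = A[:, 1]                      # rank-deficient; b need not be in col(A)
b = rng.standard_normal(5)

# Orthogonal projection of b onto col(A): b1 = A A^+ b; residual b2 = b - b1.
b1 = A @ (np.linalg.pinv(A) @ b)
b2 = b - b1
assert np.allclose(A.T @ b2, 0)        # b2 lies in the null space of A^T

# Any x* with A x* = b1 solves the normal equations A^T A x = A^T b.
x_star, *_ = np.linalg.lstsq(A, b1, rcond=None)
assert np.allclose(A @ x_star, b1)
assert np.allclose(A.T @ A @ x_star, A.T @ b)
```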