How can we gain a conceptual understanding of why $A^TAx=A^Tb$ gives us the least squares approx. for $x$ in Linear Algebra?

Question

Per the title: I am a visual learner and would appreciate a conceptual explanation of why that equation works and why it takes the place of $Ax = b$ when we want the least squares approximation.

If the answer could be related to linear transformations, even better.

One reason I ask is that to solve an ordinary equation $Ax=b$, we can multiply both sides by the inverse of $A$, whereas for least squares, we multiply by $A^T$. I understand the conceptual intuition behind the former, but am hoping for one behind the latter. Are the two perhaps connected?

This equation is very relevant for those hoping to apply Linear Algebra to statistics so a satisfying answer would be really appreciated.

Thanks!

Follow up question, since we can analytically solve equation Ax = b by multiplying both sides by A inverse, what's the relationship between A transpose and A inverse (since when we solve for least squares, we multiply both sides by A transpose)? — AviPraMar, Jul 29 '23 at 21:45

score 1 · Answer 1 · 2023-07-29T15:43:45.867


Here is how I think about it:

If $y=Ax$ is the element in the column space of $A$ that is closest to $b$, then the displacement vector $b-y$ must be perpendicular to all of $\text{Col}(A).$ This isn't too difficult to visualize with a quick sketch.

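This means $$(b-y)\cdot Ac=0$$ for all vectors $c$, since every element of $\text{Col}(A)$ has the form $Ac$. Substituting $y=Ax$ and using the identity $u\cdot Av=A^Tu\cdot v$, this is equivalent to saying $$c\cdot (A^Tb-A^TAx)=0$$ Since this is true for all vectors $c$ (in particular for $c=A^Tb-A^TAx$ itself), we must have $A^Tb-A^TAx=0$.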
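To see the perpendicularity numerically, here is a minimal sketch in plain Python. The matrix `A` and vector `b` are made-up example data (fitting a line through three points), and the helper `dot` is just for illustration. It solves $A^TAx=A^Tb$ for a $3\times 2$ matrix and checks that the displacement $b-Ax$ is perpendicular to each column of $A$.

```python
# Hypothetical example data: A is 3x2, so Ax = b has no exact solution here.
A = [[1.0, 0.0],
     [1.0, 1.0],
     [1.0, 2.0]]
b = [6.0, 0.0, 0.0]

def dot(u, v):
    """Ordinary dot product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Columns of A (these span Col(A)).
cols = [list(col) for col in zip(*A)]

# Build the normal equations A^T A x = A^T b.
AtA = [[dot(ci, cj) for cj in cols] for ci in cols]
Atb = [dot(ci, b) for ci in cols]

# Solve the 2x2 system with Cramer's rule (fine for a toy example).
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
x = [(Atb[0] * AtA[1][1] - AtA[0][1] * Atb[1]) / det,
     (AtA[0][0] * Atb[1] - Atb[0] * AtA[1][0]) / det]

# y = Ax is the point of Col(A) closest to b; b - y is the displacement.
y = [dot(row, x) for row in A]
residual = [bi - yi for bi, yi in zip(b, y)]

# Dot product of the displacement with each column of A.
orth = [dot(col, residual) for col in cols]

print("x =", x)
print("b - Ax =", residual)
print("column dot products:", orth)
```

Both dot products come out as zero, which is exactly the statement $A^T(b-Ax)=0$ written one column at a time.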

edited Jul 29 '23 at 15:43

answered Jul 29 '23 at 15:36

How did you get from the first equation to the second?
Also, follow up question, since we can analytically solve equation Ax = b by multiplying both sides by A inverse, what's the relationship between A transpose and A inverse (since when we solve for least squares, we multiply both sides by A transpose)?
– AviPraMar Jul 29 '23 at 21:44

@AviPraMar The point is that $A$ is not invertible – Andrew Jul 29 '23 at 22:01
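
As a sketch of how the two cases connect (this is an added illustration, not part of the comment above): if $A$ happens to be square and invertible, then $A^T$ is invertible too, and the normal equations collapse back to the ordinary solution, $$A^TAx=A^Tb \;\Longrightarrow\; Ax=b \;\Longrightarrow\; x=A^{-1}b.$$ If $A$ is not invertible but its columns are linearly independent, then $A^TA$ is still invertible and $$x=(A^TA)^{-1}A^Tb,$$ where the matrix $(A^TA)^{-1}A^T$ acts as a left inverse of $A$, because $(A^TA)^{-1}A^TA=I$. So multiplying by $A^T$ is not a replacement for $A^{-1}$ on its own; it is the first step toward building the closest available substitute for an inverse.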
