
I was following a linear algebra course and came upon an example where a linear regression was done by solving $A^{\rm T}Ax = A^{\rm T}b$, because $Ax = b$ could not be solved exactly: $b$ is not in the column space of $A$. The equation $A^{\rm T}Ax = A^{\rm T}b$ is derived by projecting $b$ onto the column space of $A$, and its solution is taken as the approximation.

What I am confused about is where during this derivation this became a least-squares approximation. How would I formulate the problem in a similar way (as a linear algebra problem), but minimize the absolute distance instead? (Edited: I originally wrote "Euclidean distance", which was improper use of terminology.)

What I mean to ask is: when fitting a line to a set of points, if I wanted to minimize the absolute error rather than the squared error, how would I formulate the problem?
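(For reference, a minimal sketch of what such an absolute-error formulation can look like, assuming NumPy and SciPy are available; the function name and data below are purely illustrative. Minimizing the sum of absolute errors is no longer a single matrix equation, but it can be written as a linear program:)

```python
# Least-absolute-deviations fit: minimize sum_i |(Ax - b)_i|.
# Introduce slack variables t_i >= |(Ax - b)_i| and minimize sum_i t_i,
# which is a linear program in the combined variable z = [x, t].
import numpy as np
from scipy.optimize import linprog

def least_absolute_deviations(A, b):
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(m)])          # objective: sum of t_i
    # Constraints  Ax - b <= t  and  -(Ax - b) <= t, in the form A_ub z <= b_ub.
    A_ub = np.block([[A, -np.eye(m)], [-A, -np.eye(m)]])
    b_ub = np.concatenate([b, -b])
    bounds = [(None, None)] * n + [(0, None)] * m           # x free, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:n]

# Fit a line y = c0 + c1*x to points with one outlier.
xs = np.array([0., 1., 2., 3., 4.])
ys = np.array([0.1, 1.0, 2.1, 2.9, 10.0])                   # last point is an outlier
A = np.column_stack([np.ones_like(xs), xs])
print(least_absolute_deviations(A, ys))                     # far less pulled by the outlier
```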

2 Answers


Given an approximate solution $x$ we can define the squared error in the approximation as $E(x) = \| A x - b \|^2 = (Ax-b)^T(Ax-b)$.

Finding the $x$ that minimises this "error" is the least squares solution.

$E$ has gradient $\nabla E(x) = 2 A^T(Ax-b)$. Since $E$ is convex, it attains its minimum exactly where $\nabla E(x) = 0$, i.e. where $A^TAx - A^Tb = 0$, which is precisely the equation $A^TAx = A^Tb$ you started from.
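A quick numerical check of this (a minimal sketch, assuming NumPy is available; the matrix and vector here are just illustrative):

```python
# Solving the normal equations A^T A x = A^T b gives the same x as a
# dedicated least-squares routine, and the gradient condition holds there.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2))          # tall matrix: b is generally not in col(A)
b = rng.standard_normal(6)

x_normal = np.linalg.solve(A.T @ A, A.T @ b)       # solve A^T A x = A^T b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)    # minimise ||Ax - b||^2 directly

print(np.allclose(x_normal, x_lstsq))    # True
print(A.T @ (A @ x_normal - b))          # numerically zero: the gradient condition
```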


You can also think of it geometrically. $b$ is certainly not in the column space of $A$, but the left null space is the orthogonal complement of the column space, so we can split $b$ into two parts: $b = b_{\text{col}} + b_{\text{leftnull}}$. Since $A^T b_{\text{leftnull}} = 0$, solving $A^TAx = A^Tb$ amounts to solving $Ax = b_{\text{col}}$. And $b_{\text{col}}$ is the projection of $b$ onto the column space, which is the point of the column space closest to $b$.
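A minimal sketch of this decomposition, assuming NumPy is available (the matrix and vector are illustrative, and $A$ is taken to have full column rank so the projector below exists):

```python
# Split b into its column-space part and its left-null-space part,
# and check that A^T annihilates the latter.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 2))
b = rng.standard_normal(5)

P = A @ np.linalg.inv(A.T @ A) @ A.T     # projector onto the column space of A
b_col = P @ b                            # part of b in the column space
b_leftnull = b - b_col                   # part of b in the left null space

print(np.allclose(A.T @ b_leftnull, 0))  # True: A^T (b - b_col) = 0

# Ax = b_col is solvable, and its solution is the least-squares x.
x = np.linalg.solve(A.T @ A, A.T @ b)
print(np.allclose(A @ x, b_col))         # True
```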

Steven Gan