
I found a nice proof of the linear regression formulas using physical springs on page 43 of Mark Levi's The Mathematical Mechanic.

Linear Regression (The Best Fit) via Springs

Imagine a collection of data points $(x_i, y_i)$ in the plane. We are asked to find the straight line $y = ax + b$ that best fits this set of data. What does “best” mean? To answer this, for each $x_i$ we think of $y = ax_i + b$ as the predicted value, while $y_i$ is the observed or measured value. The mismatch between these two values is $y_i - (ax_i + b)$, called the error (figure 3.4). “Best line” here means the line for which the sum of squares of the errors is minimal. The precise formulation of the problem of best fit, also called the problem of linear regression, follows.

Figure 3.4. Which line minimizes the sum of squared errors (3.1)?

Problem. Given $N$ data points $(x_i, y_i)$ in the plane, find the straight line $y = ax + b$ which fits these data best in the sense of minimizing the sum of squares of errors $$ S(a, b) = \sum_{i=1}^N (y_i - (ax_i + b))^2 \tag{3.1} $$

The unknowns in this problem are the slope $a$ and the intercept $b$ of the “best” straight line. The standard method to find the minimum of (3.1) is to set the partial derivatives with respect to $a$ and $b$ to zero. Here is a mechanical shortcut to the answer.

Solution. The unknown straight line is to be imagined as a rigid rod (figure 3.5). Let us pass the rod through frictionless sleeves constrained to vertical lines $x = x_i$ by frictionless guides. Each sleeve is connected to a nail (hammered into a data point) by a zero-length spring. Let us take Hooke’s constant to equal 2, so that the potential energy of each spring is simply the square of its length. The sum (3.1) has now acquired a physical meaning of potential energy!

Figure 3.5

If the sum of squares is minimal, then the potential energy of our mechanical system is minimal, and consequently the rod is in equilibrium. The only forces the rod “feels” are the normal reactions $F_i$ from the sleeves; the sum of these forces vanishes, as does the sum of their torques relative to the point of intercept $A$: $$ \sum_{i=1}^N F_i = 0, \qquad \sum_{i=1}^N d_i F_i = 0 \tag{3.2} $$ where $d_i$ is the distance along the rod from the intercept to the $i$th sleeve. Note that $d_i \cos{\alpha} = x_i$, where $\alpha$ is the angle between the rod and the $x$-axis. Now, to get an expression for $F_i$, consider the balance of forces upon the sleeve. The sleeve feels

(i) the reaction force $-F_i$ from the rod,

(ii) the pull of the spring, $y_i - (ax_i + b)$, and

(iii) the reaction from the guide in the $X$ direction.

Only two of these forces have nonzero $Y$ components, and they are in balance: $F_i \cos{\alpha} = y_i - (ax_i + b)$. Using these expressions for $d_i$ and $F_i$ in (3.2) we obtain

$$ \sum_{i=1}^N y_i - a\sum_{i=1}^N x_i - Nb = 0, \qquad \sum_{i=1}^N x_i y_i - a\sum_{i=1}^N x_i^2 - b\sum_{i=1}^N x_i = 0 \tag{3.3} $$
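
The substitution step can be spelled out as follows (a sketch that assumes the rod is not vertical, so $\cos\alpha \neq 0$). From the two relations above,

$$ F_i = \frac{y_i - (ax_i + b)}{\cos\alpha}, \qquad d_i = \frac{x_i}{\cos\alpha}, $$

so the force and torque balances in (3.2) become

$$ \frac{1}{\cos\alpha}\sum_{i=1}^N \bigl(y_i - ax_i - b\bigr) = 0, \qquad \frac{1}{\cos^2\alpha}\sum_{i=1}^N x_i\bigl(y_i - ax_i - b\bigr) = 0. $$

Multiplying through by the powers of $\cos\alpha$ and expanding the sums gives the two equations (3.3). Any common factor coming from Hooke's constant multiplies every $F_i$ equally and drops out of both equations.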

This is a system of two equations with two unknowns $a$ and $b$ which, when solved, produces the “best” slope and intercept.
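
To make the last step concrete, here is a minimal Python sketch that solves the two equations (3.3) by Cramer's rule. The function name `fit_line` and the sample data are my own illustration, not from the book.

```python
def fit_line(xs, ys):
    """Solve the system (3.3) for slope a and intercept b (hypothetical helper)."""
    n = len(xs)
    sx = sum(xs)                               # sum of x_i
    sy = sum(ys)                               # sum of y_i
    sxx = sum(x * x for x in xs)               # sum of x_i^2
    sxy = sum(x * y for x, y in zip(xs, ys))   # sum of x_i * y_i
    # (3.3) rearranged:  a*sx  + b*n  = sy
    #                    a*sxx + b*sx = sxy
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values coincide; the slope is undetermined")
    a = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b


# Points lying exactly on y = 2x + 1 should give that line back.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```

The determinant check guards the one case where (3.3) has no unique solution: all the $x_i$ equal, i.e. all sleeves on the same vertical guide.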

Note that the same result (3.3) can be obtained directly by setting partial derivatives of the error in (3.1) to zero.
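
For comparison, the calculus route takes two lines. Differentiating (3.1) with respect to $b$ and $a$ and setting the results to zero gives

$$ \frac{\partial S}{\partial b} = -2\sum_{i=1}^N \bigl(y_i - ax_i - b\bigr) = 0, \qquad \frac{\partial S}{\partial a} = -2\sum_{i=1}^N x_i\bigl(y_i - ax_i - b\bigr) = 0, $$

and dividing by $-2$ and expanding the sums reproduces (3.3). The first equation is the force balance and the second is the torque balance.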

The images (referenced in the above paragraph) and some footers can be seen by following the second, third and first link respectively.

My question

Is this a valid mathematical proof or just a nice new insight? Is there any obvious logical fallacy here?


Andrews
  • 4,293
Agile_Eagle
  • 3,016

1 Answer


The potential energy of a spring is proportional to the square of its length, so minimizing the sum of the spring potentials is the same problem as minimizing the sum of the squared vertical distances from the points to the line, i.e. "least squares" linear regression.

Hence it's valid, because it's the same problem.
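
The equivalence can also be checked numerically. The sketch below uses the relations quoted in the question, $F_i \cos\alpha = y_i - (ax_i + b)$ and $d_i \cos\alpha = x_i$, to compute the net force and net torque on the rod. The helper name `rod_balance` and the data are assumptions made up for illustration, and the least-squares line for this data was worked out by hand from (3.3).

```python
import math


def rod_balance(xs, ys, a, b):
    """Return (net force, net torque about the intercept) on the rod y = a*x + b."""
    cos_alpha = 1.0 / math.sqrt(1.0 + a * a)  # alpha is the rod's angle to the x-axis
    net_force = 0.0
    net_torque = 0.0
    for x, y in zip(xs, ys):
        f = (y - (a * x + b)) / cos_alpha  # from F_i cos(alpha) = y_i - (a x_i + b)
        d = x / cos_alpha                  # from d_i cos(alpha) = x_i
        net_force += f
        net_torque += d * f
    return net_force, net_torque


xs = [1.0, 2.0, 3.0]
ys = [2.0, 2.0, 4.0]
balanced = rod_balance(xs, ys, 1.0, 2.0 / 3.0)  # least-squares line for this data
tilted = rod_balance(xs, ys, 1.5, 0.0)          # an arbitrary other line
print(balanced, tilted)
```

Up to rounding, both components vanish for the least-squares line, while the other line leaves a nonzero net force on the rod.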

GPhys
  • 1,570
  • 2
    so you mean to say that the author has basically repackaged a standard math problem and made it seem as if he is proving maths with physics? – Agile_Eagle Jul 24 '17 at 05:20
  • Yes, that is what I am saying. – GPhys Jul 24 '17 at 05:21
  • But still, why should minimizing squares solve the problem? You need physics for that, right? – Agile_Eagle Jul 24 '17 at 05:22
  • 3
    @PrashantGokhale Why least squares is used for such problems comes down to assumptions about the distribution of the errors of the points in the vertical direction. If you are interested in this, it would make a good new question. It is certainly not related to springs. You can do many other types of regression that aren't least squares; least squares is not the final word on linear regression. For a general overview of the problem of fitting a line to data see e.g. https://arxiv.org/abs/1008.4686 – GPhys Jul 24 '17 at 05:24