Given a sample $S = \{(x_i, y_i)\}^m_{i=1}$, the ERM rule for linear regression w.r.t. the squared loss is $$\widehat{w} \in \underset{w\in \mathbb{R}^d}{\operatorname{argmin}}\,\lVert Xw-y\rVert^2$$ where $X$ is the design matrix of the linear regression whose rows are the samples (an $m\times d$ matrix) and $y$ is the vector of responses. Let $X = U\Sigma V^\top$ be the SVD of $X$; the pseudoinverse of $X$ is then $X^\dagger = V\Sigma^\dagger U^\top$, where $\Sigma^\dagger$ inverts the nonzero singular values.
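For concreteness, here is a small numerical sketch (my own illustration, not part of the problem, assuming NumPy) of how $X^\dagger$ is assembled from the SVD when $X$ is rank-deficient:

```python
# Illustrative only: build the pseudoinverse from the SVD of a rank-deficient X
# and compare it with NumPy's np.linalg.pinv.
import numpy as np

rng = np.random.default_rng(0)
m, d = 6, 4
X = rng.standard_normal((m, 3)) @ rng.standard_normal((3, d))  # rank 3 < d, so X^T X is singular

U, s, Vt = np.linalg.svd(X, full_matrices=False)          # X = U @ diag(s) @ Vt
s_dagger = np.where(s > 1e-10 * s.max(), 1.0 / s, 0.0)    # invert only the nonzero singular values
X_dagger = Vt.T @ np.diag(s_dagger) @ U.T                 # X^dagger = V Sigma^dagger U^T

print(np.allclose(X_dagger, np.linalg.pinv(X)))           # True
```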
The problem is the following: recall that if $X^\top X$ is not invertible, then there are many solutions. Show that $\widehat{w} = X^\dagger y$ is the solution whose $L_2$ norm is minimal; that is, show that for any other solution $\bar{w}$, $\lVert \widehat{w}\rVert \leq \lVert \bar{w}\rVert$.
I want to know whether solving this problem is equivalent to proving that every solution $\widehat{w}$ of the normal equations $X^\top X \widehat{w} = X^\top y$ satisfies $$\lVert X\widehat{w} -y\rVert \leq \lVert X \bar{w} - y\rVert \quad \forall \bar{w}\in \mathbb{R}^d.$$
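As a numerical illustration of the two statements side by side (again my own sketch, assuming NumPy, not part of the original problem): the pseudoinverse solution and a solution shifted along the null space of $X$ both satisfy the normal equations and attain the same residual, but their norms differ.

```python
# Illustrative only: w_hat = X^dagger y and a solution shifted along the null space of X
# both solve the normal equations and give the same residual; w_hat has the smaller norm.
import numpy as np

rng = np.random.default_rng(0)
m, d = 6, 4
X = rng.standard_normal((m, 3)) @ rng.standard_normal((3, d))  # rank-deficient design matrix
y = rng.standard_normal(m)

w_hat = np.linalg.pinv(X) @ y                  # the pseudoinverse (minimum-norm) solution
_, _, Vt = np.linalg.svd(X)
w_bar = w_hat + 2.0 * Vt[-1]                   # shift along a null-space direction of X

print(np.allclose(X.T @ X @ w_hat, X.T @ y))   # True: w_hat solves the normal equations
print(np.allclose(X.T @ X @ w_bar, X.T @ y))   # True: so does w_bar
print(np.allclose(X @ w_hat, X @ w_bar))       # True: identical residuals
print(np.linalg.norm(w_hat) < np.linalg.norm(w_bar))  # True: w_hat has the smaller L2 norm
```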