Given a sample $S = \{(x_i, y_i)\}^m_{i=1}$, the ERM rule for linear regression w.r.t. the squared loss is $$\widehat{w} \in \underset{w\in \mathbb{R}^d}{\operatorname{argmin}}\,\lVert Xw-y\rVert^2$$ where $X$ is the design matrix of the linear regression whose rows are the samples (an $m\times d$ matrix) and $y$ is the vector of responses. Let $X = U\Sigma V^\top$ be the SVD of $X$; the pseudoinverse of $X$ is then $X^\dagger = V\Sigma^\dagger U^\top$, where $\Sigma^\dagger$ inverts the nonzero singular values.
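For concreteness, here is a small numerical sketch (my own illustration, not part of the problem, assuming NumPy) of how $X^\dagger$ is assembled from the SVD when $X$ is rank-deficient:

```python
# Illustrative only: build the pseudoinverse from the SVD of a rank-deficient X
# and compare it with NumPy's np.linalg.pinv.
import numpy as np

rng = np.random.default_rng(0)
m, d = 6, 4
X = rng.standard_normal((m, 3)) @ rng.standard_normal((3, d))  # rank 3 < d, so X^T X is singular

U, s, Vt = np.linalg.svd(X, full_matrices=False)          # X = U @ diag(s) @ Vt
s_dagger = np.where(s > 1e-10 * s.max(), 1.0 / s, 0.0)    # invert only the nonzero singular values
X_dagger = Vt.T @ np.diag(s_dagger) @ U.T                 # X^dagger = V Sigma^dagger U^T

print(np.allclose(X_dagger, np.linalg.pinv(X)))           # True
```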
The problem is the following: recall that if $X^\top X$ is not invertible, then there are many solutions. Show that $\widehat{w} = X^\dagger y$ is the solution whose $L_2$ norm is minimal; that is, show that for any other solution $\bar{w}$, $\lVert \widehat{w}\rVert \leq \lVert \bar{w}\rVert$.
I want to know whether solving this problem is equivalent to proving that every solution $\widehat{w}$ of the normal equations $X^\top X \widehat{w} = X^\top y$ satisfies $$\lVert X\widehat{w} -y\rVert \leq \lVert X \bar{w} - y\rVert \quad \forall \bar{w}\in \mathbb{R}^d.$$
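As a numerical illustration of the two statements side by side (again my own sketch, assuming NumPy, not part of the original problem): the pseudoinverse solution and a solution shifted along the null space of $X$ both satisfy the normal equations and attain the same residual, but their norms differ.

```python
# Illustrative only: w_hat = X^dagger y and a solution shifted along the null space of X
# both solve the normal equations and give the same residual; w_hat has the smaller norm.
import numpy as np

rng = np.random.default_rng(0)
m, d = 6, 4
X = rng.standard_normal((m, 3)) @ rng.standard_normal((3, d))  # rank-deficient design matrix
y = rng.standard_normal(m)

w_hat = np.linalg.pinv(X) @ y                  # the pseudoinverse (minimum-norm) solution
_, _, Vt = np.linalg.svd(X)
w_bar = w_hat + 2.0 * Vt[-1]                   # shift along a null-space direction of X

print(np.allclose(X.T @ X @ w_hat, X.T @ y))   # True: w_hat solves the normal equations
print(np.allclose(X.T @ X @ w_bar, X.T @ y))   # True: so does w_bar
print(np.allclose(X @ w_hat, X @ w_bar))       # True: identical residuals
print(np.linalg.norm(w_hat) < np.linalg.norm(w_bar))  # True: w_hat has the smaller L2 norm
```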