Does gradient descent converge to a minimum-norm solution in least-squares problems?
In this wonderful answer, the author gives a proof identifying the value to which gradient descent converges.
I'm trying to understand a simple detail.
The answer implies that if $A = U\Sigma V^T$ and $y = V^T x$, then $(I-A^TA)^k x = (I-\Sigma^T\Sigma)^k y$, and I am struggling more than I should to see why that holds.
Since $A^TA = V\Sigma^T U^T U \Sigma V^T = V\Sigma^T\Sigma V^T$ and $x = Vy$ (as $V$ is orthogonal), shouldn't it instead be $(I-V\Sigma^T\Sigma V^T)^k Vy$?
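For concreteness, here is a small numerical sketch of the comparison I have in mind (my own check, not from the linked answer; it assumes NumPy, a random tall $A$ rescaled so its singular values are at most $1$, and a random $x$):

```python
import numpy as np

# Sketch (not from the linked answer): compare (I - A^T A)^k x with
# (I - Sigma^T Sigma)^k y and with V (I - Sigma^T Sigma)^k y for random A, x.
rng = np.random.default_rng(0)
m, n = 6, 3
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, 2)           # rescale so all singular values are <= 1
x = rng.standard_normal(n)

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U @ Sigma @ V^T
V, Sigma = Vt.T, np.diag(s)
y = Vt @ x                          # y = V^T x

for k in (1, 2, 10):
    lhs = np.linalg.matrix_power(np.eye(n) - A.T @ A, k) @ x
    rhs = np.linalg.matrix_power(np.eye(n) - Sigma.T @ Sigma, k) @ y
    print(k, np.allclose(lhs, rhs), np.allclose(lhs, V @ rhs))
```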