I am working on the convergence analysis of gradient descent (GD) and stochastic gradient descent (SGD). I have come across the following papers, which I have found very useful:
1- Lee et al., proving that GD with random initialization converges to local minimizers almost surely.
2- Jin et al., accelerated gradient descent for escaping saddle points faster than GD.
3- Jin et al., perturbed gradient descent converging to second-order stationary points (a minimal sketch of this idea follows the list).
4- Allen-Zhu et al., convergence of stochastic gradient descent.
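For reference, here is a minimal sketch of the perturbed-GD idea in paper 3 above: take ordinary gradient steps, and when the gradient is small (so the iterate may be near a saddle point), inject a small random perturbation. The step size, thresholds, and the test function are illustrative choices of mine, not the tuned parameters or the exact algorithm of Jin et al.

```python
import numpy as np

def perturbed_gd(grad, x0, eta=0.1, g_thresh=1e-3, radius=1e-2,
                 t_noise=10, max_iter=1000):
    """Sketch of perturbed gradient descent: plain GD steps, plus a
    random ball perturbation whenever the gradient is small and no
    perturbation was added in the last t_noise steps. All parameter
    values are illustrative, not those analyzed by Jin et al."""
    x = np.asarray(x0, dtype=float)
    last_perturb = -t_noise
    for t in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= g_thresh and t - last_perturb >= t_noise:
            # Small gradient: possibly near a saddle point. Sample a
            # perturbation uniformly from a ball of the given radius.
            u = np.random.randn(*x.shape)
            u *= radius * np.random.rand() ** (1.0 / x.size) / np.linalg.norm(u)
            x = x + u
            last_perturb = t
        else:
            x = x - eta * g  # ordinary gradient step
    return x

# f(x, y) = (x^2 - 1)^2 + y^2 has a strict saddle at (0, 0) and minima
# at (+-1, 0). Plain GD started on the line x = 0 converges to the
# saddle; the perturbation lets the iterate escape toward a minimum.
grad_f = lambda v: np.array([4.0 * v[0] * (v[0] ** 2 - 1.0), 2.0 * v[1]])
print(perturbed_gd(grad_f, [0.0, 1.0]))  # ends near (+1, 0) or (-1, 0)
```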
Question: Could you discuss the recent improvements in this area since the papers above and cite the relevant newer papers? Also, please point me to more material on SGD, since I have only been able to find a few papers on the convergence of SGD.
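To make the SGD part of the question precise (the formalization is standard, but the notation here is my own): by SGD I mean the update

$$
x_{t+1} = x_t - \eta_t \, g_t, \qquad \mathbb{E}[g_t \mid x_t] = \nabla f(x_t), \qquad \mathbb{E}\big[\|g_t - \nabla f(x_t)\|^2 \mid x_t\big] \le \sigma^2,
$$

i.e. an unbiased stochastic gradient with bounded variance. For smooth nonconvex $f$, this is the setting in which one typically proves guarantees of the form $\min_{t \le T} \mathbb{E}\|\nabla f(x_t)\|^2 = O(1/\sqrt{T})$, and it is results of this kind (and sharper ones) that I am looking for.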