Sorry in advance if this question sounds too broad or a little bit too obvious.
I know for sure that gradient descent, i.e., the update equation
$x_{k+1} = x_k - \epsilon_k \nabla f(x_k)$
converges to the unique minimizer of $f$ with domain $\text{dom}(f) = \mathbb{R}^n$ whenever $f$ is strictly or strongly convex (assuming the step sizes $\epsilon_k$ are chosen suitably).
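For concreteness, here is a minimal numerical sketch of the strongly convex case I mean (the matrix, step size, and iteration count are arbitrary choices of mine; I just kept the constant step small enough):

```python
import numpy as np

# Strongly convex example: f(x) = 0.5 * x^T A x - b^T x with A positive definite,
# so the unique minimizer is x* = A^{-1} b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])
x_star = np.linalg.solve(A, b)

def grad_f(x):
    return A @ x - b

x = np.zeros(2)
eps = 0.3          # constant step size, below 2 / lambda_max(A) ~ 0.55
for k in range(200):
    x = x - eps * grad_f(x)

print("final iterate :", x)
print("true minimizer:", x_star)
print("distance      :", np.linalg.norm(x - x_star))   # essentially 0
```

Here the last iterate itself converges, with no averaging and no decaying step size.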
However, I cannot recall whether it converges to a minimizer when $f$ is merely convex, and how that convergence is achieved.
What is bothering me is that
I've seen some seemingly conflicting results where, instead of the last iterate $x_K$, it is the averaged sequence $\hat x_K = \frac{1}{K} \sum_{k=1}^{K} x_k$ that converges.
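Here is a toy sketch of my own (not taken from any of those sources), using the subgradient method on the convex, nonsmooth $f(x) = |x|$ with a decreasing step size I picked arbitrarily, just to show the two sequences I mean:

```python
import numpy as np

# Toy convex, nonsmooth problem: f(x) = |x|, minimized at x* = 0.
# Track both the last iterate x_K and the running average (1/K) sum_k x_k.
def subgrad(x):
    return float(np.sign(x))   # any element of the subdifferential works at 0

x, avg = 5.0, 0.0
K = 10_000
for k in range(1, K + 1):
    x = x - (1.0 / np.sqrt(k)) * subgrad(x)   # decreasing step size 1/sqrt(k)
    avg += (x - avg) / k                      # running average of the iterates

print("last iterate    :", x)     # x_K
print("averaged iterate:", avg)   # (1/K) * sum_{k=1..K} x_k
```

In runs like this both sequences end up close to $0$, so I cannot tell from experiments alone which of them the theoretical guarantees actually apply to.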
I've also seen results that seem to conflict on the step size: some require it to decrease toward $0$ (e.g. like $1/k$ or $1/\sqrt{k}$), while others use a constant step size.
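Again as a toy experiment of my own, on the same $f(x) = |x|$ a constant step really does behave differently from a decreasing one:

```python
import numpy as np

# Same toy problem f(x) = |x|: with a constant step the subgradient iterate never
# settles, it keeps bouncing within about one step size of 0, while a step size
# decreasing like 1/sqrt(k) lets it get arbitrarily close over time.
def subgrad(x):
    return float(np.sign(x))

def worst_tail_distance(step_rule, K=10_000, x0=5.0):
    x = x0
    tail = []
    for k in range(1, K + 1):
        x = x - step_rule(k) * subgrad(x)
        if k > K - 100:
            tail.append(abs(x))
    return max(tail)   # worst distance to the minimizer over the last 100 iterates

print("constant step 0.3   :", worst_tail_distance(lambda k: 0.3))            # stays ~0.2
print("decreasing 1/sqrt(k):", worst_tail_distance(lambda k: 1/np.sqrt(k)))   # ~0.01
```

Whether the constant-step behaviour is still a problem once $f$ is smooth is exactly the kind of thing I am unsure about.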
There is also the issue of weak vs. strong convergence, and I'm not sure exactly what that distinction means here.
I do know some results, but they are for quadratic functions, not for convex functions in general.
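For reference, the kind of quadratic result I have in mind is the following: writing $f(x) = \tfrac{1}{2}(x - x^\ast)^\top A (x - x^\ast)$ with $A \succ 0$ and a constant step $\epsilon$, the update gives
$$x_{k+1} - x^\ast = (I - \epsilon A)(x_k - x^\ast) = (I - \epsilon A)^{k+1}(x_0 - x^\ast),$$
so the iterates converge linearly to $x^\ast$ exactly when $0 < \epsilon < 2/\lambda_{\max}(A)$. I don't know what the analogue of this statement looks like for a general convex $f$.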
Can someone chime in on what this basic result in optimization looks like?