25

I've read in a few places that if we have a Lipschitz gradient

$$\|\nabla f(x) - \nabla f(y)\|\leq L\|x-y\|,\, \forall x,y, $$ we can equivalently say $\nabla^2f\preceq LI.$ But I'm having a hard time showing this. (Equivalently, I want to show $z^T \nabla^2f(x)z\leq z^TLIz=Lz^Tz,\forall\, x,z $.)

Leland Stirner
  • @user147263 I have a question regarding his answer. When you use the mean value theorem, you get something like: $$|\nabla f(x)-\nabla f(y)|\le |\nabla^2 f | |x-y|$$ From this, how can you judge whether $|\nabla^2 f | \le L$ or not? Equality in the Cauchy-Schwarz inequality may not be attained. – Li haonan Mar 17 '18 at 21:44

3 Answers

26

This is not true as stated. For example, the function $f(x)=x|x|$ on the real line has Lipschitz gradient, but is not twice differentiable. Also, the function $f(x)=-x^4$ satisfies $f''\le LI$ with $L=0$, but its gradient is not Lipschitz continuous.
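As a quick numerical sanity check of the two counterexamples (a sketch, not a proof), one can compare difference quotients of the two gradients on growing intervals. The helper name `max_gradient_quotient` and the grid sizes are made up for this illustration; only NumPy is assumed.

```python
import numpy as np

def max_gradient_quotient(grad, xs):
    """Largest |grad(x) - grad(y)| / |x - y| over distinct pairs of grid points."""
    g = grad(xs)
    dx = np.abs(xs[:, None] - xs[None, :])
    dg = np.abs(g[:, None] - g[None, :])
    mask = dx > 0                      # skip the diagonal (x == y)
    return float(np.max(dg[mask] / dx[mask]))

def grad_x_abs_x(x):
    return 2.0 * np.abs(x)             # derivative of f(x) = x|x|

def grad_neg_quartic(x):
    return -4.0 * x**3                 # derivative of f(x) = -x^4

quotients = {}
for radius in (1.0, 10.0, 100.0):
    xs = np.linspace(-radius, radius, 401)
    quotients[radius] = (
        max_gradient_quotient(grad_x_abs_x, xs),
        max_gradient_quotient(grad_neg_quartic, xs),
    )
    print(radius, quotients[radius])
```

On these grids the quotient for $x|x|$ stays at about $2$, while the quotient for $-x^4$ keeps growing with the interval radius, consistent with its gradient not being globally Lipschitz even though $f''\le 0$.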

The two properties are equivalent for functions that are convex and twice differentiable. For such functions, $\nabla^2 f$ is a positive semidefinite matrix, so its norm is its largest eigenvalue. Hence, $$\nabla^2 f \preceq LI \iff \|\nabla^2 f\|\le L \iff \|\nabla f(x)-\nabla f(y)\|\le L\|x-y\|$$ where the last equivalence is based on the mean value theorem.
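One way to spell out the mean value theorem step (a sketch, assuming $f$ is twice differentiable on the segment between $x$ and $y$; the auxiliary symbols $u$, $\varphi$, $\xi$ are introduced only for this illustration): suppose $\|\nabla^2 f\|\le L$ everywhere and $\nabla f(x)\ne\nabla f(y)$. Set $u=\dfrac{\nabla f(x)-\nabla f(y)}{\|\nabla f(x)-\nabla f(y)\|}$ and $\varphi(t)=u^T\nabla f\big(y+t(x-y)\big)$ for $t\in[0,1]$. The one-dimensional mean value theorem gives some $\xi\in(0,1)$ with

$$\|\nabla f(x)-\nabla f(y)\| = \varphi(1)-\varphi(0) = \varphi'(\xi) = u^T\nabla^2 f\big(y+\xi(x-y)\big)(x-y) \le \|u\|\,\big\|\nabla^2 f\big(y+\xi(x-y)\big)\big\|\,\|x-y\| \le L\|x-y\|.$$

Since everything happens on the line through $x$ and $y$, no dimension-dependent factor appears. The converse direction follows by taking limits of difference quotients of $\nabla f$.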

  • 1
    Are you sure that the Lipschitz constant is preserved in the equivalence, or is there some factor depending on the dimension $n$ ($f:\mathbb R^n\to \mathbb R$)? Thanks. – Svetoslav Mar 15 '16 at 19:09
  • 1
    It's preserved. The mean value theorem is applied on the line passing through $x,y$, so the problem becomes one-dimensional. –  Mar 15 '16 at 19:10
  • 1
    Can you expand on how the mean value theorem is applied here? – Sridhar Thiagarajan Aug 13 '18 at 08:46
  • @SridharThiagarajan The following link answers your question: https://math.stackexchange.com/questions/2294536/what-does-norm-on-gradient-bounded-of-f-imply-on-the-hessian-of-f?rq=1 – pikachuchameleon Jan 26 '19 at 15:56
  • 1
    Why use convexity? Even if $\nabla^2 f$ is indefinite, as long as it is symmetric (which is guaranteed if $f$ is twice Fréchet differentiable or simply $C^2$), the norm is its largest eigenvalue. – William Apr 25 '22 at 21:28
4

The implication from Lipschitz gradient to bounded Hessian holds for a twice differentiable function. From the definition of the Hessian of a twice differentiable function $f(\mathbf{x})$, we know that for any vector $\mathbf{v}\in\mathbb{R}^n$

\begin{align} \nabla^2f(\mathbf{x})\mathbf{v}&=\lim_{h\to0}\frac{\nabla f(\mathbf{x}+h\mathbf{v})-\nabla f(\mathbf{x})}{h}\\ \implies ||\nabla^2f(\mathbf{x})\mathbf{v}||&\leq\lim_{h\to0}\frac{||\nabla f(\mathbf{x}+h\mathbf{v})-\nabla f(\mathbf{x})||}{|h|}\\ \implies ||\nabla^2f(\mathbf{x})\mathbf{v}||&\leq\lim_{h\to0}L\frac{|h|||\mathbf{v}||}{|h|}\\ \implies ||\nabla^2f(\mathbf{x})\mathbf{v}||&\leq L||\mathbf{v}|| \end{align}

Since this is true for any $\mathbf{v}$, it is also true for the eigenvectors of the matrix $\nabla^2f(\mathbf{x})$. If $\mathbf{v}$ is such an eigenvector with eigenvalue $\lambda$, then \begin{align} ||\nabla^2f(\mathbf{x})\mathbf{v}||&=||\lambda\mathbf{v}||\leq L ||\mathbf{v}||\\ \implies |\lambda|\leq L \end{align}

So every eigenvalue of $\nabla^2f(\mathbf{x})$ is bounded in absolute value by $L$, which gives $\nabla^2f(\mathbf{x})\preceq LI$.
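The bound can be checked numerically on a concrete smooth function. The sketch below uses a test function chosen here for illustration, not taken from the answer: $f(\mathbf{x})=\sum_i \log\cosh\big((A\mathbf{x})_i\big)$ with an arbitrary fixed matrix $A$, whose gradient is Lipschitz with constant $L=\|A\|_2^2$. The names `grad_f`, `hessian_f`, and the sample count are assumptions of this example.

```python
import numpy as np

# Hypothetical test function: f(x) = sum_i log(cosh((A x)_i)).
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [-1.0, 1.0]])

def grad_f(x):
    # Gradient: A^T tanh(A x)
    return A.T @ np.tanh(A @ x)

def hessian_f(x):
    # Hessian: A^T diag(1 - tanh(A x)^2) A, a symmetric matrix
    return A.T @ np.diag(1.0 - np.tanh(A @ x) ** 2) @ A

L = np.linalg.norm(A, 2) ** 2          # squared spectral norm of A

rng = np.random.default_rng(0)
worst_quotient = 0.0                   # largest observed gradient difference quotient
worst_eig = 0.0                        # largest observed |eigenvalue| of the Hessian
for _ in range(1000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    q = np.linalg.norm(grad_f(x) - grad_f(y)) / np.linalg.norm(x - y)
    worst_quotient = max(worst_quotient, q)
    eigs = np.linalg.eigvalsh(hessian_f(x))
    worst_eig = max(worst_eig, float(np.max(np.abs(eigs))))

print("L =", L)
print("largest gradient quotient seen:", worst_quotient)
print("largest |eigenvalue| seen:", worst_eig)
```

Both observed quantities should stay below $L$ at every sampled point, which matches the argument above; random sampling only illustrates the inequality and does not prove it.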

-1

The Lipschitz gradient condition is equivalent to $\|\nabla^2 f(x)\|_2 \leq L$ for all $x$, where $\|\nabla^2 f(x)\|_2$ denotes the maximum singular value of the Hessian.