Let $C$ be a convex set in $\mathbb{R}^n$ and let $f:{\mathbb{R}}^n \rightarrow \mathbb{R}$ be twice continuously differentiable over $C$.

The Hessian of $f$ is positive semidefinite over $C$, and I want to show that $f$ is therefore a convex function.

I am currently trying to apply Taylor's Theorem to replace $f(x)$ with an expression that includes its Hessian.

Satana
  • Well, I suppose I haven't "tried" a proof. I'm trying to figure out what information I can gain about f or the domain of f simply based on the Hessian's values over a convex set. I'm pretty much stuck at the gates. – user178831 Sep 25 '14 at 21:38
  • Can you prove it's true in 1 dimension? Because if you restrict the function to a line it reduces to that case. – p.s. Sep 25 '14 at 22:15
  • @p.s. Thank you very much for your comment. I was able to prove it's true in 1 dimension, but I'm having difficulty understanding what you mean by "restrict[ing] the function to a line" and how I would communicate that. – user178831 Sep 25 '14 at 23:46

1 Answer

I'm assuming what you would like to show is that if $f$ has a positive semidefinite Hessian over $C$, then for all $\mathbf{x}, \mathbf{y} \in C$ and $t \in [0,1]$, we have:
$$ f(t \mathbf{x} + (1-t) \mathbf{y}) \le t f(\mathbf{x}) + (1-t) f(\mathbf{y}) $$

To reduce it to the one-dimensional case, fix $\mathbf{x}$ and $\mathbf{y}$ and look at the function restricted to the line segment connecting those points, which lies in $C$ because $C$ is convex. That is, define the one-dimensional function:
$$g(t) = f(t \mathbf{x} + (1-t) \mathbf{y})$$

By the chain rule, we can compute the derivatives of $g$:
$$g'(t) = ( \mathbf{x} - \mathbf{y})^T \mathbf{\nabla}f(t \mathbf{x} + (1-t) \mathbf{y})$$
$$g''(t) = ( \mathbf{x} - \mathbf{y})^T \mathbf{\nabla^2}f(t \mathbf{x} + (1-t) \mathbf{y}) ( \mathbf{x} - \mathbf{y})$$

Since the Hessian is positive semidefinite over $C$, we have $g''(t) \ge 0$ for all $t \in [0,1]$. Using this with Taylor's theorem (expanding $g$ about $t$ and evaluating at $0$ and at $1$, where the second-order remainder is nonnegative), we get:
$$ \begin{aligned} g(0) &\ge g(t) + g'(t)(-t)\\ g(1) &\ge g(t) + g'(t)(1-t) \end{aligned} $$

If $t \in [0,1]$, these can be combined to give:
$$ g(t) \le tg(1) + (1-t)g(0) $$
which is equivalent to the inequality we wanted to prove, since $g(1) = f(\mathbf{x})$ and $g(0) = f(\mathbf{y})$.
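
One way to make the Taylor step explicit, as a sketch: the intermediate points $c_0$ and $c_1$ are names introduced here and are not part of the original answer. Fix $t \in [0,1]$ and expand $g$ about $t$ with the Lagrange form of the remainder. There exist $c_0$ between $0$ and $t$ and $c_1$ between $t$ and $1$ such that
$$ \begin{aligned} g(0) &= g(t) + g'(t)(0-t) + \tfrac{1}{2} g''(c_0)(0-t)^2 \ge g(t) - t\,g'(t)\\ g(1) &= g(t) + g'(t)(1-t) + \tfrac{1}{2} g''(c_1)(1-t)^2 \ge g(t) + (1-t)\,g'(t) \end{aligned} $$
where both inequalities use $g'' \ge 0$ on $[0,1]$ (for $t = 0$ or $t = 1$ the corresponding line is simply an equality). Multiplying the first line by $1-t \ge 0$ and the second by $t \ge 0$ and adding, the $g'(t)$ terms cancel:
$$ (1-t)\,g(0) + t\,g(1) \ge g(t) + g'(t)\bigl[-t(1-t) + t(1-t)\bigr] = g(t) $$
Both bounds hold for every $t \in [0,1]$, not only for $t$ near an endpoint, because the remainder term is nonnegative wherever the intermediate point falls in $[0,1]$.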

p.s.
  • I'm really struggling to prove that if g''(t) >= 0 then that inequality holds. Can I (or "would you") use Taylor's theorem to make that leap? – user178831 Sep 26 '14 at 19:46
  • The reason I asked if you knew how to do the 1-d case is that I had actually forgotten how to do it. =) But it's explained in this question. I expanded my answer a little. – p.s. Sep 26 '14 at 22:51
  • How come when computing $g'(t)$, we get an $(x-y)^T$ but when we compute a second derivative $g''(t)$, we get an $(x-y)$ instead of another $(x-y)^T$? – Yannik Sep 21 '17 at 16:15
  • If it helps, $\nabla f$ is the gradient, a vector. $\nabla^2 f$ is the Hessian, a matrix. Whereas $g$ and its derivatives are scalars. You could also write $g'(t)=\nabla f(..)^T (x-y)$ since for column vectors $u^Tv=v^Tu$. – p.s. Sep 21 '17 at 18:06
  • The second- and third-to-last equations should have used a new variable $c \in (a, b)$ rather than reusing $t$, which has another meaning in the preceding context. It's really a bad habit to reuse letters (especially without saying so) in the same proof. – hzh Feb 10 '19 at 08:56
  • Obviously the proof is wrong, since the condition for both g(0) >= g(t)+g'(t)(-t) and g(1) >= g(t)+g'(t)(1-t) to be correct is not satisfied: one requires that t be close enough to 0 and the other requires that t be close enough to 1. – xjtein Mar 11 '21 at 12:50
  • Hello @p.s. !! Could you give also the structure of the proof for the other direction? We consider that $f$ is convex, that means that $f(tx+(1-t)y)\leq tf(x)+(1-t)f(y)$.

    We want to show that the Hessian matrix is positive semidefinite, i.e. that $\mathbf z^T H \mathbf z \ge 0$ for each $\mathbf z$.

    For that, do we assume that it is not true, i.e. that there is some $\mathbf z$ for which this inequality doesn't hold, and derive a contradiction?

    – Mary Star May 25 '21 at 10:52
  • You need some more conditions than just convexity to prove the reverse - I think existence and continuity of the Hessian will do it. So, if the Hessian has an eigenvector with a negative eigenvalue at a point, then you can use a similar argument to show that along the direction of that eigenvector the function must be strictly concave in a neighborhood, which would contradict convexity. Again you need to use continuity, so I'm not sure which form of Taylor's theorem would work though. – p.s. Jul 10 '21 at 21:22
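
For the converse direction raised in the last two comments, here is a possible sketch under added assumptions: $\mathbf{x}$ is an interior point of $C$ and $f$ is twice continuously differentiable near it. The symbols $h$ and $\varphi$ are introduced here for illustration and do not appear in the thread. For a fixed direction $\mathbf{z}$, set $\varphi(h) = f(\mathbf{x} + h\mathbf{z})$ for small $|h|$. Convexity of $f$ gives the midpoint inequality $\varphi(0) \le \tfrac{1}{2}\varphi(h) + \tfrac{1}{2}\varphi(-h)$, and the second-order Taylor expansion of $\varphi$ at $0$ identifies the limit of the second difference quotient:
$$ \mathbf{z}^T \nabla^2 f(\mathbf{x})\, \mathbf{z} = \varphi''(0) = \lim_{h \to 0} \frac{\varphi(h) - 2\varphi(0) + \varphi(-h)}{h^2} \ge 0 $$
This gives positive semidefiniteness at interior points directly, without arguing by contradiction. If $C$ has nonempty interior, continuity of the Hessian should extend the inequality to the rest of $C$. If $C$ has empty interior, the statement can fail: $f(x, y) = x^2 - y^2$ is convex along the $x$-axis, yet its Hessian is indefinite.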