
Jensen's inequality

Let $\phi : \mathbb{R} \rightarrow \mathbb{R}$ be a convex function and $X$ be a random variable. Then $$\phi(E[X]) \leq E[\phi(X)],$$ if $E[X]$ and $E[\phi(X)]$ exist.

Exercise. Let $\phi : \mathbb{R} \rightarrow \mathbb{R}$ be a strictly convex function, that is, $$\phi(x) \geq ax + b \quad \forall x \in \mathbb{R}. \qquad (1)$$

Then, if $$\phi(E[X]) = E[\phi(X)],$$

it follows that $X = c$ almost everywhere, where $c$ is a constant.

Question 1

I have found that a function $f:X \rightarrow \mathbb{R}$ is called strictly convex iff $$f(tx_1 + (1-t)x_2) < tf(x_1) + (1-t)f(x_2) \quad \forall t \in (0,1), \ \forall x_1,x_2 \in X \text{ with } x_1 \neq x_2.$$

Why does the exercise mention $(1)$? Is it an equivalent definition?

Question 2

How should I approach this exercise?

Bernard
Fib

1 Answer


Equation $(1)$ requires a little more context. A function $\phi:\mathbb R\to\mathbb R$ is convex if and only if, for all $x_0\in\mathbb R$, $\phi$ has a subderivative at $x_0$, i.e. there exists a number $c\in\mathbb R$ such that

$$ \phi(x) - \phi(x_0) \ge c(x-x_0) \qquad\qquad\qquad(*)$$

for all $x\in\mathbb R$. Moreover, $\phi$ is strictly convex if and only if equation $(*)$ is a strict inequality for $x\neq x_0$.

Translating this into the language of equation $(1)$: given $x_0\in\mathbb R$, there exist $a,b\in\mathbb R$ such that

  • $\phi(x_0) = ax_0+b$, and
  • $\phi(x) > ax + b$ for all $x\neq x_0$.

(Specifically, one can take $a=c$ and $b=\phi(x_0)-cx_0$.) This can be taken as an equivalent definition of strict convexity.
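As a concrete illustration, take $\phi(x) = e^x$ (this choice is an example only, not part of the general argument). Assuming the standard bound $e^u \ge 1 + u$, with equality only at $u = 0$, one can check that $c = e^{x_0}$ is a subderivative at $x_0$:

$$ e^{x} = e^{x_0}\, e^{x - x_0} \ge e^{x_0}\bigl(1 + (x - x_0)\bigr) = ax + b, \qquad a = e^{x_0}, \quad b = e^{x_0}(1 - x_0). $$

Equality holds only when $x = x_0$, where $ax_0 + b = e^{x_0}$, so under that assumption $e^x$ satisfies both bullet points above and is strictly convex in this sense.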

To use this to solve your exercise, let $x_0=E[X]$ and choose $a,b$ as above for this $x_0$. Then we have

$$\phi(X) \ge aX + b, \qquad\qquad\qquad(\dagger)$$

and so taking expectation, we find

$$E[\phi(X)] \ge aE[X] + b = ax_0 + b = \phi(x_0) = \phi(E[X]).$$

This proves Jensen's inequality. Moreover, since $\phi$ is strictly convex, the inequality in $(\dagger)$ is strict whenever $X \neq x_0$, so either $X=x_0$ almost surely, or the inequality in $(\dagger)$ is strict with positive probability. In the latter case, the inequality above is also strict. Hence the equality $\phi(E[X]) = E[\phi(X)]$ is only possible when $X = x_0$ almost surely, completing the proof.
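One way to spell out the final step is sketched below; the auxiliary random variable $Y$ is introduced here only for illustration. Set $Y = \phi(X) - aX - b$. By $(\dagger)$ we have $Y \ge 0$, and by strict convexity $Y > 0$ exactly when $X \neq x_0$. Taking expectations,

$$ E[Y] = E[\phi(X)] - aE[X] - b = E[\phi(X)] - \phi(E[X]). $$

If $\phi(E[X]) = E[\phi(X)]$, then $E[Y] = 0$. A nonnegative random variable with zero expectation vanishes almost surely, so

$$ P(X \neq x_0) = P(Y > 0) = 0, $$

that is, $X = x_0 = E[X]$ almost surely.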

Jason
  • Thank you for the response. How do we prove that $X = x_0$ almost surely? My question actually says that we have a r.v. $X$ with $e^{E[X]} = E[e^{X}]$, and I want to show that $X=c$ a.s. So I figured I should prove that $\phi(x) = e^{x}$ is strictly convex, and that for strictly convex functions this equality forces $X=c$ a.s. But I am having trouble with both parts. – Fib Feb 06 '21 at 13:08
  • The second half of my post details how to prove $X$ is constant almost surely. – Jason Feb 07 '21 at 19:33
  • Yes I think I get it. Thank you! – Fib Feb 08 '21 at 10:58