Problems with Proof of Jensen's Inequality (Durrett's "Probability Theory and Examples")

Question

I have two questions concerning the proof of Jensen's inequality in Durrett's "Probability Theory and Examples" [pp.23-24]. The proof is reproduced below, with my questions inserted at the relevant points.

[Note that below I replace Durrett's notation for a limit from above, $h \downarrow 0$, with $h \to 0^+$.]

Theorem 1.5.1. Jensen’s inequality.
Suppose $\phi$ is convex, that is, $\lambda \phi (x) + (1 - \lambda) \phi(y) \geq \phi(\lambda x + (1 - \lambda)y)$ for all $\lambda \in (0,1)$ and $x,y \in \mathbb{R}$. If $\mu$ is a probability measure, and $f$ and $\phi(f)$ are integrable then $$ \phi \bigg( \int f d \mu \bigg) \leq \int \phi(f)d \mu.$$

Proof. Let $c=\int f d \mu$ and let $l(x)=ax+b$ be a linear function that has $l(c)= \phi(c)$ and $\phi(x) \geq l(x)$. To see that such a function exists, recall that convexity implies $$\lim_{h \to 0^+} \frac{\phi(c)-\phi(c-h)}{h} \leq \lim_{h \to 0^+} \frac{\phi(c+h)-\phi(c)}{h}$$ (The limits exist since the sequences are monotone.)

1. Can somebody clarify what this inequality between the two limits (along with the reference to convexity) really means?

If we let $a$ be any number between the two limits and let $l(x) = a(x - c) + \phi(c)$, then $l$ has the desired properties. With the existence of $l$ established, the rest is easy. From the fact that if $g \leq f$ a.e., then $\int g d\mu \leq \int f d\mu$, we have $$ \int \phi(f) d\mu \geq \int (af + b) d\mu = a \int f d\mu + b = l\bigg(\int f d\mu \bigg)= \phi\bigg(\int f d\mu \bigg)$$ since $c = \int f d \mu$ and $l(c) = \phi(c)$.

2. Where does the first inequality $ \int \phi(f ) d\mu \geq \int (af + b) d\mu$ of the last formula come from?

Looking forward to any feedback.
Thank you for your time.

For the second question, it comes from the first line of the proof: "let $l(x)=ax+b$ be a linear function that has $l(c)= \phi(c)$ and $\phi(x) \geq l(x)$" and the fact that Lebesgue integration has the monotonicity property — Brenton, Nov 16 '15 at 22:13
Thanks a lot for the reply! Forgive the naive question, but how do we know that the inequality $\phi (x) \geq l(x)$ still holds for $\phi (f)$, namely $\phi \circ f$? — Kolmin, Nov 16 '15 at 22:40
For 1, notice that $(\phi(c+h)-\phi(c))/h \geq (\phi(c)-\phi(c-h))/h$ for all $h > 0$, because $\phi$ is convex. If you visualize the graph of a convex function $\phi$ this fact becomes very believable. — littleO, May 11 '17 at 05:48
For the first question, see this link, https://math.stackexchange.com/questions/2473951/how-can-we-prove-that-slopes-increase-in-a-convex-function-f-mathbbr-right — Fellow InstituteOfMathophile, Jan 12 '22 at 22:01

score 0 · Answer 1 · answered May 11 '17 at 05:26

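1. The intuitive meaning is that the slope of a convex function $\phi (x)$ is nondecreasing in $x$. More formally, the statement says that the limit of the difference quotient at a point $c$ from the left is at most the limit of the difference quotient from the right. For a fixed $h > 0$, the inequality between the two quotients follows from convexity applied to $c = \tfrac12 (c-h) + \tfrac12 (c+h)$, and letting $h \to 0^+$ preserves it.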

2. By the choice of $l$, we have $\phi(y) \ge l(y) = a y+b$ for every real $y$. Substituting $y = f(x)$ gives $$\phi (f(x)) \ge l(f(x)) = af(x)+b$$ for every $x$, and integration preserves this inequality.
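Both points can be checked numerically on a small example. The sketch below is only an illustration under my own assumptions, none of which come from Durrett: $\phi(x) = x^2$, a uniform probability measure on four points, and the slope $a = \phi'(c) = 2c$ (which lies between the two one-sided limits because this $\phi$ is differentiable). The helper names `phi`, `integrate` and `l` are hypothetical.

```python
def phi(x):
    # Illustrative convex function (an assumption, not from the proof).
    return x * x

def integrate(g, values, weights):
    # Integral of g(f) against a discrete probability measure:
    # `values` are the values taken by f, `weights` their probabilities.
    return sum(w * g(v) for v, w in zip(values, weights))

values = [-2.0, 0.0, 1.0, 5.0]
weights = [0.25, 0.25, 0.25, 0.25]

# c = integral of f d(mu)
c = integrate(lambda v: v, values, weights)

# Point 1: left difference quotient <= right difference quotient at c.
h = 0.5
left_slope = (phi(c) - phi(c - h)) / h
right_slope = (phi(c + h) - phi(c)) / h

# Supporting line l(x) = a*x + b with l(c) = phi(c); here a = phi'(c).
a = 2.0 * c
b = phi(c) - a * c

def l(x):
    return a * x + b

# Point 2: phi(f(x)) >= l(f(x)) pointwise, so the integrals are ordered.
pointwise_ok = all(phi(v) >= l(v) for v in values)
lhs = phi(c)                            # phi(integral of f)
mid = integrate(l, values, weights)     # integral of (a*f + b)
rhs = integrate(phi, values, weights)   # integral of phi(f)

print(left_slope, a, right_slope)
print(pointwise_ok, lhs, mid, rhs)
```

Here `mid` equals `lhs` because $\int (af+b)\, d\mu = a c + b = l(c) = \phi(c)$, which is exactly the chain of equalities in the proof; it relies on $\mu$ being a probability measure, so that $\int b \, d\mu = b$.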
