
[Image: textbook excerpt of the proof of the log-sum inequality, equations (2.100)–(2.101)]

In the proof above I find that step (2.101) is too sketchy to understand. Why is $\alpha_i$ defined in terms of the $b_i$ as $\alpha_i=\frac{b_i}{\sum_{j=1}^n b_j}$? And why $t_i=\frac{a_i}{b_i}$?

Please point me in the right direction. Thanks very much.

  • Defining $\alpha_i$ is the trick. Notice that $\alpha_1 + \dots + \alpha_n = 1$. The sum $\sum_j b_j$ is just a positive number, so it can be removed in (2.101) to obtain the inequality you want. – ILoveMath Oct 29 '16 at 13:12
  • @ILoveMath hope that I'm right here: $\sum_i a_i \log\frac{a_i}{b_i} \geq \sum_i a_i \log \sum_j \frac{a_j}{\sum_k b_k} = \sum_i a_i \log \frac{\sum_j a_j}{\sum_k b_k} = \left(\sum_i a_i\right) \log \frac{\sum_i a_i}{\sum_i b_i}$. Thanks very much. – Lerner Zhang Nov 02 '16 at 00:09
  • Right! It's the same as what is proved on Wikipedia. I confused $\alpha$ with $a$. – Lerner Zhang Nov 02 '16 at 04:40
  • Just substitute $\alpha_i$ and $t_i$ into (2.100), as spelled out below. – Lerner Zhang Nov 02 '16 at 04:43
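
For reference, since the textbook excerpt is not reproduced above, (2.100) is presumably Jensen's inequality for the convex function $f(t) = t \log t$ (the standard starting point for this proof):
$$\sum_{i=1}^n \alpha_i t_i \log t_i \;\geq\; \left(\sum_{i=1}^n \alpha_i t_i\right)\log\left(\sum_{i=1}^n \alpha_i t_i\right), \qquad \alpha_i \geq 0,\quad \sum_{i=1}^n \alpha_i = 1,$$
into which the comments substitute $\alpha_i = \frac{b_i}{\sum_{j=1}^n b_j}$ and $t_i = \frac{a_i}{b_i}$.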

2 Answers


When questions like this arise, I find it helpful to try to re-write the inequality with equality, plus some remaining terms. That is, if you want to prove that $X \geq Y$, try to find an expression $X = Y + Z$ for some $Z$ and prove that $Z \geq 0$. That way, the steps in the proof become more apparent and self-motivating.

To see how this would work for the log-sum inequality, let $$ A = \sum_{i=1}^{n} a_{i}, \quad B = \sum_{i=1}^{n}b_{i}, \quad \tilde{a}_{i} = \frac{a_{i}}{A}, \quad \tilde{b}_{i} = \frac{b_{i}}{B}. $$ Then, \begin{align*} \sum_{i=1}^{n} a_{i}\log\frac{a_{i}}{b_{i}} &= A \cdot \left( \sum_{i=1}^{n} \tilde{a}_{i} \log \left[ \frac{\tilde{a}_{i}}{\tilde{b}_{i}} \cdot \frac{A}{B} \right] \right) \\ &= A \cdot \left( \sum_{i=1}^{n} \tilde{a}_{i} \log \frac{\tilde{a}_{i}}{\tilde{b}_{i}} + \sum_{i=1}^{n} \tilde{a}_{i} \log \frac{A}{B} \right) \\ &= A \cdot \left( KL(\tilde{a} || \tilde{b}) + \log \frac{A}{B} \right) \\ &= A \cdot KL(\tilde{a} || \tilde{b}) + A \log \frac{A}{B}, \end{align*} where $KL(\tilde{a} || \tilde{b})$ denotes the KL-divergence between the probability distributions $(\tilde{a}_{i})_{i=1,\dots,n}$ and $(\tilde{b}_{i})_{i=1,\dots,n}$.

By convexity of $x \mapsto x \log x$, the KL-divergence is always non-negative, and equal to zero if and only if $\tilde{a}_{i} = \tilde{b}_{i}$ for all $i = 1, \dots, n$. Hence $\sum_{i=1}^{n} a_{i}\log\frac{a_{i}}{b_{i}} \geq A \log \frac{A}{B} = \left(\sum_{i=1}^{n} a_{i}\right)\log\frac{\sum_{i=1}^{n} a_{i}}{\sum_{i=1}^{n} b_{i}}$, which is the log-sum inequality. This completes the proof.
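
If it helps to see the decomposition concretely, here is a quick numerical sanity check (a sketch in NumPy; the vectors `a` and `b` are arbitrary positive numbers, and the variable names simply mirror the notation above):

```python
import numpy as np

# Arbitrary positive vectors a_i, b_i (any positive values would do).
rng = np.random.default_rng(0)
a = rng.uniform(0.1, 2.0, size=5)
b = rng.uniform(0.1, 2.0, size=5)

A, B = a.sum(), b.sum()
a_tilde, b_tilde = a / A, b / B          # normalized probability vectors

lhs = np.sum(a * np.log(a / b))                    # sum_i a_i log(a_i / b_i)
kl = np.sum(a_tilde * np.log(a_tilde / b_tilde))   # KL(a_tilde || b_tilde)

# The decomposition from the answer: lhs = A * KL + A * log(A / B).
assert np.isclose(lhs, A * kl + A * np.log(A / B))

# Since KL >= 0, the log-sum inequality lhs >= A * log(A / B) follows.
assert kl >= -1e-12
print(lhs, A * np.log(A / B))
```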


Since the given choices of $\alpha_i,\,t_i$ imply $\alpha_i\ge0$, $\sum_i\alpha_i=1$ and$$\sum_i\alpha_it_i=\sum_i\frac{b_i}{\sum_jb_j}\frac{a_i}{b_i}=\sum_i\frac{a_i}{\sum_jb_j},$$the right-hand side of (2.100) becomes that of (2.101). Similarly, the left-hand side is$$\sum_i\frac{b_i}{\sum_jb_j}\frac{a_i}{b_i}\log\frac{a_i}{b_i}=\sum_i\frac{a_i}{\sum_jb_j}\log\frac{a_i}{b_i},$$as claimed.
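
Assuming (2.101) is the inequality obtained from these substitutions (the excerpt itself is not shown above), it reads
$$\sum_i \frac{a_i}{\sum_j b_j}\log\frac{a_i}{b_i} \;\geq\; \frac{\sum_i a_i}{\sum_j b_j}\,\log\frac{\sum_i a_i}{\sum_j b_j},$$
and multiplying both sides by the positive constant $\sum_j b_j$ gives the log-sum inequality $\sum_i a_i \log\frac{a_i}{b_i} \geq \left(\sum_i a_i\right)\log\frac{\sum_i a_i}{\sum_j b_j}$.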

J.G.