8

Denote $f(x,y)=\sqrt{x\ln x+y\ln y-(x+y)\ln(\frac{x+y}2)}$.
Show that $f(x,y)+f(y,z)\ge f(x,z)$ for $x,y,z> 0$.

This question comes from a friend; it is a deep learning homework problem. The expression looks like the square root of a KL-type divergence, but that observation does not seem to help. Some other friends tried squaring both sides, but cross terms like $xy\ln x \ln y$ make that approach hard. Some friends, and the asker himself, tried differentiating with respect to $y$ and computing the minimum, but to no avail. Some students suggested that the expression can be written as an integral. I don't have any idea, so I am asking here.

JetfiRex
  • 3,315

3 Answers

9

Let $g:\mathbb R_{>0}^2\to\mathbb R_{>0}$ be any function satisfying $$g(s,t)g(u,v)\geq g(s,v)g(t,u)$$ if $u\geq s$ and $v\geq t$. For any region $R\subset\mathbb R_{>0}^2$, define $$\mu(R)=\int_R g(u,v)\,du\,dv.$$

Then, for any $0<x\leq y\leq z$, $$\mu\big([x,y]^2\big)\mu\big([y,z]^2\big)\geq \mu\big([x,y]\times [y,z]\big)^2,$$ since \begin{align*} \mu\big([x,y]^2\big)\mu\big([y,z]^2\big) &=\int_x^y\int_x^y\int_y^z\int_y^zg(s,t)g(u,v)\ du\ dv\ ds\ dt\\ &\geq \int_x^y\int_x^y\int_y^z\int_y^zg(s,v)g(t,u)\ du\ dv\ ds\ dt\\ &=\mu\big([x,y]\times [y,z]\big)^2, \end{align*} where we have used that $u\geq s$ and $v\geq t$ everywhere in the region of integration.

Define $h(x,y)=\sqrt{\mu\big([x,y]^2\big)}$, reading $[x,y]$ as $[y,x]$ when $x>y$. Then, for $x\leq y\leq z$, \begin{align*} \big(h(x,y)+h(y,z)\big)^2 &=\mu\big([x,y]^2\big)+\mu\big([y,z]^2\big)+2\sqrt{\mu\big([x,y]^2\big)\mu\big([y,z]^2\big)}\\ &\geq \mu\big([x,y]^2\big)+\mu\big([y,z]^2\big)+2\mu\big([x,y]\times [y,z]\big)\\ &=\mu\big([x,z]^2\big)=h(x,z)^2. \end{align*} The last equality splits $[x,z]^2$ into the two squares and the two off-diagonal rectangles $[x,y]\times[y,z]$ and $[y,z]\times[x,y]$, which have equal measure when $g$ is symmetric, as it is for the choice of $g$ below.

This means $h(x,y)+h(y,z)\geq h(x,z)$ for $x\leq y\leq z$. When $z\leq y\leq x$, the inequality is the same, since $h$ is symmetric. When $y$ is not between $x$ and $z$, one of the terms on the left side is at least the term on the right: since $g>0$, $\mu$ is monotone under inclusion, so for example $x\leq z\leq y$ gives $[x,z]^2\subset[x,y]^2$ and hence $h(x,y)\geq h(x,z)$. This means that $h(x,y)+h(y,z)\geq h(x,z)$ always.


Now, define $g(u,v)=\frac1{u+v}$. We have \begin{align*} g(s,t)g(u,v)-g(s,v)g(t,u) &=\frac{(s+v)(t+u)-(s+t)(u+v)}{(s+t)(u+v)(s+v)(t+u)}\\ &=\frac{(u-s)(v-t)}{(s+t)(u+v)(s+v)(t+u)}\geq 0 \end{align*} if $u\geq s$ and $v\geq t$, and \begin{align*} h(x,y) &=\sqrt{\int_x^y\int_x^y\frac1{u+v}du\ dv}\\ &=\sqrt{2x\ln(2x)+2y\ln(2y)-2(x+y)\ln(x+y)}=f(2x,2y). \end{align*} So, $f(x,y)+f(y,z)\geq f(x,z)$ for all $x,y,z>0$, as desired.
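As a numerical sanity check of the steps above (not a substitute for the proof), the following is a minimal Python sketch. The helper names `f_sq`, `mu_square`, `g_condition_holds` and `triangle_holds`, the midpoint rule, and the grid of test values are all illustrative assumptions, not part of the answer. It compares the double integral of $\frac1{u+v}$ over $[x,y]^2$ with the closed form $f(2x,2y)^2$, and spot-checks both the condition on $g$ and the triangle inequality on a small grid.

```python
import math
from itertools import product


def f_sq(x, y):
    """Square of f from the question: x ln x + y ln y - (x+y) ln((x+y)/2)."""
    return x * math.log(x) + y * math.log(y) - (x + y) * math.log((x + y) / 2)


def f(x, y):
    """f from the question; max() guards against tiny negative rounding errors."""
    return math.sqrt(max(0.0, f_sq(x, y)))


def mu_square(x, y, n=400):
    """Midpoint-rule approximation of the integral of 1/(u+v) over [x,y]^2."""
    step = (y - x) / n
    pts = [x + (i + 0.5) * step for i in range(n)]
    return sum(1.0 / (u + v) for u in pts for v in pts) * step * step


def g(u, v):
    return 1.0 / (u + v)


def g_condition_holds(s, t, u, v, tol=1e-12):
    """Check g(s,t) g(u,v) >= g(s,v) g(t,u); expected when u >= s and v >= t."""
    return g(s, t) * g(u, v) >= g(s, v) * g(t, u) - tol


def triangle_holds(x, y, z, tol=1e-12):
    """Check f(x,y) + f(y,z) >= f(x,z) up to a small floating-point tolerance."""
    return f(x, y) + f(y, z) >= f(x, z) - tol


if __name__ == "__main__":
    # Double integral over [1,3]^2 versus the closed form f(2,6)^2.
    print(mu_square(1.0, 3.0), f_sq(2.0, 6.0))
    grid = [0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
    print(all(triangle_holds(x, y, z) for x, y, z in product(grid, repeat=3)))
```

The grid is deliberately coarse; a finer grid or random sampling would give more evidence but the same kind of information.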

  • Thank you very much for your fabulous answer! By the way, can you tell me what the motivation is for writing $x\ln x+y\ln y-(x+y)\ln(x+y)$ as a double integral? – JetfiRex Apr 14 '22 at 02:13
  • 2
    @JetfiRex The hope was, I guess, to relate the desired inequality to the triangle inequality in some metric space. Given the square roots, the "obvious" choice is the space of $L^2$-functions on something, which would necessitate writing $x\ln x+y\ln y-(x+y)\ln(\frac{x+y}2)$ as an integral of the square of some function. I couldn't quite get this to work, but then I noticed that the property $$x\ln x+y\ln y-(x+y)\ln(\frac{x+y}2)=t(x)+t(y)-2t\left(\frac{x+y}2\right)$$ for $t(x)=x\ln x$ could be combined with the fact that $t(x)$ has a nice second derivative to give the relevant double... – Carl Schildkraut Apr 14 '22 at 04:03
  • 1
    ...integral. This doesn't exactly give the clean "reduces to triangle inequality of $L^2$ functions" that I wanted (at least not as far as I can tell), but it ended up being enough to prove the inequality using a bit more technical manipulation. – Carl Schildkraut Apr 14 '22 at 04:04
  • Unless I am mistaken, $g(u,v)=\phi(u+v)$ satisfies the requirement on $g$ if $\phi$ is positive and $\log \phi$ is concave. That would allow your proof to be applied to a wider class of functions. – Martin R Apr 14 '22 at 07:18
  • Can you please explain how "and when y is not between x and z one of the terms on the left side exceeds the term on the right." works? Also, does your approach in any way depend on the fact that the metric is defined on $(0, \infty)^2$? – ViktorStein Dec 21 '22 at 18:38
  • Also there is a typo: first you discuss the case $x \le y \le z$ and then the case $z \ge y \ge x$ (should be $z \le y \le x$ if I am not mistaken). – ViktorStein Dec 21 '22 at 18:54
6

Supplement to @Carl Schildkraut's very nice answer:

Remarks:

  1. Alternatively, we just prove that, for all $x, y > 0$, $$\int_0^\infty \left(\frac{\mathrm{e}^{-t x/2} - \mathrm{e}^{-ty/2}}{t}\right)^2\mathrm{d} t = x\ln x + y\ln y - (x + y)\ln \frac{x + y}{2}.$$ Then apply Minkowski's integral inequality.

  2. Alternatively, using the identity $\frac{1}{u + v} = \int_0^1 t^{u + v - 1} \mathrm{d} t$ (rather than $\frac{1}{u + v} = \int_0^\infty \mathrm{e}^{-t(u + v)}\mathrm{d} t$), we may turn to prove that $$\int_0^1 \left(\frac{t^x - t^y}{\sqrt{2t}\ln t}\right)^2 \mathrm{d} t = x\ln x + y\ln y - (x + y)\ln \frac{x + y}{2}.$$ Then apply Minkowski's integral inequality.
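For reference, here is a sketch of how the identity in Remark 2 can be derived from the double-integral representation of $f(x,y)^2$ used in this answer. It assumes the order of integration may be exchanged, which is justified because the integrand is nonnegative:

\begin{align*} f(x,y)^2 &= \int_{x/2}^{y/2}\int_{x/2}^{y/2}\int_0^1 t^{u+v-1}\,\mathrm{d}t\,\mathrm{d}u\,\mathrm{d}v = \int_0^1 \frac1t\left(\int_{x/2}^{y/2} t^{u}\,\mathrm{d}u\right)^2 \mathrm{d}t\\ &= \int_0^1 \frac{\left(t^{y/2}-t^{x/2}\right)^2}{t\ln^2 t}\,\mathrm{d}t = \int_0^1 \frac{\left(s^{x}-s^{y}\right)^2}{2s\ln^2 s}\,\mathrm{d}s, \end{align*}

where the last step substitutes $t=s^2$, so that $\mathrm{d}t=2s\,\mathrm{d}s$ and $\ln^2 t=4\ln^2 s$.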



We have $$\int_{x/2}^{y/2} \int_{x/2}^{y/2} \frac{1}{u + v}\mathrm{d}u\mathrm{d} v = x\ln x + y\ln y - (x + y)\ln\frac{x + y}{2}.$$

Also, we have, for all $u, v > 0$, $$\int_0^\infty \mathrm{e}^{-t(u + v)} \mathrm{d} t = \frac{1}{u + v}.$$

We have \begin{align*} f(x, y)^2 &= \int_{x/2}^{y/2} \int_{x/2}^{y/2} \frac{1}{u + v}\mathrm{d}u\mathrm{d} v \\ &= \int_{x/2}^{y/2} \int_{x/2}^{y/2} \int_0^\infty \mathrm{e}^{-t(u + v)} \mathrm{d} t \mathrm{d}u\mathrm{d} v\\ &= \int_0^\infty \left(\int_{x/2}^{y/2} \mathrm{e}^{-tu}\mathrm{d}u\right) \left(\int_{x/2}^{y/2} \mathrm{e}^{-tv}\mathrm{d}v\right) \mathrm{d} t \\ &= \int_0^\infty \left(\frac{\mathrm{e}^{-t x/2} - \mathrm{e}^{-ty/2}}{t}\right)^2\mathrm{d} t. \end{align*}

It suffices to prove that \begin{align*} &\sqrt{\int_0^\infty \left(\frac{\mathrm{e}^{-t x/2} - \mathrm{e}^{-ty/2}}{t}\right)^2\mathrm{d} t} + \sqrt{\int_0^\infty \left(\frac{\mathrm{e}^{-t y/2} - \mathrm{e}^{-tz/2}}{t}\right)^2\mathrm{d} t}\\ \ge\, & \sqrt{\int_0^\infty \left(\frac{\mathrm{e}^{-t x/2} - \mathrm{e}^{-tz/2}}{t}\right)^2\mathrm{d} t} \end{align*} which is true by Minkowski's integral inequality ($p \ge 1$): $$\left(\int_a^b |\varphi(t) + \psi(t)|^p \mathrm{d} t \right)^{1/p} \le \left(\int_a^b |\varphi(t)|^p \mathrm{d} t\right)^{1/p} + \left(\int_a^b |\psi(t)|^p \mathrm{d}t\right)^{1/p}.$$ Here we take $p = 2$, $\varphi(t) = \frac{\mathrm{e}^{-t x/2} - \mathrm{e}^{-ty/2}}{t}$ and $\psi(t) = \frac{\mathrm{e}^{-t y/2} - \mathrm{e}^{-tz/2}}{t}$, whose sum is $\frac{\mathrm{e}^{-t x/2} - \mathrm{e}^{-tz/2}}{t}$.

We are done.
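The integral identity at the heart of this argument can be spot-checked numerically. The following is a minimal Python sketch under stated assumptions: the helper names `f_sq` and `laplace_side`, the midpoint rule, and the truncation of $[0,\infty)$ at `60 / min(x, y)` are illustrative choices, not part of the answer. The truncation relies on the integrand being bounded by $\mathrm{e}^{-t\min(x,y)}/t^2$, so the discarded tail is negligible in double precision.

```python
import math


def f_sq(x, y):
    """Right-hand side: x ln x + y ln y - (x+y) ln((x+y)/2)."""
    return x * math.log(x) + y * math.log(y) - (x + y) * math.log((x + y) / 2)


def laplace_side(x, y, n=60000, cutoff=None):
    """Midpoint-rule approximation of the integral over [0, cutoff] of
    ((exp(-t x/2) - exp(-t y/2)) / t)^2 dt, standing in for [0, infinity)."""
    if cutoff is None:
        # Assumed truncation point; the tail beyond it is far below 1e-20.
        cutoff = 60.0 / min(x, y)
    step = cutoff / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * step  # midpoints avoid the removable singularity at t = 0
        d = (math.exp(-t * x / 2) - math.exp(-t * y / 2)) / t
        total += d * d
    return total * step


if __name__ == "__main__":
    for x, y in [(1.0, 3.0), (2.0, 5.0)]:
        print(x, y, laplace_side(x, y), f_sq(x, y))
```

Agreement to several decimal places for a few pairs $(x, y)$ is only evidence, not proof; the derivation above is what establishes the identity.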

River Li
  • 49,125
  • I think (see also my comment at Carl's answer) that $f(x, y) = \sqrt{h(x)+ h(y)-2h((x+y)/2)}$ satisfies the triangle inequality if $h$ is (strictly?) convex and $\log h''$ is concave. Here we have $h(x) = x \log x$. I wonder if your proof can be generalized in that direction. – Martin R Apr 14 '22 at 07:21
  • @MartinR Nice question. If $h = x\ln x$, then $h'' = 1/x$ and $\ln h'' = -\ln x$ is convex. Do you mean $h$ and $\ln h''$ are both convex? – River Li Apr 14 '22 at 08:55
  • 1
    Yes. The convexity of $h$ is needed to ensure that the argument of the square root is non-negative. And if $\log g(x, y) = \log h''(x+y)$ is convex then – unless I made some error – Karamata's inequality shows that Carl's condition $g(s,t)g(u,v)\geq g(s,v)g(t,u)$ is satisfied for $s \le u$ and $t \le v$. – Martin R Apr 14 '22 at 09:18
  • 1
    @MartinR It is a nice result for which a rigorous proof is required. – River Li Apr 14 '22 at 09:42
  • @MartinR I tried to prove your idea (in the different, but very similar context of a question I asked last week): https://math.stackexchange.com/a/4603302/545914 – ViktorStein Dec 21 '22 at 18:19
  • @MartinR I just wanted to point out that $g(x, y) = h''(x + y)$ is a bit misleading, since $$\int_{x}^{y} \int_{x}^{y} h''(u + v) \; \text{d}u \, \text{d}v = h(2 x) + h(2 y) - 2 h(x + y)$$ (and not $h(x) + h(y) - 2 h\left(\frac{x + y}{2}\right)$). – ViktorStein Dec 21 '22 at 18:36
0

Some hints for another proof:

For $a,b,c,x>0$ and some $p\in(0,1/100)$, define $$F(x)=g(x,abx/c)-g(x,xa)+g(abx/c,xa)$$

Then, for $x>0$, we have:

$$F''(x)\leq 0$$

Note that all the functions entering with a plus sign are concave, so it remains to compare the only positive term in the second derivative of $F$ with another term involving a squared derivative.

Here $g$ is defined by:

$$g^2(x,y)=x\ln x+y\ln y+p-(x+y)\ln\frac{x+y}{2}$$ To conclude, we can use the minimum of $F(x)$ and the three-chord lemma, since $F(0)=0$.



Barackouda
  • 3,879