
$f(x) = \prod_{i=1}^n x_i^{\alpha_i}$

$x_i \geq 0, \alpha_i \geq 0, \sum_{i=1}^n \alpha_i = 1$

Prove that $f(x)$ is concave.

I tried to compute the Hessian $\nabla^2 f(x)$ and show that it is negative semidefinite. Let $\alpha = [\alpha_1, \dots, \alpha_n]$, $x = [x_1, \dots, x_n]$.

$$ \nabla f(x) = f(x)\frac{\alpha}{x}, \text{ where } \frac{\alpha}{x} \text{ is a vector with element-wise division}$$

$$ \nabla^2 f(x) = f(x) \left( \left(\frac{\alpha}{x} \right) \left( \frac{\alpha}{x} \right)^T - \operatorname{diag}\left(\frac{\alpha}{x^2}\right) \right), \text{ where } \operatorname{diag}(v) \text{ is the diagonal matrix with the vector } v \text{ on its diagonal}$$
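The closed-form Hessian above can be sanity-checked against a finite-difference approximation. The following is only a minimal sketch: the function names, the step size `h`, and the test point are illustrative assumptions, and it requires $x_i > 0$ so that the divisions are defined.

```python
import math

def f(x, alpha):
    """Weighted geometric mean prod_i x_i ** alpha_i."""
    return math.prod(xi ** ai for xi, ai in zip(x, alpha))

def hessian_formula(x, alpha):
    """Closed form f(x) * (g g^T - diag(alpha / x^2)) with g = alpha / x."""
    n = len(x)
    fx = f(x, alpha)
    g = [ai / xi for xi, ai in zip(x, alpha)]
    return [[fx * (g[i] * g[j] - (alpha[i] / x[i] ** 2 if i == j else 0.0))
             for j in range(n)] for i in range(n)]

def hessian_numeric(x, alpha, h=1e-4):
    """Central finite-difference approximation of the Hessian (h is an assumed step)."""
    n = len(x)

    def shifted(i, si, j, sj):
        # Evaluate f at x + si*h*e_i + sj*h*e_j.
        y = list(x)
        y[i] += si * h
        y[j] += sj * h
        return f(y, alpha)

    return [[(shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
              - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4 * h * h)
             for j in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    """Largest entry-wise absolute difference between two square matrices."""
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a)))

if __name__ == "__main__":
    x = [0.5, 1.5, 2.0]
    alpha = [0.2, 0.3, 0.5]
    print(max_abs_diff(hessian_formula(x, alpha), hessian_numeric(x, alpha)))
```

A small printed difference is consistent with the formula at that point, but it is not a proof.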

How can I show that this Hessian is negative semidefinite? Or is there a simpler way to prove that $f(x)$ is concave?

  • A good way to show that something is positive or negative (semi) definite is https://en.wikipedia.org/wiki/Sylvester%27s_criterion. I imagine you could get what you need here with an inductive argument. – Aaron Aug 21 '22 at 02:16

3 Answers


Since $f(x) \ge 0$, it suffices to show that $y^T \left( \begin{bmatrix} {\alpha_1 \over x_1} \\ \vdots \\ {\alpha_n \over x_n} \end{bmatrix} \begin{bmatrix} {\alpha_1 \over x_1} & \cdots & {\alpha_n \over x_n} \end{bmatrix} - \operatorname{diag} \left( {\alpha_1 \over x_1^2}, \dots, {\alpha_n \over x_n^2} \right) \right) y \le 0$ for all $y$, that is, $\left(\sum_k {\alpha_k \over x_k}y_k\right)^2 \le \sum_k {\alpha_k \over x_k^2}y_k^2$.

Cauchy-Schwarz gives $\left(\sum_k {\alpha_k \over x_k}y_k\right)^2 = \left(\sum_k {\sqrt{\alpha_k} \over x_k}y_k \cdot \sqrt{\alpha_k}\right)^2 \le \left(\sum_k {\alpha_k \over x_k^2}y_k^2\right) \left(\sum_k \alpha_k\right) = \sum_k {\alpha_k \over x_k^2}y_k^2$, as desired.
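The scalar inequality above can be spot-checked on random inputs. This is a hedged sketch rather than part of the proof: the helper names, sampling ranges, and seed are assumptions chosen for illustration.

```python
import math
import random

def quadratic_form(x, alpha, y):
    """Return (sum_k alpha_k y_k / x_k)^2 - sum_k alpha_k y_k^2 / x_k^2."""
    lin = sum(a * yk / xk for a, xk, yk in zip(alpha, x, y))
    sq = sum(a * yk ** 2 / xk ** 2 for a, xk, yk in zip(alpha, x, y))
    return lin ** 2 - sq

def random_weights(n, rng):
    """Random non-negative weights normalised to sum to 1."""
    w = [rng.random() for _ in range(n)]
    total = sum(w)
    return [wi / total for wi in w]

def max_quadratic_form(trials=1000, n=4, seed=0):
    """Largest value of the quadratic form seen over random (x, alpha, y)."""
    rng = random.Random(seed)
    largest = -math.inf
    for _ in range(trials):
        alpha = random_weights(n, rng)
        x = [rng.uniform(0.1, 2.0) for _ in range(n)]   # x_k > 0
        y = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # arbitrary direction
        largest = max(largest, quadratic_form(x, alpha, y))
    return largest

if __name__ == "__main__":
    # Expected to be non-positive up to floating-point rounding.
    print(max_quadratic_form())
```

Taking $y = x$ makes every ratio $y_k / x_k$ equal to 1, which is the equality case of Cauchy-Schwarz, so the form evaluates to (numerically) zero there.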

copper.hat
  • 178,207

Instead of computing the Hessian, we can use the definition directly and show that $\lambda f(x) + (1 - \lambda)f(y) \le f(\lambda x + (1 - \lambda)y)$ whenever $x > 0, y > 0$ and $\lambda\in(0,1)$.

Recall the weighted AM-GM inequality: $\alpha_1 a_1 + \dots + \alpha_n a_n \ge a_1^{\alpha_1}\dots a_n^{\alpha_n}$. Applying this with $a_i = \frac{x_i}{\lambda x_i + (1 - \lambda)y_i}$, then with $a_i = \frac{y_i}{\lambda x_i + (1 - \lambda)y_i}$, we get: \begin{align} \frac{f(x)}{f(\lambda x + (1 - \lambda)y)} = \prod_{i=1}^n \left(\frac{x_i}{\lambda x_i + (1 - \lambda)y_i} \right)^{\alpha_i} &\le \sum_{i=1}^n \left(\frac{\alpha_i x_i}{\lambda x_i + (1 - \lambda)y_i} \right),\\ \frac{f(y)}{f(\lambda x + (1 - \lambda)y)} = \prod_{i=1}^n \left(\frac{y_i}{\lambda x_i + (1 - \lambda)y_i} \right)^{\alpha_i} &\le \sum_{i=1}^n \left(\frac{\alpha_i y_i}{\lambda x_i + (1 - \lambda)y_i} \right). \end{align} Multiplying the first inequality by $\lambda$ and the second by $(1 - \lambda)$, then summing the two, the right-hand sides add up to $\sum_{i=1}^n \alpha_i = 1$, so we obtain \begin{equation} \frac{\lambda f(x) + (1 - \lambda)f(y)}{f(\lambda x + (1 - \lambda)y)} \le 1. \end{equation}

Source: https://math.stackexchange.com/a/3839952/31498
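The defining inequality $\lambda f(x) + (1 - \lambda)f(y) \le f(\lambda x + (1 - \lambda)y)$ can also be tested numerically. The sketch below is an illustration only (it is not taken from the linked source); the function names, sampling ranges, and seed are assumptions.

```python
import math
import random

def f(x, alpha):
    """Weighted geometric mean prod_i x_i ** alpha_i."""
    return math.prod(xi ** ai for xi, ai in zip(x, alpha))

def concavity_gap(x, y, alpha, lam):
    """f(lam*x + (1-lam)*y) - (lam*f(x) + (1-lam)*f(y)); non-negative if f is concave."""
    mix = [lam * xi + (1 - lam) * yi for xi, yi in zip(x, y)]
    return f(mix, alpha) - (lam * f(x, alpha) + (1 - lam) * f(y, alpha))

def random_weights(n, rng):
    """Random non-negative weights normalised to sum to 1."""
    w = [rng.random() for _ in range(n)]
    total = sum(w)
    return [wi / total for wi in w]

def min_concavity_gap(trials=1000, n=4, seed=0):
    """Smallest concavity gap seen over random positive x, y and lam in [0, 1]."""
    rng = random.Random(seed)
    smallest = math.inf
    for _ in range(trials):
        alpha = random_weights(n, rng)
        x = [rng.uniform(0.1, 5.0) for _ in range(n)]
        y = [rng.uniform(0.1, 5.0) for _ in range(n)]
        lam = rng.uniform(0.0, 1.0)
        smallest = min(smallest, concavity_gap(x, y, alpha, lam))
    return smallest

if __name__ == "__main__":
    # Expected to be non-negative up to floating-point rounding.
    print(min_concavity_gap())
```

For instance, with $x = (1, 4)$, $y = (4, 1)$, $\alpha = (\tfrac12, \tfrac12)$ and $\lambda = \tfrac12$, both $f(x)$ and $f(y)$ equal 2 while $f$ at the midpoint equals 2.5, so the gap is 0.5.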

f10w
  • 4,709

For fixed $\alpha_i\ge0$ such that $\sum\limits_{i=1}^n\alpha_i=1$, and $z_i\gt0$, consider the quantity $$ \prod_{i=1}^n(1+z_i)^{\alpha_i}-\prod_{i=1}^nz_i^{\alpha_i}\tag1 $$ Without loss of generality, assume that $z_i\lt z_{i+1}$. If $z_i=z_{i+1}$, just add $\alpha_{i+1}$ to $\alpha_i$, then remove $z_{i+1}$ and $\alpha_{i+1}$.

Let $$ \lambda=\prod_{i=1}^n\left(\frac{z_i}{1+z_i}\right)^{\alpha_i}\tag2 $$ Since $\lambda$ is a geometric mean of the $\frac{z_i}{1+z_i}$, we have $\frac{z_1}{1+z_1}\le\lambda\le\frac{z_n}{1+z_n}$.

Taking the partial derivative of $(1)$ yields $$ \begin{align} \frac{\partial}{\partial z_k}\left(\prod_{i=1}^n(1+z_i)^{\alpha_i}-\prod_{i=1}^nz_i^{\alpha_i}\right) &=\prod_{i=1}^n(1+z_i)^{\alpha_i}\frac{\alpha_k}{1+z_k}-\prod_{i=1}^nz_i^{\alpha_i}\frac{\alpha_k}{z_k}\tag{3a}\\ &=\prod_{i=1}^n(1+z_i)^{\alpha_i}\left(\frac{\alpha_k}{1+z_k}-\lambda\frac{\alpha_k}{z_k}\right)\tag{3b}\\ &=\prod_{i=1}^n(1+z_i)^{\alpha_i}\left(\frac{z_k}{1+z_k}-\lambda\right)\frac{\alpha_k}{z_k}\tag{3c} \end{align} $$ Since $\frac{z_n}{1+z_n}\ge\lambda$, $(3)$ says that $$ \frac{\partial}{\partial z_n}\left(\prod_{i=1}^n(1+z_i)^{\alpha_i}-\prod_{i=1}^nz_i^{\alpha_i}\right)\ge0\tag4 $$ $(4)$ implies that $(1)$ decreases as we decrease $z_n$ to $z_{n-1}$. Then combining $z_n$ with $z_{n-1}$, as outlined after $(1)$, we get $$ \prod_{i=1}^n(1+z_i)^{\alpha_i}-\prod_{i=1}^nz_i^{\alpha_i}\ge \prod_{i=1}^{n-1}(1+z_i)^{\beta_i}-\prod_{i=1}^{n-1}z_i^{\beta_i}\tag5 $$ where $\beta_i=\alpha_i$ for $i\lt n-1$ and $\beta_{n-1}=\alpha_{n-1}+\alpha_n$.

Iterating $(5)$, we get $$ \begin{align} \prod_{i=1}^n(1+z_i)^{\alpha_i}-\prod_{i=1}^nz_i^{\alpha_i} &\ge(1+z_1)-z_1\tag{6a}\\ &=1\tag{6b} \end{align} $$


Set $z_i=\frac{y_i}{x_i}$ and multiply $(6)$ by $\frac12\prod\limits_{i=1}^nx_i^{\alpha_i}$ to get $$ \frac12\left(\prod_{i=1}^nx_i^{\alpha_i}+\prod_{i=1}^ny_i^{\alpha_i}\right)\le\prod_{i=1}^n\left(\frac{x_i+y_i}2\right)^{\alpha_i}\tag7 $$ which, since $f$ is continuous, says that $$ f(x)=\prod_{i=1}^nx_i^{\alpha_i}\tag8 $$ is concave.
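Inequality $(6)$, on which the midpoint inequality $(7)$ rests, can be spot-checked on random inputs. This is a minimal sketch under assumed sampling ranges and seed; the helper names are hypothetical and the check does not replace the argument above.

```python
import math
import random

def gap_6(z, alpha):
    """Left-hand side of (6): prod (1 + z_i)^alpha_i - prod z_i^alpha_i."""
    upper = math.prod((1 + zi) ** ai for zi, ai in zip(z, alpha))
    lower = math.prod(zi ** ai for zi, ai in zip(z, alpha))
    return upper - lower

def random_weights(n, rng):
    """Random non-negative weights normalised to sum to 1."""
    w = [rng.random() for _ in range(n)]
    total = sum(w)
    return [wi / total for wi in w]

def min_gap_6(trials=1000, n=4, seed=0):
    """Smallest value of the left-hand side of (6) over random z > 0."""
    rng = random.Random(seed)
    smallest = math.inf
    for _ in range(trials):
        alpha = random_weights(n, rng)
        z = [rng.uniform(0.01, 10.0) for _ in range(n)]
        smallest = min(smallest, gap_6(z, alpha))
    return smallest

if __name__ == "__main__":
    # Expected to be at least 1 up to floating-point rounding.
    print(min_gap_6())
```

When all $z_i$ are equal, the two products collapse to $1+z_1$ and $z_1$, so the difference is exactly 1, matching the equality case in $(6)$.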

robjohn
  • 353,833