Trying to understand this supposedly 'trivial' bound from a paper:

If $\theta_N$ denotes the vector $\theta$ with all but the $N$ largest (in magnitude) coefficients set to $0$, then we have

$$ \| \theta - \theta_N \|_2 \leq C_{2,p} \cdot \|\theta\|_p \cdot (N+1)^{1/2 - 1/p} $$

for $N=0,1,2,\dots$ and a constant $C_{2,p}$ depending only on $p \in (0,2)$.

user67081
  • The first thing that comes to mind is Hölder's inequality https://en.wikipedia.org/wiki/H%C3%B6lder%27s_inequality – rtybase Jul 26 '16 at 21:40
  • Just double-checking, is it correct that $p \in (0,2)$? Because $\frac{1}{2} - \frac{1}{p}<0$ – rtybase Jul 27 '16 at 18:04
  • @rtybase: by the generalized Hölder inequality I would expect something like $$|\theta-\theta_N|_2\leq |\theta|_p\cdot(M-N)^{1/2-1/p}$$ where $M$ is the number of components of $\theta$. Where does your $N+1$ come from? – Jack D'Aurizio Jul 27 '16 at 19:29
  • $(M-N)^{\frac{1}{2}-\frac{1}{p}}\leq M^{\frac{1}{2}-\frac{1}{p}}=\frac{M^{\frac{1}{2}-\frac{1}{p}}}{(N+1)^{\frac{1}{2}-\frac{1}{p}}}\cdot (N+1)^{\frac{1}{2}-\frac{1}{p}}\leq M^{\frac{1}{2}-\frac{1}{p}} \cdot (N+1)^{\frac{1}{2}-\frac{1}{p}}$ and $M^{\frac{1}{2}-\frac{1}{p}}$ is just part of $C_{2,p}$. This is way too trivial ... there must be a catch for a 500 bounty – rtybase Jul 27 '16 at 20:39
  • Here is a related (imo) question http://math.stackexchange.com/questions/218046/relations-between-p-norms – rtybase Jul 27 '16 at 20:49
  • I am surprised to see no response in here! – msm Aug 01 '16 at 15:42
  • @msm I think the response is here, it's just too trivial, besides ... we need some answers from user67081 too ... – rtybase Aug 01 '16 at 17:25
  • Correct. As long as we don't care about the constant $C_{p,q}$ it may be trivial. In that case one gets a weaker inequality: $\left | \boldsymbol{\theta}-\boldsymbol{\theta}_s \right |_q\le \frac{1}{s^{1/p-1/q}}\left|\boldsymbol{\theta}\right|_p$. However, I don't think my proof of the stronger one (with the constant) is trivial. – msm Aug 02 '16 at 00:06

1 Answer


The proof is not trivial; I happened to see it a while ago. What follows is the proof for the general case with $0<p<q$, where the error is measured in the $\ell_q$-quasinorm of the difference vector when only the $s$ largest (in magnitude) elements of the vector $\boldsymbol{\theta}=(\theta_1,\theta_2,\cdots,\theta_M)$ are kept, normalized so that $\left\|\boldsymbol{\theta}\right\|_p\le1$.

Reorder $\boldsymbol{\theta}$ to get $\boldsymbol{\theta}^*=(\theta^*_1,\theta^*_2,\cdots,\theta^*_M)$ such that $|\theta^*_1|\ge|\theta^*_2|\ge\cdots\ge |\theta^*_M|$. Define $x_j\triangleq |\theta^*_j|^p$. We are going to prove that

$$\left \| \boldsymbol{\theta}-\boldsymbol{\theta}_s \right \|_q\le \frac{c_{p,q}}{s^{1/p-1/q}}\left\|\boldsymbol{\theta}\right\|_p$$

For simplicity, take the $q$-th power of both sides:

$$\left \| \boldsymbol{\theta}-\boldsymbol{\theta}_s \right \|^q_q\le \frac{c^q_{p,q}}{s^{q/p-1}}\left\|\boldsymbol{\theta}\right\|^q_p$$

The first $s$ terms of the LHS are zero. Dividing both sides by $\left\|\boldsymbol{\theta}\right\|^q_p$ and using $x_1+x_2+\cdots+x_M\le1$, it suffices to prove that

$$x^{q/p}_{s+1}+x^{q/p}_{s+2}+\cdots+x^{q/p}_{M}\le \frac{c^q_{p,q}}{s^{q/p-1}} \hspace{2cm}(1)$$

With $r=q/p>1$, we are looking for the maximum of the convex function

$$f(x_1,x_2,\cdots,x_M)=x^r_{s+1}+x^r_{s+2}+\cdots+x^r_{M}$$

over the convex polytope

$$\mathcal{C}=\{(x_1,\cdots,x_M)\in\mathbb{R}^M:\, x_1\ge \cdots \ge x_M \ge 0, \,x_1+x_2+\cdots+x_M\le1\}$$

A convex function on a polytope attains its maximum at one of the vertices, so we check the different vertices:

  • If $x_1=\cdots=x_M=0$ then $f=0$.

  • If $x_1=\cdots=x_k>x_{k+1}=\cdots=x_M=0$, for some $k$ such that $1\le k\le s$, then $f=0$.

  • If $x_1=\cdots=x_k>x_{k+1}=\cdots=x_M=0$, for some $k$ such that $s+1\le k\le M$, then $x_1=\cdots=x_k=1/k$ and $f(x_1,x_2,\cdots,x_M)=\frac{k-s}{k^r}$.
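The vertex enumeration above can be sanity-checked numerically. The following is only a sketch, not part of the proof: the helper names (`tail_power_sum`, `vertex`, `random_point_in_C`) and the values of `M`, `s`, `r` are illustrative assumptions. It evaluates $f$ at the vertices $(1/k,\dots,1/k,0,\dots,0)$ and compares the largest vertex value with $f$ at random points of $\mathcal{C}$.

```python
import numpy as np

def tail_power_sum(x, s, r):
    """f(x) = sum over j > s of x_j^r, for a nonincreasing vector x."""
    return float(np.sum(np.asarray(x[s:], dtype=float) ** r))

def vertex(k, M):
    """Vertex of C whose first k coordinates equal 1/k and the rest are 0."""
    v = np.zeros(M)
    v[:k] = 1.0 / k
    return v

def random_point_in_C(M, rng):
    """Random nonincreasing, nonnegative vector with coordinate sum <= 1."""
    x = rng.dirichlet(np.ones(M)) * rng.uniform()  # sum is at most 1
    return np.sort(x)[::-1]                        # sort in decreasing order

# Illustrative (assumed) parameters: M coordinates, s kept terms, r = q/p > 1.
M, s, r = 12, 3, 2.5
rng = np.random.default_rng(0)

vertex_values = [tail_power_sum(vertex(k, M), s, r) for k in range(1, M + 1)]
best_vertex = max(vertex_values)
best_random = max(
    tail_power_sum(random_point_in_C(M, rng), s, r) for _ in range(20000)
)

print("largest value of f over the vertices:", best_vertex)
print("largest value of f over random points:", best_random)
```

If the convexity argument is right, the random search should never exceed the best vertex value, and the vertex values for $k>s$ should agree with $(k-s)/k^r$.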

So we need to find the $k$ that maximizes $g(k)=\frac{k-s}{k^r}$.

Treating $k$ as a real variable, $g'(k)=k^{-r-1}\bigl(k-r(k-s)\bigr)=0$ gives $k^*=\frac{rs}{r-1}$. Therefore, we get $$f\le g(k^*)=\frac{1}{r}\left(1-\frac{1}{r}\right)^{r-1}\frac{1}{s^{r-1}}=\frac{c^q_{p,q}}{s^{q/p-1}}$$ with $$c_{p,q}=\left[\left(\frac{p}{q}\right)^{p/q}\left(1-\frac{p}{q}\right)^{1-p/q}\right]^{1/p}$$ which yields $(1)$. Note that the question as posed is neither correctly nor completely stated.
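As a rough numerical check of the final inequality (a sketch under assumptions, not a substitute for the proof), one can compare the $\ell_q$ error of keeping the $s$ largest-magnitude entries against $c_{p,q}\, s^{-(1/p-1/q)}\|\boldsymbol{\theta}\|_p$ on random vectors. The function names and the way the test vectors are drawn are hypothetical choices made for illustration.

```python
import numpy as np

def c_pq(p, q):
    """Constant c_{p,q} from the answer, for 0 < p < q."""
    t = p / q
    return (t ** t * (1 - t) ** (1 - t)) ** (1.0 / p)

def best_s_term_error(theta, s, q):
    """l_q (quasi)norm of theta after zeroing its s largest-magnitude entries."""
    mags = np.sort(np.abs(theta))[::-1]
    return float(np.sum(mags[s:] ** q) ** (1.0 / q))

def bound(theta, s, p, q):
    """Right-hand side: c_{p,q} * ||theta||_p / s^(1/p - 1/q)."""
    lp = float(np.sum(np.abs(theta) ** p) ** (1.0 / p))
    return c_pq(p, q) * lp / s ** (1.0 / p - 1.0 / q)

rng = np.random.default_rng(1)
p, q = 1.0, 2.0          # illustrative choice; c_pq(1, 2) equals 0.5
worst_ratio = 0.0
for _ in range(2000):
    M = int(rng.integers(2, 40))   # vector length (assumed range)
    s = int(rng.integers(1, M))    # number of kept terms, 1 <= s <= M - 1
    theta = rng.standard_normal(M) * rng.exponential(size=M)
    ratio = best_s_term_error(theta, s, q) / bound(theta, s, p, q)
    worst_ratio = max(worst_ratio, ratio)

print("c_{1,2} =", c_pq(p, q))
print("worst error/bound ratio observed:", worst_ratio)
```

A ratio above 1 would contradict the bound; random vectors typically stay well below it, since equality requires the near-flat extremal vectors from the vertex analysis.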

msm