
The orthogonal projection of a point $y \in \mathbb{R}^n$ onto a closed convex set $D\subseteq \mathbb{R}^n$ is defined as follows:

$$ P_D(y)=\arg\min_{x \in D}||x-y||_2^2 $$

Now suppose one wants the orthogonal projection onto the sparse set $C_s$, which is the set of all vectors in $\mathbb{R}^n$ that have at most $s$ nonzero entries. This set is not convex, so the minimizer need not be unique. For example, for $s=2$ and $y=[1,2,1]^{\top}$, both $x_1=[1,2,0]^{\top}$ and $x_2=[0,2,1]^{\top}$ are minimizers. Hence, we can define:

$$ P_{C_s}(y)\in\arg\min_{x \in C_s}||x-y||_2^2 $$
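As a sanity check on this definition (not a proof), here is a small Python sketch that enumerates every support of size $s$ and compares the result with keeping the $s$ largest-magnitude entries. The helper names `sq_dist`, `project_top_s` and `brute_force_minimizers` are made up for illustration, indices are 0-based, and the brute force assumes that for a fixed support the closest vector simply copies $y$ on that support.

```python
from itertools import combinations


def sq_dist(x, y):
    """Squared Euclidean distance ||x - y||_2^2."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def project_top_s(y, s):
    """Keep the s largest-magnitude entries of y and zero the rest.

    Ties are broken by index order, so this returns one minimizer only.
    """
    keep = set(sorted(range(len(y)), key=lambda i: -abs(y[i]))[:s])
    return [y[i] if i in keep else 0 for i in range(len(y))]


def brute_force_minimizers(y, s):
    """Try every support of size s and return the best distance and all minimizers.

    Assumption: on a fixed support the best x copies y there, so one
    candidate per support is enough.
    """
    n = len(y)
    candidates = []
    for support in combinations(range(n), s):
        x = [y[i] if i in support else 0 for i in range(n)]
        candidates.append((sq_dist(x, y), x))
    best = min(d for d, _ in candidates)
    return best, [x for d, x in candidates if d == best]


y = [1, 2, 1]
best, minimizers = brute_force_minimizers(y, 2)
top_s = project_top_s(y, 2)
print(best, minimizers, top_s)
```

For the example above, both $x_1$ and $x_2$ show up as minimizers, and the top-$s$ rule returns one of them.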

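Question: Why is $P_{C_s}(y)$ obtained by keeping the entries of $y$ at the positions of the $s$ largest elements of $|y|$, where $|y|$ is the element-wise absolute value, and setting the remaining entries to zero?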

My try: One can rewrite the projection onto $C_s$ as a constraint on the entries of $x$. Having at most $s$ nonzeros is the same as having at least $n-s$ zeros. So let $J\subseteq\{1,\dots,n\}$ with $|J|=n-s$ be the index set of the entries forced to zero. Then

$$ \min_{x \in C_s}||x-y||_2^2=\min_{J\subseteq\{1,\dots,n\},\,|J|=n-s}\;\min_{x:\,x_j=0\,\,\forall j\in J}||x-y||_2^2 $$

For a fixed $J$, the Lagrangian of the inner problem is $L(x, \lambda)=||x-y||_2^2+\lambda^{\top}I_J^{\top}x$, where $I_J \in \mathbb{R}^{n\times|J|}$ is the matrix whose columns are the identity matrix columns indexed by $J$, and $\lambda \in \mathbb{R}^{|J|}$. The KKT conditions are:

$$ \begin{aligned} \nabla_x L(x, \lambda)&=2(x-y)+I_J\lambda=0\\ \nabla_{\lambda} L(x, \lambda)&=I_J^{\top}x=0 \end{aligned} $$

Solving these, using $I_J^{\top}I_J=I$, one gets $\lambda^*=2I_J^{\top}y$ and $x^*=y-I_JI_J^{\top}y$, which says: zero out the $|J|$ entries of $y$ indexed by $J$ to get $x^*$.
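To make the fixed-$J$ computation concrete, here is a minimal Python sketch that builds $I_J$, forms $\lambda^*=2I_J^{\top}y$ and $x^*=y-I_JI_J^{\top}y$, and checks both KKT equations numerically. The function names (`selector`, `matvec`, `transpose`, `kkt_point`) are illustrative only, and indices are 0-based, unlike the 1-based indexing in the text.

```python
def selector(n, J):
    """Columns of the n x n identity indexed by J, as an n x |J| nested list."""
    return [[1 if i == j else 0 for j in J] for i in range(n)]


def matvec(A, v):
    """Matrix-vector product for a nested-list matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def kkt_point(y, J):
    """Return (x*, lambda*) for min ||x - y||^2 subject to x_j = 0 for j in J."""
    I_J = selector(len(y), J)
    I_Jt_y = matvec(transpose(I_J), y)           # I_J^T y: entries of y on J
    lam = [2 * t for t in I_Jt_y]                # lambda* = 2 I_J^T y
    x = [yi - ci for yi, ci in zip(y, matvec(I_J, I_Jt_y))]  # x* = y - I_J I_J^T y
    # Stationarity: 2(x - y) + I_J lambda = 0
    grad = [2 * (xi - yi) + gi for xi, yi, gi in zip(x, y, matvec(I_J, lam))]
    assert all(g == 0 for g in grad)
    # Feasibility: I_J^T x = 0
    assert all(v == 0 for v in matvec(transpose(I_J), x))
    return x, lam


y = [1, 2, 1]
for J in ([0], [1], [2]):                        # n - s = 1 entry forced to zero
    x_star, lam_star = kkt_point(y, J)
    objective = sum((xi - yi) ** 2 for xi, yi in zip(x_star, y))
    print(J, x_star, lam_star, objective)
```

In this example the objective at $x^*$ equals the sum of $y_j^2$ over $j \in J$, so the choice of $J$ is all that differs between candidates.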

Question: I cannot see how the KKT conditions lead to keeping the $s$ entries that are largest in absolute value.

Note: Although it makes sense to keep the large ones to decrease the norm, please do not provide answers without proofs.

Saeed
  • The identity $\min\limits_{x\in C_s}|x-y|_2^2=\min\limits_J|x-y|_2^2$ seems dubious. Apart from this, do you accept a proof using elementary methods instead of KKT? – Ѕᴀᴀᴅ Jun 08 '21 at 01:33
  • @Saad: I made some changes; I hope it is not dubious anymore. I know how to prove it using elementary methods; I am curious how it is done using the KKT conditions. – Saeed Jun 08 '21 at 07:02
