Let $A$ be a real $m \times n$ matrix and let $b \in \mathbb R^m$. The Lasso optimization problem is $$ \text{minimize} \quad \frac12 \| Ax - b \|_2^2 + \lambda \| x \|_1, $$ where the optimization variable is $x \in \mathbb R^n$.

The $\ell_1$-norm regularization term encourages $x$ to be sparse, so Lasso is useful for finding a sparse vector $x$ that satisfies $Ax \approx b$. The parameter $\lambda > 0$ controls how sparse the solution to the Lasso problem is.

Question: Suppose I want the solution to the Lasso problem to have exactly $p$ nonzero entries. Are there any techniques, tricks, or heuristics for choosing a value of $\lambda$ such that the solution has exactly (or at least approximately) $p$ nonzero entries?

littleO
  • I can't think of a one-shot procedure but you can definitely do some sort of homotopy or $\lambda$-path-following method. I've implemented that before, it works pretty well. For instance, once you've solved the problem for a given fixed value of $\lambda$, you can very easily compute the next smaller value of $\lambda$ below which a new variable becomes nonzero. – Michael Grant Oct 07 '17 at 01:34
  • Or heck, skip the precise breakpoint calculation and combine bisection on $p(\lambda)$ with warm start. – Michael Grant Oct 07 '17 at 01:37
  • Given that Least Angle Regression provides a very efficient algorithm for computing the entire LASSO path, perhaps you could just use that and extract the information directly from it. See, e.g., "The Elements of Statistical Learning", Hastie, Tibshirani, and Friedman, Chapter 3 (sec 3.4.4). – Glen_b Oct 10 '17 at 00:08
  • Not sure if this is a silly question or not, but: Why not just take the $p$ largest-magnitude entries after solving the problem? Is there a result saying a pure $p$-sparse LASSO solution is "better" (in some sense) than a LASSO solution which was truncated to be $p$-sparse? – Zim Jul 27 '20 at 22:17
  • Also I think there are some classical results on this saying "If you know that $b$ arose from a $p$-sparse solution, then under some properties on $A$ (e.g. Restricted isometry and its variants), the LASSO solution is exactly the $p$-sparse solution." However, 1) I'm not too familiar if those properties constructively inform you how to choose $\lambda$, and 2) this question is more general: it does not suppose the existence of a $p$-sparse "original signal" which yielded $b$. – Zim Jul 27 '20 at 22:25
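
The bisection-with-warm-start idea from the comments can be sketched roughly as follows. This is only an illustrative sketch, not anyone's actual implementation: the helper names (`soft_threshold`, `lasso_cd`, `lambda_for_support_size`) are made up here, the inner solver is plain cyclic coordinate descent chosen for brevity, and the matrix is passed as a list of columns. It relies on the standard fact that $x = 0$ is optimal whenever $\lambda \ge \|A^T b\|_\infty$, which gives an upper bracket for $\lambda$. It also assumes the number of nonzeros is roughly nonincreasing in $\lambda$; that usually holds but is not guaranteed, so the routine returns the closest support size it found rather than promising exactly $p$.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(cols, b, lam, x0, sweeps=500, tol=1e-10):
    """Cyclic coordinate descent for 0.5*||Ax - b||_2^2 + lam*||x||_1.

    cols[j] is the j-th column of A (a list of m numbers);
    x0 is the starting point (the warm start).
    """
    m, n = len(b), len(cols)
    x = list(x0)
    # Residual r = b - A x for the starting point.
    r = list(b)
    for j in range(n):
        if x[j] != 0.0:
            for i in range(m):
                r[i] -= cols[j][i] * x[j]
    sq = [sum(a * a for a in c) for c in cols]  # squared column norms
    for _ in range(sweeps):
        max_change = 0.0
        for j in range(n):
            if sq[j] == 0.0:
                continue  # a zero column never enters the solution
            # rho = a_j^T (b - sum_{k != j} a_k x_k)
            rho = sum(cols[j][i] * r[i] for i in range(m)) + sq[j] * x[j]
            new = soft_threshold(rho, lam) / sq[j]
            delta = new - x[j]
            if delta != 0.0:
                for i in range(m):
                    r[i] -= cols[j][i] * delta
                x[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return x


def lambda_for_support_size(cols, b, p, max_iter=60):
    """Bisect on lam, warm-starting each solve from the previous solution.

    Returns (lam, x, k) for the support size k closest to p that was seen.
    """
    m, n = len(b), len(cols)
    # For lam >= ||A^T b||_inf the Lasso solution is x = 0.
    lam_hi = max(abs(sum(c[i] * b[i] for i in range(m))) for c in cols)
    lam_lo = 0.0
    x = [0.0] * n
    best = (lam_hi, x, 0)
    for _ in range(max_iter):
        lam = 0.5 * (lam_lo + lam_hi)
        x = lasso_cd(cols, b, lam, x)  # warm start from the last solve
        k = sum(1 for v in x if v != 0.0)
        if abs(k - p) < abs(best[2] - p):
            best = (lam, x, k)
        if k == p:
            break
        if k > p:
            lam_lo = lam  # too many nonzeros: increase lam
        else:
            lam_hi = lam  # too few nonzeros: decrease lam
    return best
```

As a sanity check, take $A$ to be the $4 \times 4$ identity and $b = (5, -3, 2, 0.5)$, so the Lasso solution is just soft-thresholding of $b$. Calling `lambda_for_support_size(cols, b, 2)` starts from the bracket $[0, 5]$, and the first midpoint $\lambda = 2.5$ already gives two nonzero entries. For real problems a faster inner solver or an exact path method such as LARS would likely be preferable; this is only meant to show the structure of the search.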
