Given a vector $z \in \mathbb{R}^n$ and $k < n$, finding the best $k$-sparse approximation to $z$ in the Euclidean distance means solving $$\min_{\{x \in \mathbb{R}^n : ||x||_0 \le k\}} ||z - x||_2.$$ This is easily done by choosing $x$ to agree with $z$ on the $k$ components of largest absolute value and to be zero in every other component (hard thresholding).
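For concreteness, the hard-thresholding rule can be sketched in a few lines of plain Python (the helper name `best_k_sparse` is my own, and ties among equal-magnitude entries are broken arbitrarily):

```python
def best_k_sparse(z, k):
    # Indices of the k entries of z with largest absolute value.
    top = sorted(range(len(z)), key=lambda i: abs(z[i]), reverse=True)[:k]
    keep = set(top)
    # Keep those entries, zero out every other component.
    return [z[i] if i in keep else 0.0 for i in range(len(z))]

# Example: keep the 2 largest-magnitude entries of [3, -5, 1, 2].
approx = best_k_sparse([3.0, -5.0, 1.0, 2.0], 2)  # -> [3.0, -5.0, 0.0, 0.0]
```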
I am now wondering what happens if we slightly modify the question to allow other metrics on $\mathbb{R}^n$. For example, what if we instead try to solve $$\min_{\{x \in \mathbb{R}^n : ||x||_0 \le k\}} (x-z)^TA(x-z)$$ for a symmetric positive definite matrix $A$? Note that for $A = I$ this reduces to the Euclidean case above, but a general $A$ couples the components, so the componentwise thresholding argument no longer applies. I suspect this is much harder, but are there good algorithms for it?