
Let $x_0 \in \mathbb{R}^n$ and $A \in \mathbb{R}^{n \times n}$ be given. Define the set $S$ as $$ S \triangleq \{x \in \mathbb{R}^n: A x \leq \mathbf{1}\}, $$ where $\mathbf{1}$ is the all-ones vector and the inequality holds componentwise.

I want to compute the projection of $x_0$ onto $S$, i.e. the closest point to $x_0$ in $S$. Is there an efficient way to do this? Of course, this can be formulated as a quadratic program, but if I want to use (projected) gradient descent I need the projection step as a subroutine. So far I have been trying to derive a closed-form expression, but that seems futile.

2 Answers


As mentioned in other answers and comments, the problem you need to solve is an inequality-constrained, strongly convex QP. After the change of variables $x \leftarrow x - x_0$ and setting $b \triangleq \mathbf{1} - A x_0$, it reads

$$ \begin{aligned} \mathrm{minimize}\ &\tfrac{1}{2}\|x\|^2_2 \\ \mathrm{subject\ to}\ & Ax \leq b. \end{aligned} $$

There's little chance you can use a direct method here (this possibility is discussed here), so you should resort to iterative methods (and if you need this operation inside an outer iterative routine, you will obtain a doubly-iterative algorithm).

There are many iterative methods that you can use to solve problems of this type. Many of them rely on a projection: note that this will be as simple as projecting onto the nonnegative orthant (which you can do in closed form, componentwise) and not onto the polyhedron itself (which is, of course, the ultimate goal).

The fast proximal gradient method is arguably the simplest option. In this case you would apply this type of algorithm to the dual of your original problem, which reads

$$ \begin{aligned} \mathrm{minimize}\ &\tfrac{1}{2}\|A^\top y\|^2_2 + \langle b,y \rangle \\ \mathrm{subject\ to}\ & y \geq 0 \end{aligned} $$

(check that all signs in the above problem are correct). In fact, (fast) dual PGM in this case is precisely the (fast) dual projected gradient method. This is the simplest method I can see: only matrix-vector products by $A$ and $A^\top$ are to be computed, apart from trivial projections onto the nonnegative orthant and vector sums.
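To make the dual projected gradient method concrete, here is a minimal NumPy sketch (the function name, the fixed iteration count, and the step size $1/L$ with $L = \|A\|_2^2$ are my own choices; a real implementation would add a stopping criterion, and the fast variant would add a momentum term):

```python
import numpy as np

def project_polyhedron(x0, A, n_iter=5000):
    """Project x0 onto {x : A x <= 1} via projected gradient on the dual.

    After shifting z = x - x0, the primal is min (1/2)||z||^2 s.t. A z <= b
    with b = 1 - A x0; the primal solution is recovered as x = x0 - A^T y.
    """
    b = 1.0 - A @ x0                 # shifted right-hand side
    L = np.linalg.norm(A, 2) ** 2    # Lipschitz constant of the dual gradient
    y = np.zeros(A.shape[0])
    for _ in range(n_iter):
        grad = A @ (A.T @ y) + b     # gradient of (1/2)||A^T y||^2 + <b, y>
        y = np.maximum(y - grad / L, 0.0)  # gradient step + orthant projection
    return x0 - A.T @ y              # primal recovery from KKT stationarity

# Example: for A = I the set is the box {x : x_i <= 1}
print(project_polyhedron(np.array([2.0, 0.5]), np.eye(2)))  # ≈ [1.0, 0.5]
```

Note that only matrix-vector products with $A$ and $A^\top$ appear in the loop, as claimed above.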

Alternatively you can apply the Douglas-Rachford splitting algorithm to the dual, which is the alternating direction method of multipliers (ADMM). In this case you would probably require solving linear systems along the iterations, either with a direct method or with some iterative procedure (like Krylov subspace methods), depending on the shape/structure of $A$.
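A minimal ADMM sketch along these lines, assuming the splitting $\min \tfrac12\|x - x_0\|^2 + \delta_{\{w \leq \mathbf{1}\}}(w)$ subject to $Ax = w$ (the penalty $\rho$ and iteration count are arbitrary choices here):

```python
import numpy as np

def project_admm(x0, A, rho=1.0, n_iter=500):
    """Project x0 onto {x : A x <= 1} via ADMM with auxiliary w = A x."""
    m, n = A.shape
    # x-update solves (I + rho A^T A) x = x0 + rho A^T (w - u); in practice
    # you would factor M once (e.g. Cholesky) instead of re-solving each pass
    M = np.eye(n) + rho * A.T @ A
    w = np.zeros(m)
    u = np.zeros(m)   # scaled dual variable
    for _ in range(n_iter):
        x = np.linalg.solve(M, x0 + rho * A.T @ (w - u))
        w = np.minimum(A @ x + u, 1.0)   # projection onto {w <= 1}
        u = u + A @ x - w                # scaled dual update
    return x
```

This is where the linear systems mentioned above show up: each x-update solves one system with $I + \rho A^\top A$, which could also be done with a Krylov method for large, structured $A$.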

Furthermore, active-set and interior-point methods are also options, depending on the size of the problems you're handling (see also the top answer to this similar question).

There's really no ultimate answer: which approach is the best fit always depends on the particular structure of the problem and on the accuracy you need in the solution. However, the methods suggested above are general-purpose and a reasonable place to start.

  • Sorry, I am a bit confused about why we can just solve the dual. Is it because we can solve the dual and then note that with strong duality, the optimum value for the dual problem is equal to the optimum of the primal problem, and then we can just solve for the 2 different $\textbf{x}$ whose squared norm gives that value? I think this makes sense, but I am not sure. – Minecraft dirt block Dec 29 '23 at 19:43

This is a quadratic programming problem: minimize $(x - x_0)^T (x - x_0)$ subject to $A x \le 1$. The quadratic form is positive definite.

EDIT: You might also note that any quadratic programming problem whose quadratic form is positive definite is equivalent to your problem by a change of variables (diagonalizing the quadratic form). So there's really nothing special here.

Efficient algorithms for solving quadratic programming problems with positive definite quadratic forms are well known: see the Wikipedia link. Software packages such as Maple and CPLEX implement such algorithms.
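As a quick off-the-shelf check (not Maple or CPLEX, but the same QP), SciPy's general-purpose SLSQP solver handles the linear inequality constraints directly; a sketch, assuming `scipy` is available (`project_qp` is a name of my choosing, and SLSQP is a general nonlinear solver rather than a specialized QP method):

```python
import numpy as np
from scipy.optimize import minimize

def project_qp(p, A):
    """Project p onto {x : A x <= 1} by solving the QP with SciPy's SLSQP."""
    res = minimize(
        fun=lambda x: 0.5 * np.sum((x - p) ** 2),   # (1/2)||x - p||^2
        x0=np.zeros_like(p),                        # starting guess
        jac=lambda x: x - p,
        constraints=[{"type": "ineq",               # "ineq" means fun(x) >= 0
                      "fun": lambda x: 1.0 - A @ x,
                      "jac": lambda x: -A}],
        method="SLSQP",
    )
    return res.x
```

For large or repeated projections a dedicated QP solver would be preferable, but this is a convenient way to sanity-check any hand-rolled iterative method.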

Robert Israel
  • My goal is to eliminate those constraints. How can I solve the above quadratic problem with the constraints, say using gradient descent? – pikachuchameleon Mar 30 '17 at 16:07
  • You can't. You don't get to pick which algorithm to use without taking the model into consideration. Robert has given you the best answer for the given model. – Michael Grant Mar 30 '17 at 21:29