I'm a bit confused about the difference between gradient descent and constrained convex optimization using Lagrange multipliers. I know that we use Lagrange multipliers when we have an optimization problem with one or more constraints.
From the answer to this question, it seems that we can also use gradient descent for constrained optimization.
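To make it concrete, here is a toy sketch of what I understand by "gradient descent for constrained optimization" (projected gradient descent on a made-up problem; the objective, the constraint, and the function names are just examples I picked, not from any particular source):

```python
import numpy as np

# Toy problem: minimize f(x, y) = (x - 2)^2 + (y - 2)^2
# subject to the equality constraint x + y = 1.

def grad_f(p):
    # gradient of the unconstrained objective
    return 2.0 * (p - np.array([2.0, 2.0]))

def project_onto_constraint(p):
    # Euclidean projection back onto the line x + y = 1
    excess = (p[0] + p[1] - 1.0) / 2.0
    return p - excess * np.array([1.0, 1.0])

# projected gradient descent: take a gradient step, then project back
p = np.array([0.0, 0.0])
lr = 0.1
for _ in range(200):
    p = project_onto_constraint(p - lr * grad_f(p))

print(p)  # ends up near (0.5, 0.5)
```

If I instead write down the Lagrangian for this toy problem and set its gradient to zero, I get the same point (0.5, 0.5), so both approaches seem to work here.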
So what is the difference between these two approaches? Mathematically I know how both approaches work, but I don't understand when and why one is preferred over the other. For example, to optimize the SVM (Support Vector Machine) objective, we use Lagrange multipliers instead of gradient descent.
I've found a similar question here, but the answer is not very clear. Any intuitive explanation/example would help. Thanks.