
The problem is to compute $\max_x f(x)$ subject to the constraint $x \le C$.

I have two questions.

  1. When we have inequality constraints, how do we form the Lagrange function? Do we add $-\lambda$ or $+\lambda$ to our objective function, and do we multiply this $\lambda$ by $C - x$ or by $x - C$?

  2. I am reading a book which says that at the optimum, the Lagrange multiplier is a measure of the "tightness" of the constraint. What exactly is meant by this, and how can I convince myself that it is true?

  • Strictly speaking, Lagrange multipliers only apply to equality constraints. When dealing with inequalities, one might try to find extreme candidates on the "border" (where the inequality turns into an equality), but one should also look for local extrema inside the permissible region (always assuming differentiability). See for example here https://math.stackexchange.com/questions/49473/lagrange-multipliers-with-inequality-constraints. Perhaps you could post a photo of the book page? – leonbloy Jun 07 '18 at 15:49

1 Answer


With inequality constraints, one uses the Karush-Kuhn-Tucker (KKT) conditions, which generalize the Lagrange conditions. A set of constraint qualifications determines when the KKT conditions are valid at an optimum; in your case the constraint is linear, so the KKT conditions apply.

If the problem is to maximize $f(x)$ subject to $g_i(x) \le 0$ and $h_j(x) = 0$ then we form a multiplier system

$$ \nabla f(x) = \sum_i \mu_i \nabla g_i(x) + \sum_j \lambda_j \nabla h_j(x). $$

This is exactly the same as the Lagrange condition. The new feature is that for each inequality constraint, we demand that the multiplier be nonnegative: $\mu_i \ge 0$ for all $i$. And, of course, we also keep the original inequalities and equalities: $g_i(x) \le 0$ and $h_j(x) = 0$.
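To connect this with the first question: sign conventions vary from book to book, so the following is one common choice rather than the only correct one. For this maximization problem one can write the Lagrangian as

$$ \mathcal{L}(x, \mu, \lambda) = f(x) - \sum_i \mu_i g_i(x) - \sum_j \lambda_j h_j(x), \qquad \mu_i \ge 0, $$

so that setting $\nabla_x \mathcal{L} = 0$ recovers the multiplier system above. Writing your constraint as $g_1(x) = x - C \le 0$ gives $\mathcal{L}(x, \mu_1) = f(x) - \mu_1 (x - C)$: the term $-\mu_1(x - C)$ penalizes the objective when $x > C$. If you instead write the constraint as $C - x \ge 0$, the signs flip accordingly, which is exactly why different books disagree about $+\lambda$ versus $-\lambda$.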

The KKT conditions have a complementary slackness feature. This says that at an optimum $x^*$, either the inequality $g_i(x^*) \le 0$ is tight, meaning that $g_i(x^*) = 0$, or the multiplier vanishes: $\mu_i = 0$. This is often written compactly as $\mu_i g_i(x^*) = 0$.

Complementary slackness means that if we are not on the part of the feasible region where the inequality $g_i(x) \le 0$ prevents us from moving in one direction or another, then we can ignore the inequality altogether.

In your problem, we have $g_1(x) = x - C \le 0$. So the KKT conditions give us the following system:

\begin{align*} x - C &\le 0 \\ f'(x) &= \mu_1 \\ \mu_1 &\ge 0. \end{align*}

(Since $g_1'(x) = 1$.) Complementary slackness says that either $x = C$ or $\mu_1 = 0$. That is, either the optimum is at $x = C$ with $f'(C) = \mu_1 \ge 0$, or the optimum occurs at some $x < C$ with $f'(x) = 0$ (the usual first-order condition for an interior optimum).
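For concreteness, here is a toy example (the function is my own invented choice, not from your book): take $f(x) = -(x - 2)^2$, so $f'(x) = -2(x - 2)$ and the unconstrained maximum is at $x = 2$. Then

\begin{align*} C = 1: &\quad x^* = 1 \text{ (constraint tight)}, & \mu_1 &= f'(1) = 2 > 0, \\ C = 3: &\quad x^* = 2 < 3 \text{ (constraint slack)}, & \mu_1 &= 0 = f'(2). \end{align*}

In the first case the constraint binds and the multiplier is strictly positive; in the second the constraint is slack and complementary slackness forces $\mu_1 = 0$.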

I believe, and I could be wrong, that "measure of tightness" refers to complementary slackness. The actual size of the multiplier doesn't say too much about how close $g_i(x) \le 0$ is to being tight (i.e. to holding with equality).
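What the size of the multiplier does measure, under suitable regularity conditions, is sensitivity: $\mu_1$ equals the rate at which the optimal value improves as the constraint is relaxed (see the Boyd reference in the comments below). In the toy example above, with my invented $f$ and $C = 1$, the optimal value is $p^*(C) = -(C - 2)^2$ for $C < 2$, and

$$ \frac{d p^*}{d C}\bigg|_{C = 1} = -2(1 - 2) = 2 = \mu_1, $$

so relaxing the constraint from $C = 1$ improves the optimum at exactly rate $\mu_1$.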

  • Actually, the book solves several such subproblems. Then they construct a $\overline{\lambda}$ as some sort of "average" of all the multipliers $\lambda$ from the subproblems. So, in that case, a really large $\overline{\lambda}$ would mean that a lot of the subproblems had $x = C$; I think what they meant is that a large $\overline{\lambda}$ means the constraints are tight "on average"? – Baoeo Jun 07 '18 at 16:35
  • @Baoeo That sounds reasonable, considering all the inequality multipliers are nonnegative and equal to zero when the corresponding inequalities aren't tight. – Sera Gunn Jun 07 '18 at 16:39
  • The $i$-th component of $\lambda$ tells you how much you can improve the optimal value by infinitesimally relaxing the $i$-th constraint. See Boyd, Convex Optimization, 5.6 Perturbation and sensitivity analysis. – advancedchimp Jul 11 '18 at 13:22