
In this thread about the derivation of Lagrange multipliers, as well as in the Wikipedia article, the Lagrangian is defined as

$ F(\vec{x}, \vec{\lambda}) = f(\vec{x}) - \langle \vec{\lambda}, \vec{g}(\vec{x}) \rangle $.

I thought that $\lambda$ had to be a scalar in order to constrain $ \nabla g $ and $ \nabla f $ to be collinear.

How can I interpret the formulation in which $\lambda$ is a vector?

mihir00
  • Well, if you have a handful of constraints, you have a handful of scalars as Lagrange multipliers instead of just one -- or a vector from $\mathbb R^{\text{handful}}$ – Hagen von Eitzen Feb 18 '24 at 16:02

1 Answer


If you are trying to optimize a function $f$ subject to constraints $g_1=0,\ldots,g_k =0$, then the key point of the theory of Lagrange multipliers is that, at a (constrained) local extremum $x_0$, the gradient $\nabla(f)$ will be normal to the domain given by the constraints. When there are multiple constraints, this means that $\nabla(f)(x_0)$ will be a linear combination of the gradients $\nabla(g_1)(x_0),\ldots,\nabla(g_k)(x_0)$, that is, there will be $\vec{\lambda} \in \mathbb R^k$ such that

$$ \nabla(f)(x_0) = \sum_{i=1}^k \lambda_i \nabla(g_i)(x_0) $$

which translates into a "Lagrangian" $$ F(\vec{x},\vec{\lambda}) = f(\vec{x}) - \langle \vec{\lambda},\vec{g}(x)\rangle, $$ where $\vec{g} = (g_1,\ldots,g_k)$. Thus the Lagrange multipliers form a vector in $\mathbb R^k$ where $k$ is given by the number of constraining functions (so in particular, while it is a vector, it is not a vector in the same space as the vector $\vec{x}$).
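
A minimal worked sketch with $k=2$ (the particular $f$ and $g_i$ here are just an illustrative choice): take $f(x,y,z) = x+y+z$ with constraints $g_1(x,y,z) = x^2+y^2-1 = 0$ and $g_2(x,y,z) = z-1 = 0$. Then $$ F(\vec{x},\vec{\lambda}) = x+y+z - \lambda_1(x^2+y^2-1) - \lambda_2(z-1), $$ and setting all partial derivatives of $F$ to zero gives $$ 1 = 2\lambda_1 x, \quad 1 = 2\lambda_1 y, \quad 1 = \lambda_2, \quad x^2+y^2 = 1, \quad z = 1, $$ so $x = y = \pm\tfrac{1}{\sqrt{2}}$, $z = 1$, and $\vec{\lambda} = \left(\pm\tfrac{1}{\sqrt{2}}, 1\right)$: one multiplier per constraint, and at these points $\nabla(f) = (1,1,1)$ is exactly the combination $\lambda_1(2x,2y,0) + \lambda_2(0,0,1)$.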

krm2233
  • Thanks! Some clarifying logic I'd like to further confirm: (1) The gradient of a function is orthogonal to the level curves of that function. (2) The single-constraint condition $\nabla f(x) = \lambda \nabla g(x), g(x)=0$ enforces that $x$ meets the constraint and that the gradient of $f$ is orthogonal to that constraint. (3) Every constraint is effectively ANDed into a new constraint because we require $g_i(x) = 0$ for all $i$. (4) The sum $\sum_i \lambda_i \nabla g_i(x)$ is the right way to describe the directions orthogonal to the constraint set when $\vec{g}$ contains multiple $g_i$. – mihir00 Feb 19 '24 at 17:22
  • (1) this is correct (and straightforward to prove, once you sort out the definitions). For (2) you seem to have two conditions -- $g(x)=0$ and $\nabla f(x)-\lambda \nabla g(x)=0$. The first says $x$ satisfies the constraint and the second that the gradient of $f$ is orthogonal to the constraint. tbc... – krm2233 Feb 20 '24 at 01:02
  • For (3) the constraints are indeed "AND"ed in that we are taking the intersection of the sets that satisfy each constraint individually. The theory of Lagrange multipliers assumes that the $k$ constraints are "generic" in that, near $x_0$, the $k$ gradient vectors are linearly independent, so that by the Implicit Function Theorem, the locus which satisfies all of the constraint conditions is smooth. – krm2233 Feb 20 '24 at 02:39
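
For completeness, here is one possible way to check the stationarity system mechanically in SymPy, using the same illustrative $f$ and $g_i$ as in the sketch above (the choice of functions is an assumption for demonstration only):

```python
import sympy as sp

# variables and one multiplier per constraint
x, y, z, l1, l2 = sp.symbols('x y z lambda1 lambda2', real=True)

f = x + y + z                      # objective (illustrative choice)
g = [x**2 + y**2 - 1, z - 1]       # constraint functions g_1, g_2, each set to 0
lams = [l1, l2]

# Lagrangian F(x, lambda) = f(x) - <lambda, g(x)>
F = f - sum(li * gi for li, gi in zip(lams, g))

# grad F = 0 with respect to all variables and all multipliers
eqs = [sp.diff(F, v) for v in (x, y, z, l1, l2)]
print(sp.solve(eqs, [x, y, z, l1, l2], dict=True))
# expected candidates: x = y = +-1/sqrt(2), z = 1, lambda1 = +-1/sqrt(2), lambda2 = 1
```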