Generalised Lagrange multipliers proof using linear algebra

Question

The method of Lagrange multipliers is used to find extrema of functions under constraints. Suppose you want to maximise/minimise a function $f:\mathbb{R}^n\rightarrow \mathbb{R}$ subject to $k$ constraints given by:

$g_{i}=0 : 1\le i\le k$ where $g_{i}:\mathbb{R}^n\rightarrow \mathbb{R}$

Then the extrema of $f$ will be at a point such that: $$\nabla f=\sum_{i=1}^{k}\lambda_{i} \nabla g_{i}$$ where the $\lambda_i$ are constants called Lagrange multipliers. This is the method. I will present a proof of the method using linear algebra and explain the part I don't understand.

Outline proof of this method:

Let $S$ be the surface defined by the $k$ equations: $$g_{i}=0 : 1\le i\le k$$ The surface $S$ has dimension $n-k$.

Now let $p$ be a point in $S$, and let $T_{p}$ be the tangent plane to $S$ at $p$. Clearly each tangent vector $t \in T_{p}$ is orthogonal to $\nabla g_{i}(p)$ for every $i$.

If we define a subspace $U$ where $U=\text{span}(\nabla g_{1}(p),\nabla g_{2}(p),..,\nabla g_{k}(p))$ then $t$ will be orthogonal to $U$. As $\text{dim}(T_{p})=n-k$ and $\text{dim}(U)=k$ (assuming the gradients $\nabla g_{i}(p)$ are linearly independent), and all elements of $T_{p}$ are orthogonal to $U$, it follows that $U$ is the orthogonal complement of $T_{p}$.

Suppose $p$ is a point where $f$ is maximised under the constraints $g_{i}=0$. Then: $$\nabla f(p)\cdot t=0$$ As this holds $\forall t \in T_{p}$, $\nabla f(p)$ is orthogonal to $T_{p}$, so it lies in the orthogonal complement of $T_{p}$, which is $U$.

As $\nabla f(p) \in U$ and $U=\text{span}(\nabla g_{1}(p),\nabla g_{2}(p),..,\nabla g_{k}(p))$, $\nabla f(p)$ is a linear combination of the $\nabla g_{i}(p)$: $$\therefore \nabla f(p)=\lambda_{1} \nabla g_{1}(p)+\lambda_{2}\nabla g_{2}(p)+\lambda_{3}\nabla g_{3}(p)+...+\lambda_{k}\nabla g_{k}(p)$$
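
The chain of orthogonality claims in the proof can be checked numerically on a small case. The sketch below is only an illustration under assumptions that are not part of the question: it takes $n=3$, $k=2$, the hypothetical functions $f=x+y+z$, $g_1=x^2+y^2+z^2-1$, $g_2=z$, and multipliers worked out by hand for this one example.

```python
import math

# Hypothetical example (not from the question): n = 3, k = 2.
# f(x, y, z) = x + y + z, g1 = x^2 + y^2 + z^2 - 1, g2 = z.
# S is the unit circle in the plane z = 0, so dim S = 3 - 2 = 1.

def grad_f(p):
    return (1.0, 1.0, 1.0)

def grad_g1(p):
    x, y, z = p
    return (2 * x, 2 * y, 2 * z)

def grad_g2(p):
    return (0.0, 0.0, 1.0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Constrained maximiser of f on S (f restricted to S is x + y on the unit circle).
p = (1 / math.sqrt(2), 1 / math.sqrt(2), 0.0)

# T_p is one-dimensional here, spanned by the tangent vector to the circle at p.
t = (-p[1], p[0], 0.0)

# Multipliers found by hand for this particular example.
lam1, lam2 = 1 / math.sqrt(2), 1.0
combo = tuple(lam1 * a + lam2 * b for a, b in zip(grad_g1(p), grad_g2(p)))

# Largest componentwise gap between grad f(p) and the linear combination.
residual = max(abs(a - b) for a, b in zip(grad_f(p), combo))

print("grad g1 . t =", dot(grad_g1(p), t))
print("grad g2 . t =", dot(grad_g2(p), t))
print("grad f  . t =", dot(grad_f(p), t))
print("residual    =", residual)
```

All three dot products and the residual should come out as zero up to floating-point rounding, which matches $\nabla f(p)\in U=T_p^{\perp}$ for this example.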

Part I don't understand:

The part I don't understand about this proof is why $\text{dim}(S)=n-k$. How do we prove this is always the case? I think it has to do with the fact that $S$ is defined by $k$ equations, so it is an intersection of $k$ level sets in $\mathbb{R}^{n}$, but I still need a rigorous proof.

First of all, one has to define the notion of a "submanifold of $\Bbb{R}^n$ of dimension ___". If you're unfamiliar with it then take a look at this answer of mine (in particular, definition 2). Translated into simpler terminology, the requirement is that $\{\nabla g_i(p)\}_{i=1}^k$ be linearly independent vectors (or equivalently that $Dg(p)$ have full rank). So, the answer to your question is that it is almost by definition. — peek-a-boo, May 21 '21 at 04:08
So, it's not a rigorous proof, but a rigorous definition that you need. The motivation for the definitions comes from linear algebra (as you allude to); in particular the rank-nullity theorem: if $T:V\to W$ is a linear transformation between finite-dimensional vector spaces, then $\dim \ker T = \dim V - \text{rank}(T)$. Recall that the kernel is simply the level set (subspace) $T^{-1}(\{0\})$, and the rank is the dimension of the image of $T$. So, it is this linear algebra result combined with the inverse/implicit function theorem (as mentioned in that post) which motivates our definitions. — peek-a-boo, May 21 '21 at 04:19
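
The rank-nullity count in the comment above can be tried out on a concrete Jacobian. The following is a minimal sketch, assuming two hypothetical constraint pairs in $\mathbb{R}^3$ that do not appear in the question; the helper `rank` is an illustrative name, implemented here by plain Gaussian elimination over the rationals.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) via Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        # Look for a nonzero entry in column c at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear column c in the rows below the pivot row.
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

n = 3

# Hypothetical example: g1 = x^2 + y^2 + z^2 - 1, g2 = z, at p = (1, 0, 0).
# The rows of Dg(p) are the gradients of g1 and g2 at p.
Dg_independent = [[2, 0, 0], [0, 0, 1]]

# Degenerate example: same g1, but g2 = x - 1, again at p = (1, 0, 0).
# The gradients (2, 0, 0) and (1, 0, 0) are parallel.
Dg_dependent = [[2, 0, 0], [1, 0, 0]]

for name, Dg in [("independent", Dg_independent), ("dependent", Dg_dependent)]:
    k = rank(Dg)
    print(name, "rank =", k, "dim ker =", n - k)
```

In the first case $Dg(p)$ has full rank $2$ and its kernel has dimension $3-2=1$, which is the dimension of the circle cut out by the constraints. In the second case the rank drops to $1$, and the constraint set $\{x=1\}\cap\{x^2+y^2+z^2=1\}$ is a single point, so the count $n-k$ no longer describes it. This is why linear independence of the gradients is built into the definition.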
