The method of Lagrange multipliers is used to find extrema of functions under constraints. Suppose you want to maximise/minimise a function $f:\mathbb{R}^n\rightarrow \mathbb{R}$ subject to $k$ constraints given by:
$g_{i}=0 : 1\le i\le k$ where $g_{i}:\mathbb{R}^n\rightarrow \mathbb{R}$
then the extrema of $f$ will be at a point such that: $$\nabla f=\sum_{i=1}^{k}\lambda_{i} \nabla g_{i}$$ where the $\lambda_i$ are constants called Lagrange multipliers. This is the method. I will present a proof of the method using linear algebra and then explain the part I don't understand.
Outline proof of this method:
Let $S$ be the surface defined by the $k$ equations: $$g_{i}=0 : 1\le i\le k$$ The surface $S$ has dimension $n-k$.
Now let $p$ be a point in $S$ and let $T_{p}$ be the tangent plane to $S$ at $p$. Each tangent vector $t \in T_{p}$ is orthogonal to $\nabla g_{i}(p)$ for every $i$, since each $g_{i}$ is constant on $S$.
If we define a subspace $U=\text{span}(\nabla g_{1}(p),\nabla g_{2}(p),..,\nabla g_{k}(p))$ then every $t \in T_{p}$ will be orthogonal to $U$. As $\text{dim}(T_{p})=n-k$ and $\text{dim}(U)=k$ (taking the $\nabla g_{i}(p)$ to be linearly independent), and all elements of $T_{p}$ are orthogonal to $U$, it follows that $U$ is the orthogonal complement of $T_{p}$.
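As a sanity check of the dimension count in this step (not a proof), here is a minimal NumPy sketch. The concrete example is an assumption chosen purely for illustration: $n=3$, $k=2$, $g_1=x^2+y^2+z^2-1$, $g_2=z$, and $p=(1/\sqrt2,\,1/\sqrt2,\,0)$. It takes $T_p$ to be the set of vectors orthogonal to every $\nabla g_i(p)$, i.e. the null space of the matrix whose rows are the gradients.

```python
import numpy as np

# Illustrative example (an assumption, not part of the argument above):
# n = 3, k = 2, g1 = x^2 + y^2 + z^2 - 1, g2 = z, and p is a point on S.
p = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
n = p.size

# Rows of J are the gradients of g1 and g2 evaluated at p.
J = np.vstack([2.0 * p, [0.0, 0.0, 1.0]])
k = J.shape[0]

# dim(U) is the rank of J; it equals k only if the gradients are independent.
dim_U = np.linalg.matrix_rank(J)

# The rows of Vt beyond the rank span the null space of J,
# i.e. the vectors orthogonal to every gradient.
_, _, Vt = np.linalg.svd(J)
T_basis = Vt[dim_U:].T
dim_T = T_basis.shape[1]

print("dim(U) =", dim_U, " dim(T_p) =", dim_T, " n - k =", n - k)
```

In this example the two gradients are independent, so `dim_U` is 2 and `dim_T` is 1, which matches $n-k$. If the gradients were dependent at $p$, the rank would drop and the count would no longer come out as $n-k$.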
Suppose $p$ is a point where $f$ is maximised under the constraints $g_{i}$. Then: $$\nabla f(p)\cdot t=0$$ As this holds $\forall t \in T_{p}$, $\nabla f(p)$ is orthogonal to $T_{p}$, so it lies in the orthogonal complement of $T_{p}$ and thus lies in $U$.
As $\nabla f(p) \in U$ and $U=\text{span}(\nabla g_{1}(p),\nabla g_{2}(p),..,\nabla g_{k}(p))$, $\nabla f(p)$ is a linear combination of the $\nabla g_{i}(p)$: $$\therefore \nabla f(p)=\lambda_{1} \nabla g_{1}(p)+\lambda_{2}\nabla g_{2}(p)+\lambda_{3}\nabla g_{3}(p)+...+\lambda_{k}\nabla g_{k}(p)$$
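To see the conclusion on a concrete case, here is a small NumPy sketch. The example is an assumption made for illustration only: maximise $f=x+y+z$ subject to $g_1=x^2+y^2+z^2-1=0$ and $g_2=z=0$, so $S$ is the unit circle in the $xy$-plane and the maximiser is $p=(1/\sqrt2,\,1/\sqrt2,\,0)$. The code solves for the $\lambda_i$ by least squares and checks that the residual vanishes, i.e. that $\nabla f(p)$ really lies in $U$.

```python
import numpy as np

# Illustrative example (an assumption, not part of the proof above):
# f = x + y + z, g1 = x^2 + y^2 + z^2 - 1, g2 = z.
# On S (unit circle in the xy-plane) f is maximised at p below.
p = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)

grad_f = np.array([1.0, 1.0, 1.0])    # gradient of f (constant)
grad_g1 = 2.0 * p                     # gradient of g1 at p
grad_g2 = np.array([0.0, 0.0, 1.0])   # gradient of g2 (constant)

# A tangent vector to S at p; grad_f should be orthogonal to it.
t = np.array([-1.0, 1.0, 0.0]) / np.sqrt(2.0)
tangent_component = grad_f @ t

# Columns of G span U; solve G @ lam = grad_f in the least-squares sense.
G = np.column_stack([grad_g1, grad_g2])
lam, *_ = np.linalg.lstsq(G, grad_f, rcond=None)

# A zero residual means grad_f is an exact linear combination of the gradients.
residual = np.linalg.norm(G @ lam - grad_f)

print("lambda =", lam, " residual =", residual)
```

Here the multipliers work out to $\lambda_1=1/\sqrt2$ and $\lambda_2=1$. At a point of $S$ that is not an extremum, the same least-squares fit would leave a nonzero residual.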
Part I don't understand:
The part I don't understand about this proof is why $\text{dim}(S)=n-k$. How do we prove that this is always the case? I think it has to do with the fact that $S$ is defined by $k$ equations, so it is an intersection of $k$ level sets in $\mathbb{R}^{n}$, but I still need a rigorous proof.