I have a couple of follow-up questions about the following answer by Royi.
The setup is the following:
$$ \begin{alignat*}{3} \text{minimize} & \quad & \frac{1}{2} \left\| A x - b \right\|_{2}^{2} \\ \text{subject to} & \quad & {x}^{T} x \leq 1 \end{alignat*} $$
The Lagrangian is given by:
$$ L \left( x, \lambda \right) = \frac{1}{2} \left\| A x - b \right\|_{2}^{2} + \frac{\lambda}{2} \left( {x}^{T} x - 1 \right) $$
The KKT Conditions are given by:
$$ \begin{align*} \nabla L \left( x, \lambda \right) = {A}^{T} \left( A x - b \right) + \lambda x & = 0 && \text{(1) Stationary Point} \\ \lambda \left( {x}^{T} x - 1 \right) & = 0 && \text{(2) Slackness} \\ {x}^{T} x & \leq 1 && \text{(3) Primal Feasibility} \\ \lambda & \geq 0 && \text{(4) Dual Feasibility} \end{align*} $$
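To make conditions (1) to (4) concrete, here is a minimal numerical sketch in Python with NumPy. It is not part of Royi's answer. The function name `solve_ball_constrained_ls`, the bisection strategy, and the test data are assumptions chosen for illustration, and the sketch assumes $A$ has full column rank so that ${A}^{T} A$ is invertible. It first tries $\lambda = 0$, and only if that solution violates ${x}^{T} x \leq 1$ does it search for the $\lambda > 0$ at which $\left\| x \right\|_{2} = 1$.

```python
import numpy as np


def solve_ball_constrained_ls(A, b, tol=1e-12, max_iter=200):
    """Sketch: minimize 0.5*||Ax - b||^2 subject to x^T x <= 1.

    Assumes A has full column rank. Returns (x, lam).
    """
    n = A.shape[1]
    AtA, Atb = A.T @ A, A.T @ b

    def x_of(lam):
        # Stationarity (1): (A^T A + lam * I) x = A^T b
        return np.linalg.solve(AtA + lam * np.eye(n), Atb)

    # Case lam = 0: the unconstrained minimizer is already feasible.
    x = x_of(0.0)
    if x @ x <= 1.0:
        return x, 0.0

    # Case lam > 0: the constraint is active, so find lam with ||x(lam)|| = 1.
    # ||x(lam)|| decreases as lam grows, so bisection applies.
    lo, hi = 0.0, 1.0
    while np.linalg.norm(x_of(hi)) > 1.0:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(x_of(mid)) > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    # Return the feasible end of the bracket.
    return x_of(hi), hi


if __name__ == "__main__":
    A = np.eye(3)
    for b in (np.array([3.0, 4.0, 0.0]), np.array([0.3, 0.4, 0.0])):
        x, lam = solve_ball_constrained_ls(A, b)
        print("lambda:", lam)
        print("stationarity residual:", np.linalg.norm(A.T @ (A @ x - b) + lam * x))
        print("slackness:", lam * (x @ x - 1.0))
        print("x^T x:", x @ x)
```

With $A = I$ and $b = (3, 4, 0)$ we have $x(\lambda) = b / (1 + \lambda)$, so the constraint is active with $\lambda = 4$ and $x = (0.6, 0.8, 0)$. With $b = (0.3, 0.4, 0)$ the unconstrained minimizer is already inside the ball, so $\lambda = 0$. In both cases the product $\lambda \left( {x}^{T} x - 1 \right)$ vanishes, in the first because ${x}^{T} x = 1$ and in the second because $\lambda = 0$.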
1.) Why is the "Slackness" condition fulfilled, i.e. $\lambda\left( x^{T}x - 1 \right) = 0$? After all, we only know that $\left\vert\left\vert x\right\vert\right\vert_{2}^{2} \leq 1$, correct?
2.) Closely related: How do we know that $\lambda \geq 0$?
3.) Here, the Lagrangian was defined by $\mathcal L \left( x, \lambda \right) := \frac{1}{2} \left\| A x - b \right\|_{2}^{2} + \frac{\lambda}{2} \left( {x}^{T} x - 1 \right)$. One can also define the Lagrangian by $\mathcal L \left( x, \lambda \right) := \frac{1}{2} \left\| A x - b \right\|_{2}^{2} - \frac{\lambda}{2} \left( {x}^{T} x - 1 \right)$; the choice of sign shouldn't matter in the end. But in the latter case, would $\lambda\geq 0$ still hold?
4.) I also have a more general question about the KKT theorem: Can the KKT conditions also be applied to problems with equality constraints?