As this answer illustrates, a simple constrained optimization problem can be geometrically represented as finding the extrema of the intersection of two surfaces, where the surface representing the constraint is vertically invariant.
Using notation slightly different from that in the linked question, suppose we have a constrained optimization problem, written in general as $$\begin{cases}z = f(x,\ y) \\ g(x,\ y) = C\end{cases}$$ The standard procedure for solving the problem involves setting the gradients proportional to each other and solving the resulting system of equations, but suppose we are confronted with such a problem on an exam, don't know that method, and so try perhaps the most algebraically natural thing: solving the constraint $g(x,\ y) = C$ for either $x$ or $y$, substituting the result into the objective function, and optimizing the result. This yields either $x = \bar g(y,\ C) \implies z = \bar f(y,\ C)$ or $y = \widetilde g(x,\ C) \implies z = \widetilde f(x,\ C)$, respectively, and since $C$ is just a known constant, in either case finding the extrema of $z$ becomes a single-variable optimization problem. In fact, this process could be performed twice, once for each choice of solved-for variable.
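To make the substitution procedure concrete, here is a small sketch comparing it against the gradient-proportionality (Lagrange multiplier) method on a toy problem of my own choosing, $f(x, y) = xy$ with constraint $g(x, y) = x + y = 10$; these particular functions are illustrative assumptions, not part of the question:

```python
# Toy problem (my own choice): optimize f(x, y) = x*y subject to x + y = 10,
# first by substitution into the objective, then via Lagrange multipliers.
import sympy as sp

x, y, lam = sp.symbols('x y lambda', real=True)
f = x * y          # objective z = f(x, y)
g = x + y          # constraint g(x, y) = C, here with C = 10

# Substitution: solve g(x, y) = 10 for y, plug into f, optimize in x alone.
y_sub = sp.solve(sp.Eq(g, 10), y)[0]        # y = 10 - x, i.e. y = g~(x, C)
f_sub = f.subs(y, y_sub)                    # z = x*(10 - x), i.e. z = f~(x, C)
crit_x = sp.solve(sp.diff(f_sub, x), x)     # single-variable critical points

# Lagrange multipliers: grad f = lambda * grad g, plus the constraint itself.
eqs = [sp.diff(f, x) - lam * sp.diff(g, x),
       sp.diff(f, y) - lam * sp.diff(g, y),
       g - 10]
sol = sp.solve(eqs, [x, y, lam], dict=True)
```

On this example both routes recover the same critical point, $(x, y) = (5, 5)$; the interesting cases for the question are those where the constraint cannot be globally solved for one variable.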
Why is this not a valid solution technique, especially from the geometric point of view? It seems that the geometry involved would still be the intersection of the same two surfaces. If not the extrema of that intersection, what quantity on the graphs would the optimization described above actually pick out?