This question may sound repetitive, but please give it a read. I have read about Lagrange multipliers and I know the formula, but I am trying to understand the intuition behind it. I read the posts "How do Lagrange multipliers work to find the lowest value of a function subject to a constraint?" and https://medium.com/@andrew.chamberlain/a-simple-explanation-of-why-lagrange-multipliers-works-253e2cdcbf74.
There is just one point which I'm unable to digest. Any explanation of it would be very helpful. What the above two posts say is: Therefore, if $(x_0, y_0)$ was a maximum (or a minimum) of $f$ on $g(x, y) = c$, the gradient vector of $f$ at $(x_0, y_0)$ would have dot product zero with every vector in the line tangent to $g$ at $(x_0, y_0)$. But that means that the gradient vector of $f$ is orthogonal to the tangent line of $g$, which means that the gradient of $f$ must be parallel to (i.e. a scalar multiple of) the gradient of $g$.
If the gradient vector of f is orthogonal to the tangent line of g, how does that imply that the gradient of f must be parallel to the gradient of g? In 2-D space, it makes sense. But what if, say, "vector of f is along x", "tangent of g is along y", and "gradient of f is along z, but gradient of g is along x"? Is this case not possible? If not, why?