0

Is there any intuition behind the implicit function theorem? Take $F(x,y)=0$ where $y=f(x)$. Then,

$$ \frac{dy}{dx}=-\frac{F_x}{F_y} $$

I see that the derivative equals the negative of the ratio of the partial derivatives $F_x$ and $F_y$. What is the intuition behind this?

2 Answers

2

The geometric intuition is that the unit tangent vector, say $T$, to the curve $F=0$ is perpendicular to the gradient of $F$ at each point. This is because the directional derivative of $F$ in the tangential direction is $\nabla F \cdot T$. If the tangent weren't perpendicular to the gradient, this would be nonzero, which intuitively means that $F$ would vary along the curve, even though $F$ is constantly $0$ there. Once you know that, write $T=(T_x,T_y)$, so that $F_x T_x + F_y T_y = 0$. The relation then follows by taking the ratio $T_y/T_x$ of the $y$ component of the tangent vector to its $x$ component, which is the slope $\frac{dy}{dx}$.

Another mnemonic for this relation is to formally write $dF=F_x dx + F_y dy = 0$, which intuitively means that a change in one variable must be matched by a change in the other such that the change in $F$ will vanish (a slightly less rigorous way of saying $\nabla F \cdot T=0$). Then formally solve for $\frac{dy}{dx}$.
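
To make the formal manipulation concrete, here is a short sketch of the same computation via the chain rule. It assumes that $F$ is differentiable, that $y=f(x)$ is differentiable, and that $F_y \neq 0$ at the point in question. Differentiating $F(x,f(x))=0$ with respect to $x$ gives

$$ F_x + F_y \frac{dy}{dx} = 0 \quad\Longrightarrow\quad \frac{dy}{dx} = -\frac{F_x}{F_y} $$

For instance, with $F(x,y)=x^2+y^2-1$ this yields $\frac{dy}{dx}=-\frac{2x}{2y}=-\frac{x}{y}$ wherever $y \neq 0$.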

Ian
  • 104,572
0

The intuition for the implicit function theorem as I understand it is: the solutions to $F(x,y)=0$ are a union of graphs of functions.

In your case, it looks like you are talking about one function you've already picked, $y=f(x)$. Sometimes, you might need to have $x=g(y)$ to cover the whole solution set.

One of the typical examples given is $F(x,y)=x^2+y^2 -1$.

Since $\partial F/\partial y=2y$ is nonzero whenever you are not on the $x$-axis, you're guaranteed that the points off the $x$-axis can be represented as a union of graphs of functions of $x$.

In this case, the graphs of $f(x)=\sqrt{1-x^2}$ and $g(x)=-\sqrt{1-x^2}$ for $x\in [-1, 1]$ might be one way to express it.

However, some authors require that the domains of these functions be open sets, so that $x$ is restricted to $(-1,1)$. In that case, it would be impossible to pick up the two points $(\pm 1, 0)$ on the $x$-axis using functions of $x$. But, thankfully, we could also use the analogous functions of $y$ to cover those points:

$x=\tilde f(y)=\sqrt{1-y^2}$ and $x=\tilde g(y)=-\sqrt{1-y^2}$

They are able to cover everything where $\partial F/\partial x=2x$ is nonzero (which is everything off the $y$-axis).


From your example, it looks like you are really looking for an explanation of why this applies to implicit differentiation. To that I would say: since you can "zoom in" to one of these functions no matter where you are in the solution set, you'll always be able to learn about the tangent (locally) using regular differentiation, because each local piece is the graph of a function.
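
As a rough numerical sanity check of this "zoom in" picture, the sketch below compares two slopes at a point of the unit circle: the one obtained by differentiating the explicit branch $y=\sqrt{1-x^2}$, and the one given by $-F_x/F_y$. The helper names `partial`, `implicit_slope`, and `upper_branch`, the sample point, and the finite-difference step are all illustrative choices, not anything from a library.

```python
import math

def F(x, y):
    # The circle example from above: the solution set of F = 0 is the unit circle.
    return x**2 + y**2 - 1

def partial(func, x, y, wrt, h=1e-6):
    """Central-difference estimate of a partial derivative of func at (x, y)."""
    if wrt == "x":
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

def implicit_slope(func, x, y):
    """Slope dy/dx of the level curve func = 0 at (x, y), computed as -F_x / F_y."""
    Fx = partial(func, x, y, "x")
    Fy = partial(func, x, y, "y")
    if abs(Fy) < 1e-9:
        # Here the curve is not locally a graph over x; use x = g(y) instead.
        raise ValueError("F_y is numerically zero at this point")
    return -Fx / Fy

def upper_branch(x):
    # The explicit local solution y = f(x) on the upper half of the circle.
    return math.sqrt(1 - x**2)

x0 = 0.6
y0 = upper_branch(x0)  # the point (0.6, 0.8) lies on the circle

h = 1e-6
direct = (upper_branch(x0 + h) - upper_branch(x0 - h)) / (2 * h)
implicit = implicit_slope(F, x0, y0)

# Both values should be close to -x0 / y0 = -0.75.
print(direct, implicit)
```

At $(1,0)$, where $F_y=0$, `implicit_slope` raises an error instead of returning a slope, which matches the need to switch to a function of $y$ at that point.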

rschwieb
  • 160,592
  • This is really one place where being confined to the notion of $y$ as a function of $x$ or vice versa is quite limiting. It can be helpful to throw up your hands in the style of the physicists and just say that everything lives on some manifold and we can locally solve for any one variable in terms of a number of the others equal to the dimension of the manifold. – Ian Mar 18 '20 at 14:36
  • @Ian I'm not sure if it's a confinement or a powerful reduction. I had my doubts about the niceness of the situation at first too, but I thought later it actually seemed to hold water. Specifically, I've been learning about it from this book, which is pretty good (even by mathematician standards:) http://wwwf.imperial.ac.uk/~dholm/classnotes/GMS-FinalMar09.pdf Attention is focused on "embedded submanifolds," which may explain the niceness of the version of the theorem I'm talking about. – rschwieb Mar 18 '20 at 14:38
  • @Ian Is there a good counterexample I should know about which is not locally representable by "graphs of functions of the standard basis coordinates"? – rschwieb Mar 18 '20 at 14:42
  • You can do it, that's what the implicit function theorem says you can do, but the issue is that mathematicians usually write things in a way that makes it sound like some variables are somehow inherently the dependent ones while others are inherently the independent ones. But really, at "completely nondegenerate" points (where none of the partials of $F$ vanish), you can just solve for any one variable in terms of the other $n-1$ variables, and you could even decide to do it differently on neighborhoods of different points. – Ian Mar 18 '20 at 14:46
  • (Cont.) It's just a matter of our convenience which variables should be in which category. This is why the $dF=0$ equation has an advantage over the form in the OP, since it is symmetric in the variables and doesn't require $F_y \neq 0$. $F_y \neq 0$ is only used at the last step where you go to write $y$ as a function of $x$. In the end it is all equivalent, but for pedagogy and especially for intuition, the way we describe things makes a difference. – Ian Mar 18 '20 at 14:46
  • I agree with all that, especially what basically says "coordinate free is better," but it seems like we rarely can have any exposition that is truly coordinate free. At least, in practice. – rschwieb Mar 18 '20 at 14:50