
There is a class of problems in multivariable calculus exemplified by the following: $$w = x^3 y - z^2 t,\qquad xy = zt,\qquad \text{find } \left(\frac{\partial w}{\partial z}\right)_{x,\,y}.$$ If I were to rewrite it in a form spiritually closer to a constrained optimization problem, I might eliminate $w$: $$\left(\frac{\partial}{\partial z}\right)_{x,\,y} (x^3 y - z^2 t)$$$$\text{s.t.}\ \ \ \ xy = zt$$ Two solution methods are available for this specific problem, but my focus is on this class of problems in general; if fewer variables would be more illustrative, answers should feel free to ignore this example. The most unexpected part of the solution process is that $\frac{\partial t}{\partial z}$ is not $0$.
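For concreteness, here is a minimal SymPy sketch of the elimination approach I have in mind: hold $x$ and $y$ fixed, eliminate $t$ through the constraint (valid wherever $z \ne 0$), and differentiate.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# Enforce the constraint x*y = z*t by eliminating t: with x and y held
# fixed, t = x*y/z (valid wherever z != 0).
t = x * y / z

# Substitute into w and differentiate, with z varying and x, y constant.
w = x**3 * y - z**2 * t              # simplifies to x**3*y - x*y*z
print(sp.simplify(sp.diff(w, z)))    # -x*y
print(sp.diff(t, z))                 # -x*y/z**2, nonzero in general
```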

I understand how to solve these problems, but I don't have a good sense of what they are, either geometrically or in relation to constrained optimization problems. The most natural interpretation of the term "constrained partial derivative" seems trivial upon further inspection, at least for analytic solutions. I wanted to view these as a sort of precursor to, or generalization of, constrained optimization problems: think of optimization as taking place across a surface in $\mathbb{R}^3$, with each constraint progressively restricting focus to smaller and smaller subsets of the points in the input space, and with setting the gradient of the objective to $0$, or proportional to the gradients of the constraints, acting as a sort of "quasi-constraint" that restricts focus to a still smaller subset of input space points, at least in an extremely crude sense. And it doesn't take too much imagination to suppose that if one can take constrained partial derivatives then one can take constrained gradients. So perhaps, the thought went, constrained derivatives are a generalization of constrained optimization problems, in which we care about derivative information everywhere consistent with the constraints, not just at critical points. But this seems trivial, as mentioned, because an analytic solution to a derivative problem holds for all points in the input space, so in particular it holds for any subset of points defined by constraints.

Another idea might be that this class of problems is about taking derivatives over input spaces in which at least one dimension has been (possibly nonlinearly) transformed in some way, relative to the standard orthonormal basis, which might explain why $\frac{\partial t}{\partial z}$ is not $0$ above. If so, then constrained optimization may or may not be related in some way.

How can the geometry of this class of problems be understood? How is it related to constrained optimization, if at all?


1 Answer


A simpler example:

In the simplest kind of example of partial derivatives (without any subscripts), we might have $w=f\left(x,y,z\right)$ (representing some sort of 3d shape in 4d space), and then the expression $\left.\dfrac{\partial w}{\partial y}\right|_{\left(x,y,z\right)=\left(a,b,c\right)}$ means something like "intersect $w=f\left(x,y,z\right)$ with $x=a$ and $z=c$ so that you are left with a (1d) curve in a $yw$-plane, and take the slope of that curve (a limit of $\dfrac{\Delta w}{\Delta y}$) at $y=b$". Note that we intersected the 3d shape with two more equations ($x=a$ and $z=c$) in order to bring the dimension down to $1$ to do this.
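To make the slicing concrete, here is a small numerical sketch; the function $f$ below is just an arbitrary illustrative choice, not anything from the question.

```python
# Slicing the graph w = f(x, y, z) with x = a and z = c leaves the
# one-variable curve y -> f(a, y, c); its slope at y = b is the partial.
def f(x, y, z):
    return x**3 * y - z**2       # an arbitrary illustrative choice

a, b, c, h = 2.0, 1.0, 3.0, 1e-6
slope = (f(a, b + h, c) - f(a, b - h, c)) / (2 * h)   # central difference
print(slope)                     # ~ 8.0, i.e. a**3
```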

And $\dfrac{\partial w}{\partial y}$ typically means the same thing as the above, except with arbitrary/generic $a,b,c$ written as $x,y,z$ instead.

Your example:

In your example, we have two equations in five variables. Typically, each equation cuts the dimension down by $1$, so those two equations represent some sort of 3d shape in 5d space. To illustrate this, note that in a region of the shape where $y>0$ (or where $y<0$), we could parametrize the region via $\left(y,z,t\right)\mapsto\left(zt/y,y,z,\left(zt/y\right)^{3}y-z^{2}t,t\right)$.
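As a sanity check on the dimension count, one can verify symbolically that this parametrization has full rank; a sketch, with coordinates ordered $\left(x,y,z,w,t\right)$ as above:

```python
import sympy as sp

y, z, t = sp.symbols('y z t')
x = z * t / y                    # from x*y = z*t, valid where y != 0
w = x**3 * y - z**2 * t

# The patch (y, z, t) -> (x, y, z, w, t); its Jacobian having rank 3
# confirms the solution set is genuinely three-dimensional there.
patch = sp.Matrix([x, y, z, w, t])
print(patch.jacobian([y, z, t]).rank())   # 3
```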

If we want to write an expression like $\dfrac{\partial w}{\partial z}$, then just like before, we need to intersect the shape with more equations to bring the dimension down to $1$ so that we have a curve. However, we now have choices of two variables to hold constant: $x$ and $y$, $x$ and $t$, or $y$ and $t$. All of those choices slice the 3d shape differently, and so can yield different answers at the same point in 5d space.
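One can see the three choices disagree concretely; a sketch that eliminates a different variable for each choice and compares at the point $\left(x,y,z,t\right)=\left(2,1,1,2\right)$, which satisfies $xy=zt$:

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
point = {x: 2, y: 1, z: 1, t: 2}        # satisfies x*y = z*t

# Hold x, y constant: eliminate t = x*y/z.
w_xy = x**3*y - z**2*(x*y/z)
# Hold x, t constant: eliminate y = z*t/x.
w_xt = x**3*(z*t/x) - z**2*t
# Hold y, t constant: eliminate x = z*t/y.
w_yt = (z*t/y)**3*y - z**2*t

for w in (w_xy, w_xt, w_yt):
    print(sp.diff(w, z).subs(point))    # -2, 4, 20: three different slopes
```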

The unexpected part:

The most unexpected part of the solution process is that $\dfrac{\partial t}{\partial z}$ is not $0$.

This is just like $\dfrac{\partial w}{\partial z}$ above: $t$ is related to $z$ in two different ways. For convenience, assume we're looking at a point where $z>0$ (or where $z<0$). Then we have both $t=xy/z$ and $t=\left(x^{3}y-w\right)/z^{2}$. So this intersection means that $t$, viewed as something like a graph over $\left(x,y,z,w\right)$, is not $4$-dimensional but $3$-dimensional. We need to intersect with two more hyperplanes to get down to a curve, and we can choose which two to use (including the pair "$x$ and $y$", if relevant). In most cases, $\left(\dfrac{\partial t}{\partial z}\right)_{?,?}\ne0$.
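To see this directly in the original example: holding $x$ and $y$ fixed and differentiating the constraint $xy = zt$ with respect to $z$ gives $$0 = t + z\left(\frac{\partial t}{\partial z}\right)_{x,\,y},\qquad\text{so}\qquad \left(\frac{\partial t}{\partial z}\right)_{x,\,y} = -\frac{t}{z},$$ which is nonzero wherever $t \ne 0$.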

How is it related to constrained optimization?

Honestly, I see them as quite different, and hadn't thought of them together until now. I don't think I'd even heard these called “constrained partial derivatives” before, to be honest. I guess the hidden extra equations like $x=a$ are kind of like "constraints"?

If you wanted to optimize $w=f\left(x,y,z,t\right)$ subject to the constraint $xy=zt$ (with or without another constraint equation), that would be a constrained optimization problem. And you could solve it via Lagrange Multipliers by looking at the regular partials of $f$ and $g(x,y,z,t)=xy-zt$ and doing some algebra with those partials and $g=0$. Because you wouldn't have a "derivative with constraints", the calculation would be simpler in some ways. In that method, you're basically thinking of the relationship between the (unconstrained) graph of $f$ and the graph of $g$ to find the point where the geometry of the level sets suggests a possible extremum.
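For comparison, here is a sketch of setting up (not solving) that Lagrange system symbolically; note that nothing in it involves a "derivative with constraints":

```python
import sympy as sp

x, y, z, t, lam = sp.symbols('x y z t lambda')
f = x**3*y - z**2*t              # objective (w isolated on one side)
g = x*y - z*t                    # constraint surface g = 0

# Lagrange conditions: grad f = lam * grad g, together with g = 0.
L = f - lam * g
for v in (x, y, z, t):
    print(sp.Eq(sp.diff(L, v), 0))
print(sp.Eq(g, 0))
```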

This kind of derivative is different in some key ways:

  1. The obvious way is that you'd evaluate $\left(\dfrac{\partial w}{\partial z}\right)_{x,y}$ at a (possibly generic) point, rather than searching through all points for the one that optimizes $w$.
  2. And you may or may not be really thinking about the unconstrained graph of $f$, depending on how you set up your calculation.
  3. Most importantly in my eyes, unlike in an optimization problem of any kind, not much would change for this kind of derivative if $w$ were not isolated: You could calculate $\left(\dfrac{\partial w}{\partial z}\right)_{x,y}$ at a particular point just fine if your first equation had been, say, $w^{2}t^{2}=\ldots$ instead of $w=\ldots$ (see the sketch below). But that change in the equation would mean you wouldn't have an obvious objective function for an optimization problem.
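To illustrate point 3, here is a sketch using implicit differentiation via SymPy's idiff; the right-hand side below is only a hypothetical stand-in for the "$\ldots$" above.

```python
import sympy as sp

x, y, z, w = sp.symbols('x y z w')

# Hypothetical variant in which w is not isolated; the right-hand side
# is just an illustrative stand-in for the "..." in the text. The
# constraint t = x*y/z (x, y held fixed) is already substituted in.
F = w**2 * (x*y/z)**2 - (x**3*y - z**2*(x*y/z))

# idiff treats w as a function of z and treats x, y as constants, so it
# returns (dw/dz)_{x,y} without any objective function ever isolated.
print(sp.simplify(sp.idiff(F, w, z)))
```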

(Let me know if there's an aspect of your question I missed and I can probably edit something in.)
