
Suppose $\Omega \subset \mathbb{R}^{m}$ is open and let $f: \Omega \to \mathbb{R}$ be differentiable. The derivative of $f$ at $x$, denoted by $df(x)$, is a linear functional on $\mathbb{R}^{m}$, and it can be written as: $$df(x) = \sum_{i=1}^{m}\frac{\partial f}{\partial x_{i}}(x)dx_{i} \tag{1}\label{1}$$ where, for each $i = 1,\dots,m$, $dx_{i}$ denotes the $i$-th element of the dual basis, defined by $dx_{i}(v):= v_{i}$ for every $v=(v_{1},\dots,v_{m}) \in \mathbb{R}^{m}$.

If $I \subset \mathbb{R}$ is an interval, consider $2n$ differentiable functions $q_{1},...,q_{n}, p_{1},...,p_{n}: I \to \mathbb{R}$. Suppose the variable in $I$ is denoted by $t$ and $f: \mathbb{R}^{n}\times \mathbb{R}^{n} \to \mathbb{R}$ is differentiable. By the chain rule: $$\frac{d f}{d t} = \sum_{i=1}^{n}\frac{\partial f}{\partial q_{i}}\frac{d q_{i}}{d t} + \sum_{i=1}^{n}\frac{\partial f}{\partial p_{i}}\frac{d p_{i}}{d t} \tag{2} \label{2}$$ Physicists usually write this as: $$df = \sum_{i=1}^{n}\frac{\partial f}{\partial q_{i}} dq_{i} + \sum_{i=1}^{n}\frac{\partial f}{\partial p_{i}}dp_{i} \tag{3}\label{3}$$

My question is: what does (\ref{3}) mean? How should it be understood mathematically? I suppose it has something to do with (\ref{1}), but this is not clear to me, because in the first case I am not using the chain rule and in the second I am.

MJD
  • If you understand (1), then (3) is the same thing, just with different names for the first $n$ variables and the second $n$ variables. – Ian May 12 '23 at 14:52
  • Typically, one regards the $q_i$ as the position variables, and the $p_i$ as the momenta. For example, on the cotangent bundle $T^*X$, the coordinates parameterising $X$ would be denoted $q_i$, and the coordinates parameterising the fibres would be denoted $p_i$. But you can just regard $T^*X$ as a manifold in its own right, so the definitions are the same. – Quaere Verum May 12 '23 at 14:55
  • @Ian does (\ref{3}) have anything to do with (\ref{2}), then? I understand (\ref{3}) when seen as (\ref{1}), but I thought (\ref{3}) should relate to (\ref{2}) as well, but then there is the time variable. – InMathweTrust May 12 '23 at 15:04
  • From a physicist's perspective at least, (3) is a way of saying that (2) holds for any $p_i(t),q_i(t)$ with $f=f(p_1(t),\dots,p_n(t),q_1(t),\dots,q_n(t))$, without having to pin down what the parametrized path $(p,q)$ really is. There is an analogous way to think about (1); this is why I was suggesting that (1) is really where the "physicist weirdness" is. – Ian May 12 '23 at 18:09

1 Answer


As mentioned in the comments, if you understand (1), then you understand (3), since (3) is just an application of (1) on a $2n$-dimensional space, with the coordinate functions labelled in a funny manner (sure, there’s a deeper reason for the split, but abstractly, it’s the exact same idea).
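
As a minimal illustration (the function below is a hypothetical choice, not one taken from the question), take $n=1$ and $f(q,p)=\frac{p^2}{2}+\frac{q^2}{2}$ on $\Bbb{R}\times\Bbb{R}$. Reading (3) as (1) on $\Bbb{R}^2$, with the two coordinates named $q$ and $p$, gives
\begin{align}
df&=\frac{\partial f}{\partial q}\,dq+\frac{\partial f}{\partial p}\,dp=q\,dq+p\,dp,\\
df(q,p)(v)&=q\,v_1+p\,v_2\quad\text{for } v=(v_1,v_2),
\end{align}
where the second line uses $dq(v)=v_1$ and $dp(v)=v_2$. No curve and no time variable appear anywhere in this computation.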

Now, let me offer you the following explanation in terms of differential geometry, since that’s what you seem to be after. Suppose you have, as in (1), an open set $\Omega\subset\Bbb{R}^m$ and a smooth (differentiable is enough, really) function $f:\Omega\to\Bbb{R}$ (actually you can replace $\Bbb{R}$ with any Banach space on the target). Now, suppose you have a smooth curve (again, differentiable is enough) $\gamma:I\to \Omega$, where $I$ is an open interval in $\Bbb{R}$, and let us use the notation $t$ to denote the coordinate on $I$.

We already have the 1-form $df$, and now we can consider the pullback $\gamma^*(df)$. By directly using the formula (1) and the basic rules for pullback (additivity, compatibility with multiplication by functions, commuting with the exterior derivative, etc.), we see that
\begin{align}
\gamma^*(df)&=\gamma^*\left(\sum_{i=1}^m\frac{\partial f}{\partial x^i}\,dx^i\right)\\
&=\sum_{i=1}^m\left(\frac{\partial f}{\partial x^i}\circ\gamma\right)\,d(x^i\circ\gamma)\\
&=\sum_{i=1}^m\left(\frac{\partial f}{\partial x^i}\circ\gamma\right)\,(x^i\circ\gamma)'\,dt,
\end{align}
where in the last equality I am using the exact same idea as (1), in the special case that $m=1$, the open set $\Omega$ is actually the interval $I$, and the function $f$ is simply $x^i\circ\gamma$, the $i^{th}$ component of the curve $\gamma$.

On the other hand, we can simplify the left side of this equation to get $\gamma^*(df)=d(\gamma^*f)=d(f\circ\gamma)=(f\circ\gamma)'\,dt$, so in other words,
\begin{align}
(f\circ\gamma)'\,dt&= \sum_{i=1}^m\left(\frac{\partial f}{\partial x^i}\circ\gamma\right)\,(x^i\circ\gamma)'\,dt.
\end{align}
This is really the content of equation (2). If you like, note that since $dt$ is a non-vanishing 1-form on a 1-dimensional domain, it follows that
\begin{align}
(f\circ\gamma)'&= \sum_{i=1}^m\left(\frac{\partial f}{\partial x^i}\circ\gamma\right)\,(x^i\circ\gamma)',
\end{align}
and this is exactly what equation (2) says, but this way of writing it is more precise.
If one wants to be lazy, then proceed by abusing notation and avoid mentioning $\gamma$ anywhere, and use the typical Leibniz notation to recover the beloved equation $\frac{df}{dt}=\sum_{i=1}^m\frac{\partial f}{\partial x^i}\frac{dx^i}{dt}$ (here the equal sign really means the two things are equal if the person reading/writing them knows their true meaning). But anyway, as you can see from here, this is nothing but the chain rule, and the concept of pullback is nothing more (here anyway) than substituting things correctly.
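
To see the substitution at work in a concrete case (the function and curve here are hypothetical choices made only for illustration), take $m=2$, $f(x^1,x^2)=x^1x^2$ on $\Omega=\Bbb{R}^2$, and $\gamma(t)=(\cos t,\sin t)$ on $I=\Bbb{R}$. Then $df=x^2\,dx^1+x^1\,dx^2$, and
\begin{align}
\gamma^*(df)&=(x^2\circ\gamma)\,d(x^1\circ\gamma)+(x^1\circ\gamma)\,d(x^2\circ\gamma)\\
&=\sin t\,(-\sin t)\,dt+\cos t\,(\cos t)\,dt\\
&=\cos(2t)\,dt,
\end{align}
while computing directly gives
\begin{align}
d(f\circ\gamma)&=d(\cos t\,\sin t)=\left(\cos^2 t-\sin^2 t\right)dt=\cos(2t)\,dt.
\end{align}
Both sides agree, which is the chain rule for this particular $f$ and $\gamma$.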

You can also consider a higher-dimensional analogue of (2), where you pull back not by a curve $\gamma:I\to\Omega$, but by a differentiable function $g:\Omega'\to\Omega$, where $\Omega'$ is an open set in some other $\Bbb{R}^k$, say. See also "Clarifying the chain rule terminology in differential geometry calculuations" for further remarks.
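
For the higher-dimensional analogue mentioned above, here is a sketch of how the same computation goes, assuming coordinates $y^1,\dots,y^k$ on the domain of $g$ (this naming is my own and not fixed anywhere above) and writing $g^i=x^i\circ g$ for the components of $g$:
\begin{align}
g^*(df)&=\sum_{i=1}^m\left(\frac{\partial f}{\partial x^i}\circ g\right)d(x^i\circ g)
=\sum_{j=1}^k\sum_{i=1}^m\left(\frac{\partial f}{\partial x^i}\circ g\right)\frac{\partial g^i}{\partial y^j}\,dy^j,\\
g^*(df)&=d(f\circ g)=\sum_{j=1}^k\frac{\partial (f\circ g)}{\partial y^j}\,dy^j.
\end{align}
Since the $dy^j$ form a basis at each point, comparing coefficients gives
\begin{align}
\frac{\partial (f\circ g)}{\partial y^j}&=\sum_{i=1}^m\left(\frac{\partial f}{\partial x^i}\circ g\right)\frac{\partial g^i}{\partial y^j},\qquad j=1,\dots,k,
\end{align}
which is the multivariable chain rule, with the case $k=1$ recovering (2).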

peek-a-boo