Confusion on Notations of Partial Derivatives on Manifolds

Question

I'm confused by the notation for partial derivatives on manifolds in Tu's An Introduction to Manifolds.

To make clear the notation I'm using, what I already know, and which part confuses me, let me restate the question formally. Let $(M, \mathcal A)$ be an $n$-dimensional smooth manifold, $(U, \phi)=(U, x^i)$ be a chart containing some $p\in M$ and $f:U\to \mathbb R$ be a smooth function. The basis for $T_{\phi(p)}\mathbb R^n$ is $\left\{\left.\frac{\partial}{\partial r^i} \right\vert_{\phi(p)} \right\}_{i=1}^n$, where $r^i:\mathbb R^n\to \mathbb R, (a^1, a^2, \cdots, a^n) \mapsto a^i$ are the canonical coordinate functions, so that $x^i=r^i\circ\phi$. With this notation, the basis for $T_p U=T_pM$ is the preimage of the basis for $T_{\phi(p)}\mathbb R^n$ under the isomorphism $\phi_{*,p}$, i.e.

$$\left.\frac{\partial }{\partial x^i} \right\vert_p = \left(\phi^{-1} \right)_{*,\phi(p)} \left( \left.\frac{\partial}{\partial r^i} \right\vert_{\phi(p)} \right), i=1, \cdots n $$

Acting on $f$, we have

$$\left.\frac{\partial f}{\partial x^i} \right\vert_p = \left.\frac{\partial (f\circ \phi^{-1})}{\partial r^i} \right\vert_{\phi(p)}$$

I'm clear about the procedure of getting the basis as the preimages, but quite confused by the notation and its actual meaning.

According to this question (if the answer is right), the canonical coordinate functions $r^i:\mathbb R^n\to \mathbb R$ are actually the projections. Then what exactly does the expression $\frac{\partial (f\circ \phi^{-1})}{\partial r^i}|_{\phi(p)}$ mean, given that it appears to take a partial derivative with respect to a function (an element of a function space)? And similarly for $\frac{\partial f}{\partial x^i}|_p$? In the usual calculus sense, aren't partial derivatives supposed to be taken with respect to variables lying in $\mathbb R^n$?

Personally I have come up with several interpretations, but I'm not sure whether they are right, and parts of each still don't make sense to me.

My first interpretation is "by abuse of notation". That is, we let $r^i=r^i(r)$ for $r\in \mathbb R^n$ and then $\frac{\partial (f\circ \phi^{-1})}{\partial r^i}|_{\phi(p)}$ becomes what we meant. But we still cannot explain $\frac{\partial f}{\partial x^i}|_p$, since by our abuse, we get $x^i(p)=r^i(\phi(p))=r^i$. If we insist that the $\partial(\cdot)$ in the denominator be a variable, we have to rewrite the left part as $\frac{\partial f}{\partial r^i}|_p$. This is absurd because $f$ is not a function of $r^i$. However, if we give up explaining $\frac{\partial }{\partial x^i}|_p$ on the left and regard it as a purely symbolic notation, or even rewrite it as $\partial_i$, then it somehow makes sense.

However, this means that we interpret the symbols $\frac{\partial }{\partial x^i}|_p$ and $\frac{\partial }{\partial r^i}|_{\phi(p)}$ differently, whereas I personally think they should be consistent, since $\frac{\partial }{\partial x^i}|_p$ on a manifold $M$ is just the more general case of the one on $\mathbb R^n$.

My second interpretation is to regard the coordinates and the coordinate functions as the same thing. First consider $\mathbb R^n$, which is itself a manifold. We can interpret it as an underlying topological manifold $M_{\mathbb R^n}$ that has no coordinates at first, and whose points are then assigned coordinates by specifying a chart $(M_{\mathbb R^n}, r^i)$ and the corresponding atlas, where $(r^i)$ are the Cartesian orthogonal coordinates. The process of assigning coordinates is actually a coordinate map from $M_{\mathbb R^n}$ to the mathematical object $\mathbb R^n$. That is, whenever we speak of the coordinates of some point $p$, we are actually speaking of the coordinate map $(r^i):M_{\mathbb R^n} \to \mathbb R^n$.

With this interpretation and back to $\left.\frac{\partial f}{\partial x^i} \right\vert_p = \left.\frac{\partial (f\circ \phi^{-1})}{\partial r^i} \right\vert_{\phi(p)}$, we can regard both $\frac{\partial }{\partial x^i} ,\frac{\partial }{\partial r^i}$ as symbolic notation and redefine the notations for partial derivatives as the symbolic form “functions on functions”.

This version somehow seems more reasonable, but you may have noticed that I’ve changed the definition of $r^i:{\mathbb R^n} \to \mathbb R$ (projection map) to $r^i: M_{\mathbb R^n} \to \mathbb R$ (coordinate map) here. So the question is still unsolved.

What’s more, I also attempted to apply the notation $\left.\frac{\partial f}{\partial x^i} \right\vert_p = \left.\frac{\partial (f\circ \phi^{-1})}{\partial r^i} \right\vert_{\phi(p)}$ to a concrete example, say the unit circle $S^1 \subset \mathbb R^2$. Let a chart be $(U, \phi)=(U, \rho, \theta)$ with polar coordinates and $q\in U$ in the first quadrant. Let $r^i$ be the coordinate functions (i.e. projections) on $\mathbb R^2 \supset \phi(U)$. Let’s use $(\xi, \eta)$ to describe the locations of points in $S^1$ and $(\hat\xi, \hat\eta)$ the locations of points in $\mathbb R^2 \supset \phi(U)$.

Consider the function $f:U\to \mathbb R, (\xi, \eta) \mapsto \xi+\eta$. We have

$$ \begin{align*} \phi:(\xi, \eta)&\mapsto \left(\sqrt{\xi^2+\eta^2},\arctan \frac \eta\xi\right)\\ \rho=r^1\circ \phi:(\xi, \eta) &\mapsto \sqrt{ \xi^2+\eta^2}\\ \theta=r^2\circ \phi:(\xi, \eta) &\mapsto\arctan \frac \eta\xi \\ \phi^{-1}:(\hat\xi, \hat\eta)&\mapsto (\hat\xi \cos \hat\eta, \hat\xi \sin \hat\eta)\\ f \circ \phi^{-1}: (\hat\xi, \hat\eta) &\mapsto \hat\xi (\cos \hat\eta+ \sin \hat\eta) \end{align*} $$

Things are going well so far, but how do we now compute $\left.\frac{\partial (f\circ \phi^{-1})}{\partial r^1} \right\vert_{\phi(q)}$?

Is it right to interpret it as computing the partial derivative with respect to $\hat \xi$, since $\frac{\partial }{\partial r^1}$ actually means taking the partial derivative with respect to the first variable? Then why not just use $\frac{\partial }{\partial \hat \xi}$?

Or does that mean we are supposed to use $(r^1, r^2)$ to describe the location of $\phi(q)$, and that my first interpretation is the right one?

Well, suppose it’s right. We resymbolize $\phi^{-1}:(r^1, r^2)\mapsto (r^1 \cos r^2, r^1 \sin r^2)\,,\, f \circ \phi^{-1}: (r^1, r^2) \mapsto r^1 (\cos r^2+ \sin r^2)$, and let $\left.\frac{\partial (f\circ \phi^{-1})}{\partial r^i} \right\vert_{\phi(q)}$ be the usual meaning in calculus and $\left.\frac{\partial }{\partial \rho} \right\vert_{q}$ be symbolic. Then we have

$$ \begin{align*} \left.\frac{\partial f}{\partial \rho} \right\vert_p &= \left.\frac{\partial (f\circ \phi^{-1})}{\partial r^1} \right\vert_{\phi(p)} = \cos r^2+\sin r^2\\ \left.\frac{\partial f}{\partial \theta} \right\vert_p &= \left.\frac{\partial (f\circ \phi^{-1})}{\partial r^2} \right\vert_{\phi(p)} = r^1 (\cos r^2-\sin r^2) \end{align*} $$

But the roles of $\rho, r^1$ and of $\theta, r^2$ are much the same, so why not just use the same symbol? What’s more, in the equation above, it’s $\left.\frac{\partial (f\circ \phi^{-1})}{\partial r^2} \right\vert_{\phi(p)}$ that looks more symbolic, while $\left.\frac{\partial f}{\partial \theta} \right\vert_p$ somehow seems to have some practical meaning.

So what exactly do $\frac{\partial }{\partial x^i} ,\frac{\partial }{\partial r^i}$ and the coordinate functions mean?

score 4 · Answer 1 · answered Jul 13 '24 at 16:59

See this answer of mine for a direct answer. What follows is just some personal remarks. I actually don’t quite like the way Tu presents the definition.

Definitions.

Definition 1. (Partial derivative on open sets of $\Bbb{R}^n$)

Let $A\subset \Bbb{R}^n$ be open, let $V$ be any Banach space and let $f:A\to V$ be any map. For each point $a\in A$ and $i\in\{1,\dots, n\}$, if the limit (relative to the norm topology on $V$) $\lim\limits_{h\to 0}\frac{f(a+he_i)-f(a)}{h}$ exists in $V$ then we denote it by any of the symbols $(D_if)_a, (D_if)(a), D_if(a),(\partial_if)_a,(\partial_if)(a)$ or $\partial_if(a)$, and call this the $i^{th}$ partial derivative of $f$ at the point $a$. Finally, if this limit exists at each $a\in A$, we simply write the function as $D_if$ or $\partial_if$.

For the rest of my answer, I’ll use the notation $(\partial_if)(a)$.

Note carefully that at this stage I do NOT denote this by “$\frac{\partial f}{\partial x^i}(a)$ where $x^i$ are Cartesian coordinates” or “$\frac{\partial f}{\partial r^i}(a)$ where $r^i$ are Cartesian coordinates” or any of the Leibnizian-like notation because when introduced so early on, people are bound to make TONS of mistakes, because of all the irrelevant notation in the denominator and their irresistible desire to treat things as fractions. See Notation for partial derivative of functions of functions for more details (and some related answers on the side). Note also that I do not speak of “partial derivatives with respect to variables” or “partial derivatives with respect to functions”. I simply speak of “the $i^{th}$ partial derivative (of a given function)”.

OK, having said that, let’s move on to more notation.
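
As a small illustration of this "slot" point of view (the function $g$ below is made up purely for the example and does not come from Tu), take $g:\Bbb{R}^2\to\Bbb{R}$ given by $g(a,b)=a^2b$. Then \begin{align} (\partial_1g)(a,b)=2ab,\qquad (\partial_2g)(a,b)=a^2, \end{align} and the same function could equally be described by $g(s,t)=s^2t$ with $(\partial_1g)(s,t)=2st$. The letters used for the arguments play no role; only the position of the slot being differentiated matters.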

Definition 2.

Let $M$ be a smooth manifold, $V$ a Banach space, $f:M\to V$ a smooth map and $(U,\phi=(x_{\phi}^1,\dots, x_{\phi}^n))$ a coordinate chart. For each $p\in U$ and $i\in\{1,\dots, n\}$, we define \begin{align} \frac{\partial f}{\partial x_{\phi}^i}(p)\equiv\frac{\partial f}{\partial x_{\phi}^i}\bigg\rvert_{p}&:=\left(\partial_i(f\circ\phi^{-1})\right)(\phi(p)) \end{align} Or as equality of functions $U\subset M\to V$, we have $\frac{\partial f}{\partial x_{\phi}^i}:=(\partial_i(f\circ\phi^{-1}))\circ\phi$.

We call $\frac{\partial f}{\partial x_{\phi}^i}$ the $i^{th}$ partial derivative of the local-representation of $f$ relative to the chart $(U,\phi)$, or simply the $i^{th}$ partial derivative of $f$ relative to $(U,\phi)$, or by a further abuse of language, the partial derivative of $f$ relative to $x^i$ (I hate this third way of speaking when first introducing the notation/definitions).

Next, we define the symbol $\frac{\partial}{\partial x_{\phi}^i}\bigg|_p$ itself to mean the mapping $f\mapsto \left(\partial_i(f\circ\phi^{-1})\right)(\phi(p))$ (so by the definition of tangent spaces via derivations, this can be regarded as an element of $T_pM$). And finally, the symbol $\frac{\partial}{\partial x_{\phi}^i}$ means the mapping $f\mapsto \left(\partial_i(f\circ\phi^{-1})\right)\circ\phi$ (which can thus be regarded as a vector field on $U\subset M$).

In practice one of course only writes $x^i$ instead of $x_{\phi}^i$. But the reason why I wanted to write $x_{\phi}^i$ is so that when you look at the notation $\frac{\partial f}{\partial x_{\phi}^i}$ it immediately warns you that this quantity depends on

the function $f$ (obviously)
the index $i$ (obviously)
the entire coordinate chart $\phi$, and not just the single coordinate function $x_{\phi}^i$… this is a confusion which arises all too often when one writes only $x^i$ and $\frac{\partial f}{\partial x^i}$, and so makes the mistake of thinking that this is a notation for “partial derivative of a function with respect to a function” when that’s completely wrong. As I said in the definition, the notation $\frac{\partial f}{\partial x_{\phi}^i}$ is the $i^{th}$ partial derivative of $f$ “relative to a chart”.
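
A standard sketch of this dependence on the whole chart (the charts $\phi,\psi$ and the function $f$ below are chosen only for illustration): on $M=\Bbb{R}^2$ take the two global charts $\phi=\text{id}_{\Bbb{R}^2}=(r^1,r^2)$ and $\psi=(r^1,\,r^1+r^2)$, so that $x_{\phi}^1=x_{\psi}^1=r^1$ is literally the same function, and let $f=r^2$. Since $\psi^{-1}(a,b)=(a,b-a)$, we get \begin{align} (f\circ\phi^{-1})(a,b)&=b &&\implies& \frac{\partial f}{\partial x_{\phi}^1}&=\left(\partial_1(f\circ\phi^{-1})\right)\circ\phi=0,\\ (f\circ\psi^{-1})(a,b)&=b-a &&\implies& \frac{\partial f}{\partial x_{\psi}^1}&=\left(\partial_1(f\circ\psi^{-1})\right)\circ\psi=-1. \end{align} So the "partial derivative with respect to $x^1$" changes even though the first coordinate function does not, because the second coordinate function of the chart changed.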

But with these caveats mentioned, I shall revert to the usual custom of writing $x^i$ only.

Now, we immediately have the following theorem, which justifies/connects the two definitions above.

Theorem 1. (Justification of notation).

Let $A\subset\Bbb{R}^n$ be open, $V$ any Banach space, $f:A\to V$ smooth. Regarding $A$ as a smooth manifold in its own right, consider the identity chart $(A,\text{id}_A=(r^1,\dots, r^n))$. Then, we have \begin{align} \frac{\partial f}{\partial r^i}=\partial_if. \end{align}

The proof is an obvious one-liner: \begin{align} \frac{\partial f}{\partial r^i}&:=\left(\partial_i(f\circ\text{id}_A^{-1})\right)\circ\text{id}_A=\partial_if. \end{align}

Another name for the identity chart on an open set $A\subset\Bbb{R}^n$ is Cartesian coordinates (on $A$). So you could express this ‘theorem’ by saying that partial derivatives relative to Cartesian coordinates (in the sense of definition 2) are the same thing as the “slot partial derivatives” (in the sense of definition 1, which is the usual thing we all learn). And thus we have come full circle, by connecting the modern notation $\frac{\partial f}{\partial r^i}$ (or more precisely, $\frac{\partial f}{\partial \text{id}_A^i}$) with $\partial_if$ (or what we would have classically denoted by $\frac{\partial f}{\partial r^i}$ and said this is the partial derivative of the function $f=f(r)$ relative to the variable $r^i$… but like I mentioned in my linked answer above, I abhor this use of language).

But just to summarize the logic of things:

first we introduced the usual partial derivatives $\partial_i$ for maps $A\subset\Bbb{R}^n\to V$.
with that definition at hand, and a coordinate chart $(U,\phi)$ on a manifold $M$, we are able to define $\frac{\partial}{\partial x_{\phi}^i}$, which acts on maps $M\to V$
finally we can check that for maps $A\subset \Bbb{R}^n\to V$, we have $\frac{\partial}{\partial \text{id}_A^i}$ (or as you denote it, $\frac{\partial}{\partial r^i}$) equals $\partial_i$. This is a consistency check.

Your example.

Your example doesn’t really make sense since $S^1$ is one-dimensional but you’re trying to use two coordinates (i.e. it seems like you’re mixing up polar coordinates for the first quadrant of $\Bbb{R}^2$ vs just a single angular coordinate for the first quadrant of $S^1$). But anyway, regardless of which you actually intended, you can analyze things by following the three steps above.
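
As a sketch of how the three steps would go, assuming the intended chart is polar coordinates on the open first quadrant $U=\{(\xi,\eta)\in\Bbb{R}^2:\xi>0,\eta>0\}$ (this choice of $U$ is an assumption about what was meant), take $\phi=(\rho,\theta):U\to(0,\infty)\times(0,\tfrac{\pi}{2})$ with $\phi^{-1}(a,b)=(a\cos b,a\sin b)$ and $f(\xi,\eta)=\xi+\eta$. Then \begin{align} (f\circ\phi^{-1})(a,b)&=a(\cos b+\sin b),\\ \left(\partial_1(f\circ\phi^{-1})\right)(a,b)&=\cos b+\sin b,\\ \left(\partial_2(f\circ\phi^{-1})\right)(a,b)&=a(\cos b-\sin b), \end{align} and composing with $\phi$ gives functions on $U$: \begin{align} \frac{\partial f}{\partial\rho}&=\cos\theta+\sin\theta=\frac{\xi+\eta}{\sqrt{\xi^2+\eta^2}}, & \frac{\partial f}{\partial\theta}&=\rho(\cos\theta-\sin\theta)=\xi-\eta. \end{align} If instead the single angular chart $\theta$ on the first quadrant of $S^1$ was intended, then $(f\circ\theta^{-1})(t)=\cos t+\sin t$, there is only one partial derivative, and $\frac{\partial f}{\partial\theta}=\cos\theta-\sin\theta$ as a function on that arc. In both cases the letters $a,b,t$ are just names for slots in $\Bbb{R}^2$ or $\Bbb{R}$, while $\rho,\theta$ are functions on $U$.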

Having written all of this, I think you should just read directly from a master; almost everything I know/say about notation in these matters comes essentially verbatim from Spivak. In particular, read Spivak’s Vol I, page 35-40 ish. He defines notation, and then he goes through polar coordinates carefully. — peek-a-boo, Jul 13 '24 at 17:50
I don’t see any need to read Spivak. Just read this answer and try to understand every line of it inside out. It might require several tries over some period of time, but it’ll be well worth the effort. A big challenge of differential geometry is the notation. Rigorously defined notation eventually becomes too cumbersome, but I suggest using it at the start. This means rewriting everything you see in Tu and other books into explicit unambiguous notation. In time you get tired of doing it and see how to abbreviate your formulas. Then the abuse of notation starts to make more sense. — Deane, Jul 13 '24 at 18:52
I would add that since different sources use different notation, you eventually have to choose or design your own notation and rewrite everything using that. My adviser used to ask me “have you invented your own notation yet?” — Deane, Jul 13 '24 at 18:54
X: I just want someone who doesn't complain about my differential geometry notation. Y: you joking? I complain about my own differential geometry notation. — Jackozee Hakkiuz, Jul 14 '24 at 00:21

score 0 · Answer 2 · answered Jul 31 '24 at 15:04

In Chapter 2.1 Tu considers the directional derivatives of ordinary real-valued functions $f : U \to \mathbb R$ defined on an open neighborhood of a point $p \in \mathbb R^n$. These can be written as linear combinations of the standard partial derivatives $\dfrac{\partial f}{\partial x^i}\Big|_p = \dfrac{\partial f}{\partial x^i}(p)$. The "denominator" $\partial x^i$ has only a symbolic meaning: We take the partial derivative with respect to the $i$-th coordinate. That is, if we write points of $\mathbb R^n$ in the form $(x^1,\ldots,x^n)$, then we also write $\dfrac{\partial f}{\partial x^i}(p)$. However, the naming of the coordinates is completely irrelevant; we may use (according to taste) upper or lower indices or anything else. For example, if we write $(x,y,z)$ for points of $\mathbb R^3$, we get $\dfrac{\partial f}{\partial x}(p), \dfrac{\partial f}{\partial y}(p), \dfrac{\partial f}{\partial z}(p)$. A neutral substitute for $\dfrac{\partial f}{\partial x^i}$ would be $\partial_i f$ - but it would be somewhat exotic to use this notation.

In Chapter 2.3 Tu gives a coordinate-free approach to the tangent space $T_p(\mathbb R^n)$ by identifying it with the vector space $\mathscr D_p(\mathbb R^n)$ of derivations at $p$, which are defined as linear maps $D : C^\infty_p \to \mathbb R$ satisfying the Leibniz rule. This abstract concept of "tangent space" turns out to be well suited to generalization to arbitrary smooth manifolds.

I think your confusion results from the following text passages:

p.53

In the context of manifolds, we denote the standard coordinates on $\mathbb R^n$ by $r^1, \ldots, r^n$. If $(U,\phi : U \to \mathbb R^n)$ is a chart of a manifold, we let $x^i = r^i \circ \phi$ be the $i$-th component of $\phi$ and write $\phi =(x^1, \ldots ,x^n)$ and $(U,\phi) =(U,x^1, \ldots ,x^n)$. Thus, for $p \in U$, $(x^1(p),\ldots,x^n(p))$ is a point in $\mathbb R^n$. The functions $x^1, \ldots ,x^n$ are called coordinates or local coordinates on $U$. By abuse of notation, we sometimes omit the $p$. So the notation $(x^1, \ldots ,x^n)$ stands alternately for local coordinates on the open set $U$ and for a point in $ \mathbb R^n$.

p.67

6.6 Partial Derivatives

On a manifold $M$ of dimension $n$, let $(U,\phi)$ be a chart and $f$ a $C^\infty$ function. As a function into $\mathbb R^n$, $\phi$ has $n$ components $x^1, \ldots, x^n$. This means that if $r^1, \ldots, r^n$ are the standard coordinates on $\mathbb R^n$, then $x^i = r^i \circ \phi$. For $p \in U$, we define the partial derivative $\partial f / \partial x^i$ of $f$ with respect to $x^i$ at $p$ to be $$\frac{\partial f}{\partial x^i} \Big|_p := \frac{\partial f}{\partial x^i}(p) := \frac{\partial(f \circ \phi^{-1})}{\partial r^i}(\phi(p)) := \frac{\partial(f \circ \phi^{-1})}{\partial r^i} \Big|_{\phi(p)} .$$ Since $p = \phi^{-1}(\phi(p))$, this equation may be rewritten in the form $$\frac{\partial f}{\partial x^i}(\phi^{-1}(\phi(p))) = \frac{\partial(f \circ \phi^{-1})}{\partial r^i}(\phi(p)) .$$ Thus, as functions on $\phi(U)$, $$\frac{\partial f}{\partial x^i} \circ \phi^{-1} = \frac{\partial(f \circ \phi^{-1})}{\partial r^i} .$$ The partial derivative $\partial f / \partial x^i$ is $C^\infty$ on $U$ because its pullback $(\partial f/ \partial x^i) \circ \phi^{-1}$ is $C^\infty$ on $\phi(U)$.

The use of the word "coordinates" is a bit ambiguous. On p. 53 it is used in the sense of "coordinate functions". In fact, let $(U,\phi)$ be a chart on $M$. According to Tu $\phi$ is a map $\phi : U \to \mathbb R^n$ which maps $U$ homeomorphically onto an open subset of $\mathbb R^n$. Letting $\pi^i : \mathbb R^n \to \mathbb R$ denote the projection onto the $i$-th coordinate, we get the $n$ coordinate functions $\pi^i \circ \phi : U \to \mathbb R$ of $\phi$. Tu writes $x^i = \pi^i \circ \phi$ which is a highly symbolic notation. Perhaps it would be better to write $x^i_\phi$ or something else, but let us accept Tu's convention. The idea is

By abuse of notation, we sometimes omit the $p$. So the notation $(x^1, \ldots ,x^n)$ stands alternately for local coordinates on the open set $U$ and for a point in $ \mathbb R^n$.

Here you see the ambiguity: The $x^i$ are coordinate functions, but sometimes they should be understood as the coordinates of a point $x \in \mathbb R^n$.

On $\mathbb R^n$ we have the canonical chart $id_{\mathbb R^n} : \mathbb R^n \to \mathbb R^n$ and in this special case Tu writes $r^i = \pi^i \circ id_{\mathbb R^n} = \pi^i$. So yes, $r^i$ is the projection onto the $i$-th coordinate. I guess Tu's motivation is to reserve $r^i$ for the "standard Euclidean case" so that the reader knows at first glance that Tu works on $\mathbb R^n$ or on some open subset of $\mathbb R^n$. The $x^i$ notation is used in the general manifold case.

$\dfrac{\partial}{\partial r^i}$ is nothing else than the usual partial derivative which applies to functions living on open subsets of $\mathbb R^n$. This was denoted by $\dfrac{\partial}{\partial x^i}$ in Chapter 2.1, but Tu changed notation to indicate that $\dfrac{\partial}{\partial r^i}$ means the standard Euclidean case. It is just a mnemonic help for the reader. When we see $\dfrac{\partial}{\partial x^i}$, we know that we work in the general manifold case which involves a chart $\phi$.

$\dfrac{\partial f}{\partial r^i}(p)$ does not mean that Tu wants to introduce a concept of a partial derivative with respect to a function. It stands for the ordinary partial derivative with respect to the $i$-th coordinate. In fact, there is no reasonable way to define a partial derivative of a function $f : U \to \mathbb R$ with respect to another function $g : U \to \mathbb R$ (here: $r^i$).

On open subsets of manifolds we do not have a natural addition and a natural scalar multiplication, thus the standard definition of partial or directional derivatives used in calculus does not apply. However, we can use any chart $\phi$ to get "Euclidean coordinates" locally on $M$. The precise definition of $\dfrac{\partial f}{\partial x^i}(p)$ as a derivation in $\mathscr D_p( M)$ was given on p.67. It necessarily involves $\phi$, though $\phi$ is hidden in $x^i$. In some vague sense we can try to imagine $\dfrac{\partial f}{\partial x^i}(p)$ as the partial derivative of $f$ at $p$ with respect to the coordinate function $x^i$, but we should not do so. Just take the proper definition and understand that $x^i$ gives a "direction" in a local coordinate system.