Intuitive understanding of covariant derivative

Question

I've been self-studying Semi-Riemannian Geometry by Newman for a while now and have reached the section on curvature. At this point rather than rush through definitions and theorems, I want to understand the concepts properly. In accordance with SE guidelines, I'll break my doubts into multiple questions.

This question concerns intrinsic differential geometry on smooth manifolds. As a beginner to the subject, my understanding so far is:

1. A connection is a map $\nabla:\mathfrak{X}(M)\times\mathfrak{X}(M)\to\mathfrak{X}(M)$ satisfying a few properties.
2. Because of those properties and a previous theorem, given a vector field $X$ and connection $\nabla$, there exists a tensor derivation $\nabla_X$ that can act on all tensors - this is the covariant derivative.
3. Because of a couple of other results ($\nabla_X$ is defined pointwise), $\nabla_v$ is well-defined for any given vector $v$ on the manifold.
4. Given a smooth curve $\gamma$ and $C^{\infty}(M)$ function $f$, we have the result $$\frac{d\gamma}{dt}(f)=\frac{d(f\circ\gamma)}{dt}$$ which gives me the intuition for $d\gamma/dt$ as the directional derivative operator (for a scalar field) along the curve $\gamma$.
5. We can restrict some vector field $X$ to the curve to give a smooth vector field $X\circ\gamma\in\mathfrak{X}_M(\gamma)$. Then we also define the covariant derivative on $\gamma$, which is $\frac{\nabla}{dt}:\mathfrak{X}_M(\gamma)\to\mathfrak{X}_M(\gamma)$. This satisfies $$\frac{\nabla(X\circ\gamma)}{dt}(t_0)=\nabla_{(d\gamma/dt)(t_0)}(X)$$

I'm finding it difficult to get intuition about the covariant derivative. For example the intuition (for what $d\gamma/dt$ is) in point 4 can be read from the equation: we measure $f$ at $\gamma(t_0)$. Then, on making an infinitesimal change to $t$, we progress slightly "further along" the curve $\gamma$ and measure $f$ again. The change in $f$ is basically the $d\gamma/dt$ operator applied to $f$, so the operator is the directional derivative for a scalar field along $\gamma$.

Is there any similar reasoning that follows by looking at the expression for / properties of $\nabla_X$? (or $\nabla_v$ where $v\in T_p(M)$ for some $p\in M$)

peek-a-boo · Accepted Answer · 2023-02-07T21:41:57.917

5

One more thing you should add to the list is that you can define for each curve $\gamma:[a,b]\to M$, and each $a\leq t_1\leq t_2\leq b$ a linear mapping $P_{\gamma,t_1,t_2}:T_{\gamma(t_1)}M\to T_{\gamma(t_2)}M$, called the parallel-transport along $\gamma$ from $\gamma(t_1)$ to $\gamma(t_2)$. This is in fact a linear isomorphism, and it is defined by solving a certain ODE. So, now that you have this parallel-transport isomorphism, you can relate this to covariant derivatives pretty easily.

Let $v\in T_pM$ and let $X$ be a vector field on $M$. By virtue of $v$ being a tangent vector, I can find a curve $\gamma:(-\epsilon,\epsilon)\to M$ such that $\gamma(0)=p$ and $\gamma'(0)=v$. Then, you can show (essentially a one- or two-line calculation by unwinding all the definitions...) that \begin{align} \nabla_vX&=\frac{d}{dt}\bigg|_{t=0}P_{\gamma,0,t}^{-1}(X(\gamma(t))). \end{align} What is this saying? Well, let's just unwind the formula. For each $t\in (-\epsilon,\epsilon)$ I have a vector $X(\gamma(t))\in T_{\gamma(t)}M$. Using the inverse parallel-transport map, I have $P_{\gamma,0,t}^{-1}:T_{\gamma(t)}M\to T_{\gamma(0)}M=T_pM$. Applying this to the vector $X(\gamma(t))$ gives me a resulting vector $\psi(t):=P_{\gamma,0,t}^{-1}(X(\gamma(t)))\in T_pM$; in other words, for each $t\in (-\epsilon,\epsilon)$, I have a vector $\psi(t)\in T_pM$, i.e. a curve $\psi:(-\epsilon,\epsilon)\to T_pM$. This is one curve with values in a single vector space, so by basic calculus, I can calculate its derivative $\psi'(0)$. This is precisely what $\nabla_vX$ is.
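
For readers who want the short calculation spelled out, here is one way it can go. It is a sketch under the assumption that we pick a basis $e_1,\dots,e_n$ of $T_pM$ and set $E_i(t):=P_{\gamma,0,t}(e_i)$; the names $e_i$, $E_i$, $a^i$ are introduced here for illustration, and $\frac{D}{dt}$ denotes the covariant derivative along $\gamma$ (written $\frac{\nabla}{dt}$ in the question). Each $E_i$ is parallel along $\gamma$, so $\frac{DE_i}{dt}=0$, and since $P_{\gamma,0,t}$ is an isomorphism the $E_i(t)$ form a basis of $T_{\gamma(t)}M$. Writing $X(\gamma(t))=a^i(t)E_i(t)$ with smooth coefficients $a^i$ (summation over $i$ implied), \begin{align*} \frac{D(X\circ\gamma)}{dt}(0)&=\dot a^i(0)\,E_i(0)+a^i(0)\,\frac{DE_i}{dt}(0)=\dot a^i(0)\,e_i,\\ P_{\gamma,0,t}^{-1}\big(X(\gamma(t))\big)&=a^i(t)\,P_{\gamma,0,t}^{-1}(E_i(t))=a^i(t)\,e_i,\\ \frac{d}{dt}\bigg|_{t=0}P_{\gamma,0,t}^{-1}\big(X(\gamma(t))\big)&=\dot a^i(0)\,e_i. \end{align*} The first and last lines agree, and the first equals $\nabla_vX$ by the relation between the covariant derivative along $\gamma$ and $\nabla$ stated in point 5 of the question.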

So, the covariant derivative tells me how much a vector field $X$ changes (to first order... hence the first derivative) when I parallel-transport it back from various points on the curve to a single point $\gamma(0)$. So, if you were to abuse notation and ignore that things live in different vector spaces, you'd be tempted to write that for very small $t$, $X(\gamma(t))\approx X(\gamma(0))+t (\nabla_vX)$ (this is an abuse of notation since the LHS lives in $T_{\gamma(t)}M$ while the RHS lives in $T_pM$).

I should note that this procedure is more difficult to visualize, because unless the manifold and the connection you provide are very simple, it is pretty hard to visualize first of all the tangent bundle $TM$, and second of all what the parallel-transport map along different curves looks like. Note that even if two curves have the same endpoints, they may induce different parallel-transport maps (a sign of non-zero Riemann curvature). Parallel-transport is obtained by solving an ODE, so this really is more complicated. It is only for simple cases like Euclidean space or spheres that we can readily visualize things.
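
As a concrete instance of that ODE, here is a sketch on the unit sphere $S^2$ with the round metric $d\theta^2+\sin^2\theta\,d\phi^2$ and its Levi-Civita connection; the coordinates $(\theta,\phi)$, the latitude $\theta_0$, and the component names $a,b$ are choices made for this illustration. In coordinates, a parallel field $u$ along $\gamma$ solves $\dot u^k+\Gamma^k_{ij}\,\dot\gamma^i u^j=0$, and the only nonzero Christoffel symbols are $\Gamma^\theta_{\phi\phi}=-\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi}=\Gamma^\phi_{\phi\theta}=\cot\theta$. Along the circle of latitude $\gamma(t)=(\theta_0,t)$, $0\le t\le 2\pi$, we have $\dot\gamma=\partial_\phi$, so \begin{align*} \dot u^\theta&=\sin\theta_0\cos\theta_0\,u^\phi, & \dot u^\phi&=-\cot\theta_0\,u^\theta. \end{align*} In the orthonormal components $a:=u^\theta$, $b:=\sin\theta_0\,u^\phi$ this reads \begin{align*} \dot a&=\cos\theta_0\,b, & \dot b&=-\cos\theta_0\,a, \end{align*} which is a rotation at constant angular speed $\cos\theta_0$. After one full loop the transported vector has turned by the angle $2\pi\cos\theta_0$ relative to the frame $\big(\partial_\theta,\tfrac{1}{\sin\theta_0}\partial_\phi\big)$: not at all on the equator ($\theta_0=\pi/2$, a geodesic), and by a nontrivial amount on every other latitude, even though the loop returns to its starting point.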

One other special case is if you have a manifold $M$ and an embedded submanifold $S$, and a Riemannian metric on $M$. Then, the Levi-Civita connection on $S$ is the orthogonal projection of the one on $M$. In particular, if $M=\Bbb{R}^n$, then the covariant derivatives are the good old directional derivatives of vector-valued functions, so to find the covariant derivative on a submanifold $S$, you orthogonally project to the tangent spaces of $S$ (this was very quick, but see e.g. Lee's Riemannian geometry text for more information).
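
To make the projection recipe concrete, here is a small worked sketch, assuming $S=S^2\subset\Bbb{R}^3$ is the unit sphere with the induced metric, so that the unit normal at $p\in S^2$ is $p$ itself. The symbols $D_v$ for the Euclidean directional derivative (computed along any curve in $S^2$ with velocity $v$) and $\langle\cdot,\cdot\rangle$ for the dot product are notation chosen here. For $v\in T_pS^2$ and a tangent vector field $X$ on $S^2$, $$\nabla_vX=D_vX-\langle D_vX,p\rangle\,p.$$ Take the equator $\gamma(t)=(\cos t,\sin t,0)$ and the field $X(\gamma(t))=\gamma'(t)=(-\sin t,\cos t,0)$ along it. The ordinary derivative is $\frac{d}{dt}\gamma'(t)=(-\cos t,-\sin t,0)=-\gamma(t)$, which is purely normal, so its tangential projection vanishes: $$\frac{D\gamma'}{dt}=-\gamma(t)-\langle-\gamma(t),\gamma(t)\rangle\,\gamma(t)=0.$$ So the velocity of the equator has zero covariant derivative on the sphere although its Euclidean derivative is not zero, matching the fact that great circles are geodesics.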

edited Feb 07 '23 at 21:41

answered Feb 07 '23 at 21:31

peek-a-boo


So in terms of the terminology in point 5 of my question, the $\nabla_vX$ in the LHS of the equation you've given is essentially: $\frac{\nabla(X\circ\gamma)}{dt}(0)$? – Shirish Feb 07 '23 at 21:46
@ShirishKulhari yes. though I hate that notation especially with $\nabla$ on top. I've seen $\frac{D}{dt}$ more commonly. – peek-a-boo Feb 07 '23 at 21:47

maybe you don't want to get into it right now, because it could be an overwhelming amount of definitions, but eventually if you look at Ehresmann's very geometrically appealing definition of a connection in terms of horizontal subspaces, then some of my remarks in On which tangent bundles of $\Bbb{R}^2$ does position, velocity, acceleration live? may be helpful. – peek-a-boo Feb 07 '23 at 21:49
Thanks! Your explanation makes sense - so if I were to calculate the covariant derivative along a curve $\delta$ that is parallel w.r.t. the connection (i.e. $D/dt=0$ along the curve), then naturally there would be "no change" in a vector field along it. In the sense that if we have any random vector field $Y$ along that curve, then parallel transporting $Y(\delta(t))$ back to point $p$ would just give us $Y(\delta(0))$ irrespective of $t$ or $Y$. – Shirish Feb 07 '23 at 22:01
@ShirishKulhari actually no. The covariant derivative of $\delta'$ along $\delta$ being zero just means $\delta$ is a geodesic. That has no implication on the parallel-transport of $Y$ along $\delta$. For example, think of the sphere $S^2$. Geodesics are great circles. I can start at the north pole $N$ with a vector 'out of the page'. If I travel along different geodesics to the south pole, I'll end up with different vectors (manifestation of the sphere having non-zero curvature). – peek-a-boo Feb 07 '23 at 22:09
But, what we can do is the converse (which is maybe what you're actually interested in). Start with a vector $v\in T_pM$, and a curve $\delta$. Then, I can parallel-transport $v$ along $\delta$ to get a vector field $V$ along the curve. This vector field $V$ will have zero covariant derivative at every point along the curve. – peek-a-boo Feb 07 '23 at 22:12
My bad. I think I shouldn't have written "irrespective of $t$". Because after all if $D/dt=0$, then all I can say is that the vector $Y(\delta(dt))$ infinitesimally away from $p$ along the curve, can be parallel transported back to $p$ to give $Y(\delta(0))$. Hopefully that correction works (still irrespective of $Y$)? – Shirish Feb 07 '23 at 22:13
Random question, but are you by any chance comfortable with second order covariant derivatives? You've helped out with plenty of diff geom questions in the past, so I thought I'd reach out to you about a new question I've asked. If you're familiar with that topic, I'd be really grateful for any help. – Shirish Feb 20 '23 at 11:09
@ShirishKulhari yes I know about second covariant derivatives, but the only thing I'll say is that you should read proper Riemannian geometry books, eg Lee's book. There's nothing I can say which isn't already mentioned there. Learn things systematically, and only use coordinates if it is a must or if it genuinely speeds up calculations (90% of the time it doesn't). – peek-a-boo Feb 20 '23 at 11:52
I'm trying to get a better conceptual grip on it all. Here is the question I'm referring to: https://math.stackexchange.com/questions/4639301/derivation-for-expression-for-second-covariant-derivative . Regarding books, yes that's exactly what I'm trying to do but different books follow wildly different conventions, approaches, etc. I'll continue my search even if it takes time. Meanwhile if you can give any help with the linked question (in case you're comfortable with the topic), then that would be very very helpful. – Shirish Feb 20 '23 at 12:02
@ShirishKulhari the identity you asked about is literally proved in Lee. – peek-a-boo Feb 20 '23 at 18:18
You mean in Intro to Smooth Manifolds? But the covariant derivative isn't even mentioned in that book as far as I could search. Is there any other book you're referring to? The other thing is, I want to know how to resolve the inconsistencies that are coming out of my calculations in that question... – Shirish Feb 20 '23 at 18:24
@ShirishKulhari Intro to Riemannian Manifolds – peek-a-boo Feb 20 '23 at 18:29
Unfortunately I don't have that book in my library :( I only have intro to smooth manifolds. Also as I mentioned I'm still looking to figure out why inconsistencies are popping up in the calculation I made in that question. I have been at it for a while but no luck – Shirish Feb 20 '23 at 18:41
The inconsistencies are probably popping up because you don't have the right definitions / you have too many definitions floating around in your head. Like I said this is literally proved in the book, and it's a good exercise to read the right proof and figure out how to adapt your proof. – peek-a-boo Feb 20 '23 at 18:43
But I already mentioned I don't have the book. I will try to work it out somehow, but obviously if I have a wrong definition/assumption somewhere, then only somebody experienced in diff geom can help - exactly why I asked the question.. – Shirish Feb 20 '23 at 19:47
What's the ODE for parallel transport, if not the one given by the covariant derivative? For example, because I don't understand the covariant derivative, I don't understand the ODE here http://staff.ustc.edu.cn/~wangzuoq/Courses/16S-RiemGeom/Notes/Lec10.pdf. – D.R. Apr 20 '23 at 02:17
@D.R. yes it’s the ODE given from $\nabla$, i.e. $\frac{du^{\alpha}}{dt}+\Gamma^{\alpha}_{i\beta}(\gamma(t))\dot{\gamma}^{i}(t) u^{\beta}(t)=0$. Here, $\gamma$ is the given curve (takes values in the manifold $M$), $\Gamma$ are the Christoffel symbols, $u$ is the curve we’re trying to solve for (takes values in the vector bundle $E$, say $TM$ for concreteness). – peek-a-boo Apr 20 '23 at 03:24
right but why is this a more intuitive definition of the covariant derivative? The purpose of your answer is to build up the covariant derivative from the supposedly more intuitive parallel transport, but this ODE does not seem easy to intuit. – D.R. Apr 20 '23 at 19:47
Anyway the reason I wrote this answer is because parallel transport is relatively intuitive on the sphere, yet OP makes no mention of the geometric interpretation of covariant derivatives via parallel-transport. – peek-a-boo Apr 20 '23 at 20:12
@D.R. see my penultimate paragraph. At the simplest level, start with a vector space $X$, and a fixed subspace $V\subset X$. Then, we choose a complement $H$, so that we have an internal direct sum decomposition $X=H\oplus V$. Then, for each $\xi\in X$, you have a unique decomposition $\xi=\xi_H+\xi_V$ with $\xi_H\in H$ and $\xi_V\in V$, or another way to say it is we have the projection maps $P_H:X\to H$ and $P_V:X\to V$, so that $\xi_H=P_H(\xi)$ and $\xi_V=P_V(\xi)$. Now of course, unless $V=X$ (i.e. $H=\{0\}$), we don’t expect $\xi_V=\xi$… there will be some linear modification. – peek-a-boo Apr 20 '23 at 20:23
i.e. $\xi_V=P_{V}(\xi)$ is some linear operator acting on $\xi$. This is almost exactly what is happening for covariant derivatives using Ehresmann’s definition (see my first link above) and the weird $\Gamma$ terms. Except now we need to be more general. $X$ is now the tangent bundle $TE$, $V$ is the subbundle $\ker(T\pi)$, and now one has to choose a smooth complementary subbundle $H$ (basically do the above vector space decomposition smoothly at each point). The Christoffel symbols pop up merely because we’re projecting a vector $\xi$ (=$du/dt$ which lives in $TE$) to the subbundle $V$. – peek-a-boo Apr 20 '23 at 20:29
@peek-a-boo This is a really great answer! The construction you gave is similar to the Lie derivative of a vector field $Y$ with respect to another vector field $X$. I know the two are different and that the Lie derivative cannot be defined with respect to a single vector but I'm having trouble seeing exactly what the difference is. From what I know, this is because in coordinates the Lie derivative depends on the derivative of the coefficients of $X$ and hence depends on a neighborhood and not just a single point. – CBBAM Jul 30 '24 at 18:27
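A coordinate sketch of this projection, under assumptions that are not stated in the comments themselves: take local coordinates $(x^i,u^\alpha)$ on $E$, so that $V=\ker(T\pi)$ is spanned by the $\frac{\partial}{\partial u^\alpha}$, and suppose the chosen complement $H$ is spanned at the point $(x,u)$ by the vectors $\frac{\partial}{\partial x^i}-\Gamma^\alpha_{i\beta}(x)\,u^\beta\frac{\partial}{\partial u^\alpha}$ (the linear-connection case, with the sign convention that matches the parallel-transport ODE quoted earlier in the comments). The velocity of a curve $t\mapsto(\gamma(t),u(t))$ in $E$ then splits as \begin{align*} \xi&=\dot\gamma^i\frac{\partial}{\partial x^i}+\dot u^\alpha\frac{\partial}{\partial u^\alpha}\\ &=\underbrace{\dot\gamma^i\Big(\frac{\partial}{\partial x^i}-\Gamma^\alpha_{i\beta}u^\beta\frac{\partial}{\partial u^\alpha}\Big)}_{\xi_H}+\underbrace{\Big(\dot u^\alpha+\Gamma^\alpha_{i\beta}\dot\gamma^i u^\beta\Big)\frac{\partial}{\partial u^\alpha}}_{\xi_V}. \end{align*} The vertical part $\xi_V$ has exactly the components appearing in that ODE, so $u$ is parallel along $\gamma$ precisely when $\xi_V=0$, i.e. when the lifted curve is horizontal.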
@peek-a-boo Is the difference between the two that the connection defines the parallel transport, whereas the flow lines of $X$ determine the "rate of change" for the Lie derivative? Visually it almost seems like they are doing the same thing. The difference between them becomes even blurrier when I compare $\nabla_XY$ and $\mathcal{L}_XY$, both seem to be defined by the flow lines of $X$. – CBBAM Jul 30 '24 at 18:28
@CBBAM yes that’s right, the connection fully defines the parallel transport maps along arbitrary (piecewise smooth) curves. We don’t need any other information. But the Lie derivative as you said uses the entire flow of the vector field to pullback the vector field you’re differentiating; we need a vector field to get started. – peek-a-boo Jul 30 '24 at 19:57
And another huge difference is that connections can be defined in any vector/principal bundle, but the Lie derivative only acts on sections of (tensor products of) the tangent bundle. This is because given a diffeomorphism $\Phi:M\to M$, there is a natural way to lift it to a diffeomorphism on the various tensor bundles, but there’s no natural lift to arbitrary vector bundles. Hence there’s no Lie derivative of sections of vector bundles (though you can try to define a ‘covariant Lie derivative’ relative to a connection). – peek-a-boo Jul 30 '24 at 20:02
@peek-a-boo So would it be correct to say that the covariant derivative in some sense lets you choose how the vector field $Y$ changes along the flow of the vector field $X$ by the choice of connection? – CBBAM Jul 30 '24 at 20:11
@CBBAM No, the covariant derivative $\nabla_XY$ has nothing to do with the flow of $X$. – peek-a-boo Jul 30 '24 at 23:26
@peek-a-boo Maybe I've misunderstood something conceptually, but I thought both the covariant derivative and the Lie derivative measure the infinitesimal change in $Y$ in the direction of $X$, and the latter is precisely the flow lines. Is this incorrect? – CBBAM Jul 31 '24 at 02:08
Let us continue this discussion in chat. – peek-a-boo Jul 31 '24 at 11:15

score 1 · Answer 2 · answered Feb 08 '23 at 00:02

The covariant derivative is the closest substitute for the directional derivative in $\mathbb{R}^n$. Think of the directional derivative as

\begin{align*}d:\mathfrak{X}(\mathbb{R}^n)\times C^\infty(\mathbb{R}^n)&\to C^\infty(\mathbb{R}^n)\\ (X,f)&\mapsto df(X)=Xf, \end{align*} where the last equality is notational. This notation is justified by the convention that when $X$ is a coordinate vector field $\frac{\partial}{\partial x^i}$, we recover the calculus notation $df\left(\frac{\partial}{\partial x^i}\right)=\frac{\partial f}{\partial x^i}$. Notice we have two inputs:

One input takes a vector field $X$, which gives the direction in which to differentiate.
The other input is the object being differentiated.

Furthermore, observe that the output is the same kind of object as the one you differentiated. If we fix a specific point $p\in \mathbb{R}^n$, the result at $p$ depends on the first input only through the value of the vector field at $p$. In other words, we say the first input is tensorial. Explicitly, for $f,g,h\in C^\infty(\mathbb{R}^n)$ and $X,Y\in \mathfrak{X}(\mathbb{R}^n)$, tensoriality means the following equality of functions holds: $$(gX+hY)f=g(Xf)+h(Yf).$$ One consequence is that if $X_p=0$, then $(Xf)(p)=0$. This is wildly false for the second input; you know from calculus that a function having a zero at $p$ does not mean its derivative will vanish at $p$. Instead, as you know, being a critical point depends on the local behavior of $f$. Locality is illustrated by the second input obeying the Leibniz rule, $$X(f\cdot g)=Xf\cdot g+f\cdot Xg.$$ That the Leibniz rule implies locality can be shown by a standard bump function argument, which you should attempt yourself if you haven't seen it.

The next step is to allow ourselves to take the derivative of more objects than just functions on $\mathbb{R}^n$. For example, we may want to take the directional derivative of a vector field, and the result will be another vector field. Generally, the same should hold true for any tensor field.

To generalize all this to manifolds, we simply replace $\mathbb{R}^n$ with $M$ and $d$ with $\nabla$ in the above discussion to recover the definition of a covariant derivative. The reason for bullet points $4$ and $5$ essentially comes down to tensoriality of the direction input. Tensoriality allows us to restrict the input which takes in the direction vector to "any codimension we like". This is why we may take derivatives along curves with $\nabla$.
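
Spelled out, the two properties discussed above become the defining rules of a connection. This is the standard definition, written here in this answer's notation with $f,g\in C^\infty(M)$ and $X,Y,Z\in\mathfrak{X}(M)$: \begin{align*} \nabla_{fX+gY}Z&=f\,\nabla_XZ+g\,\nabla_YZ &&\text{(tensorial in the direction slot)},\\ \nabla_X(fY)&=(Xf)\,Y+f\,\nabla_XY &&\text{(Leibniz rule in the slot being differentiated)}, \end{align*} together with additivity $\nabla_X(Y+Z)=\nabla_XY+\nabla_XZ$. On $\mathbb{R}^n$ the componentwise directional derivative $\nabla_XY=X(Y^k)\frac{\partial}{\partial x^k}$ satisfies all three.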

Finally, contrast this with another notion of a derivative, namely the Lie derivative. In my view, the main reason the Lie derivative isn't a suitable replacement for the usual derivative is exactly that it fails to be tensorial in the direction input. There is an exercise somewhere in Lee's book demonstrating exactly this.
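
A short check of this failure, using the standard identity $\mathcal{L}_XY=[X,Y]$: for $f\in C^\infty(M)$ and vector fields $X,Y$, \begin{align*} \mathcal{L}_{fX}Y=[fX,Y]&=f\,[X,Y]-(Yf)\,X=f\,\mathcal{L}_XY-(Yf)\,X,\\ \nabla_{fX}Y&=f\,\nabla_XY. \end{align*} The extra term $-(Yf)X$ involves derivatives of $f$, so $(\mathcal{L}_XY)_p$ depends on $X$ near $p$ and not only on $X_p$. For instance, on $\mathbb{R}$ with $X=x\frac{\partial}{\partial x}$ and $Y=\frac{\partial}{\partial x}$ we get $[X,Y]=-\frac{\partial}{\partial x}$, which is nonzero at $x=0$ even though $X$ vanishes there.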

I understand the last paragraph, but I'm always left wondering "so what is the Lie derivative good for?" How should I think of it if not as a directional derivative? I know it measures the failure of the flows to commute, but in what sense should I think of that as a derivative? — Charles Hudgins, Feb 08 '23 at 02:15
One of the most substantial uses I've seen is the Frobenius theorem. There are also various quantities in Riemannian geometry where you're forced to consider the Lie derivative. Generally speaking, in most cases I've seen the Lie derivative comes in the form of the Lie bracket. But my view is narrow, someone with more experience should probably give their perspective. — Mr. Brown, Feb 08 '23 at 02:50
