Deriving the gradient operator in spherical coordinates

Question

This is a sort of problem where I know what to do but do not completely understand what I am doing.

I have been taught how to derive the gradient operator in spherical coordinate using this theorem

$$\vec{\nabla}=\hat{x}\frac{\partial}{\partial x}+\hat{y}\frac{\partial}{\partial y}+\hat{z}\frac{\partial}{\partial z}=a\hat{r}\frac{\partial}{\partial r}+b\hat{\theta}\frac{\partial}{\partial \theta}+c\hat{\phi}\frac{\partial}{\partial \phi}$$

where a, b, c can be found using this 2 step method.

Derive the holonomic spherical bases by applying the chain rule.

$$\begin{cases}\frac{\partial}{\partial r} = \frac{\partial x}{\partial r}\frac{\partial}{\partial x}+\frac{\partial y}{\partial r}\frac{\partial}{\partial y}+\frac{\partial z}{\partial r}\frac{\partial}{\partial z}\\\frac{\partial}{\partial \theta} = \frac{\partial x}{\partial \theta}\frac{\partial}{\partial x}+\frac{\partial y}{\partial \theta}\frac{\partial}{\partial y}+\frac{\partial z}{\partial \theta}\frac{\partial}{\partial z}\\\frac{\partial}{\partial \phi} = \frac{\partial x}{\partial \phi}\frac{\partial}{\partial x}+\frac{\partial y}{\partial \phi}\frac{\partial}{\partial y}+\frac{\partial z}{\partial \phi}\frac{\partial}{\partial z}\end{cases}\\\begin{cases}\frac{\partial}{\partial r} = \cos\theta\sin\phi\frac{\partial}{\partial x}+\sin\theta\sin\phi\frac{\partial}{\partial y}+\cos\phi\frac{\partial}{\partial z}\\\frac{\partial}{\partial \theta} = -r\sin\theta\sin\phi\frac{\partial}{\partial x}+r\cos\theta\sin\phi\frac{\partial}{\partial y}\\\frac{\partial}{\partial \phi} = r\cos\theta\cos\phi\frac{\partial}{\partial x}+r\sin\theta\cos\phi\frac{\partial}{\partial y}-r\sin\phi\frac{\partial}{\partial z}\end{cases}\\\begin{cases}\hat{\tilde{r}} = \cos\theta\sin\phi\hat{x}+\sin\theta\sin\phi\hat{y}+\cos\phi\hat{z}\\\hat{\tilde{\theta}} = -r\sin\theta\sin\phi\hat{x}+r\cos\theta\sin\phi\hat{y}\\\hat{\tilde{\phi}} = r\cos\theta\cos\phi\hat{x}+r\sin\theta\cos\phi\hat{y}-r\sin\phi\hat{z}\end{cases}$$

Normalize the holonomic bases using the orthonormality of the cartesian bases.

$$\begin{cases}\hat{r}=a\hat{\tilde{r}}\\\hat{r}\cdot\hat{r}=a^2(\cos^2\theta\sin^2\phi+\sin^2\theta\sin^2\phi+\cos^2\phi)\\a=1\end{cases}\\\begin{cases}\hat{\theta}=b\hat{\tilde{\theta}}\\\hat{\theta}\cdot\hat{\theta}=b^2(r^2\sin^2\theta\sin^2\phi+r^2\cos^2\theta\sin^2\phi)\\b=\frac{1}{r\sin\phi}\end{cases}\\\begin{cases}\hat{\phi}=c\hat{\tilde{\phi}}\\\hat{\phi}\cdot\hat{\phi}=c^2(r^2\cos^2\theta\cos^2\phi+r^2\sin^2\theta\cos^2\phi+r^2\sin^2\phi)\\c=\frac{1}{r}\end{cases}$$

I do not understand where the theorem comes from.

I understand using chain rule spherical bases can be expanded into cartesian ones if I assume that the partial derivative operators are equal to basis vectors, but why am I even allowed to assume so?

I do not understand why am I normalizing the holonomic bases then multiplying (instead of dividing) the normal bases again with the normalizing factor.

I would be thankful if someone clarifies what is actually going on.

Kurt G. · Answer 1 · 2023-07-02T16:48:44.487

"The career of a young differential geometer consists of treating the chain rule in ever increasing levels of abstraction." (after a famous quote by Sidney Coleman)

You can view the operators $\partial_x,\partial_y,\partial_z,\partial_r,\partial_\theta,\partial_\phi$ as basis vector fields because of the chain rule: For a curve $t\mapsto\gamma(t)$ in $\mathbb R^3\,,$ and any smooth enough function $f\,,$ the directional derivative of $f$ along the tangent of $\gamma$ is $$ \frac{d}{dt}(f\circ\gamma)=(\partial_xf)\,\dot\gamma_1+(\partial_yf)\,\dot\gamma_2+(\partial_zf)\,\dot\gamma_3\,. $$ Drop $f$ and consider the RHS as a linear combination of the basis vector fields $\partial_x,\partial_y,\partial_z$ with the components of the tangent vecor at $\gamma\,.$ In other words, $$ \dot\gamma_1\partial_x+\dot\gamma_2\partial_y+\dot\gamma_3\partial_z $$ is a tangent vector.
Transformation laws to other coordinate systems are again nothing else then a consequence of the chain rule. For simplicity I take 2D polar coordinates: \begin{align} \partial_rf&=\partial_xf\,\partial_rx+\partial_yf\,\partial_ry=\cos\phi\,\partial_xf+\sin\phi\,\partial_yf\,,\\[2mm] \partial_\phi f&=\partial_xf\,\partial_\phi x+\partial_yf\,\partial_\phi y=-r\sin\phi\,\partial_xf+r\cos\phi\,\partial_yf\,. \end{align} In matrix notation, and dropping again $f$: $$ \begin{pmatrix}\partial_r\\ \partial_\phi\end{pmatrix}=\begin{pmatrix}\cos\phi & \sin\phi\\ -r\sin\phi & r\cos\phi\end{pmatrix}\begin{pmatrix}\partial_x\\\partial_y\end{pmatrix}\,. $$ A more common way of writing this is to absorb the $r$ into the second basis vector field: $$ \begin{pmatrix}\partial_r\\ \frac{1}{r}\partial_\phi\end{pmatrix}=\begin{pmatrix}\cos\phi & \sin\phi\\ -\sin\phi & \cos\phi\end{pmatrix}\begin{pmatrix}\partial_x\\\partial_y\end{pmatrix}\,. $$ In this format the transformation matrix is orthogonal (strictly speaking orthnormal). The same is true for the Cartesian and polar basis vector fields: they are both orthonormal.
It is however not necessary to do that normalization. One may as well work with the socalled holonomic, or coordinate basis $\partial_r,\partial_\phi\,.$ This is further discussed here.

If I do not normalize then $\vec{\nabla}=\frac{\hat{\tilde{r}}}{a}\frac{\partial}{\partial{r}}+\frac{\hat{\tilde{\theta}}}{b}\frac{\partial}{\partial{\theta}}+ \frac{\hat{\tilde{\phi}}}{c}\frac{\partial}{\partial{\phi}}$ right? But I still need to find a, b, c by normalizing though. I know they are called the diagonal components of the inverse metric tensor, but I have just started reading about differential geometry and do not know a lot about them confidently. — uran42, Jul 03 '23 at 13:11

Nicholas Todoroff · Accepted Answer · 2023-07-13T00:24:22.470

$ \newcommand\vec{\mathbf} \newcommand\pd[2]{\partial#1/\partial#2} \newcommand\PD[2]{\frac{\partial#1}{\partial#2}} $I think that Kurt G.'s answer addresses your question directly, but I want to offer a different perspective that I feel is less "mechanical".

Let's say that $\nabla$ differentiates the vector variable $\vec x$. For any vector $\vec v$, we should recognize that $\vec v\cdot\nabla$ is precisely the $\vec v$-directional derivative operator; this can actually be taken as a definition of $\nabla$. More precisely, $\nabla$ is such that $$ (\vec v\cdot\nabla)f(\vec x) = \lim_{\epsilon\to0}\frac{f(\vec x + \epsilon\vec v) - f(\vec x)}{\epsilon} \tag{Dir} $$ for any sufficiently differentiable function $f$. It is trivial then that $$ (\vec v\cdot\nabla)\vec x = \vec v. \tag{Triv} $$ (Dir) gives us precisely the isomorphism between vectors and differential operators: $$ \vec v \longleftrightarrow \vec v\cdot\nabla. \tag{Iso} $$

If we have a basis $\{\vec e_1, \vec e_2, \vec e_3\}$, then how can we get the components of an arbitrary vector $\vec v$ in this basis? Well, since we can write $$ \vec v = v^1\vec e_1 + v^2\vec e_2 + v^3\vec e_3 $$ for scalars $v^1, v^2, v^3$ what we can do is look for vectors $\vec e^1, \vec e^2, \vec e^3$ such that $$ \vec e^1\cdot\vec e_1 = \vec e^2\cdot\vec e_2 = \vec e^3\cdot\vec e_3 = 1, $$ $$ \vec e^1\cdot\vec e_2 = \vec e^1\cdot\vec e_3 = \vec e^2\cdot\vec e_3 = \vec e^2\cdot\vec e_1 = \vec e^3\cdot\vec e_1 = \vec e^3\cdot\vec e_2 = 0. \tag{Recip} $$ (Here for $v^i$ and $\vec e^i$ we are using superscripts as indices, and $\vec e^i$ is a separate variable from $\vec e_i$; we do this because it leads to very suggestive notation.) Then easily we have $$ \vec e^1\cdot\vec v = v^1,\quad \vec e^2\cdot\vec v = v^2,\quad \vec e^3\cdot\vec v = v^3. $$ The equations (Recip) are always uniquely solvable since $\{\vec e_1, \vec e_2, \vec e_3\}$ is a basis; they yield the reciprocal basis $\{\vec e^1, \vec e^2, \vec e^3\}$. Notice that the reciprocal of $\{\vec e^1, \vec e^2, \vec e^3\}$ is $\{\vec e_1, \vec e_2, \vec e_3\}$ so that $$ \vec v = v_1\vec e^1 + v_2\vec e^2 + v_3\vec e^3 \implies \vec e_i\cdot\vec v = v_i. \tag{$*$} $$

Now suppose we parameterize $\vec x$ with coordinates, yielding a function $\vec x(x^1, x^2, x^3)$. There are two ways that we can naturally derive a basis from these coordinates.

Associate to $x^i$ a vector which is orthogonal to the surface $x^i = a^i$ at the point $\vec x(a^1, a^2, a^3)$ and which points in the direction of increasing $x^1$. The basic theory of gradients tells us that $\vec e^i = \nabla x^i$ is such a vector.
At the point $\vec x(a^1, a^2, a^3)$, associate to e.g. $x^1$ a vector tangent to the line of intersection of the surfaces $x^2 = a^2$ and $x^3 = a^3$; because $x^2, x^3$ are fixed on this line, only $x^1$ varies, so $\vec e_1 = \pd{\vec x}{x^1}$ is such a vector. Similarly obtain $\vec e_i = \pd{\vec x}{x^i}$. Note also that this is the directional derivative along $\vec e_i$; in fact from (Triv) we get the consistency equation $$ \PD{\vec x}{x^i} = (\vec e_i\cdot\nabla)\vec x = \vec e_i. $$ The equality $\pd{}{x^i} = \vec e_i\cdot\nabla$ says that under (Iso) $\vec e_i$ is in correspondence with $\pd{}{x^i}$.

The basis $\{\vec e_1, \vec e_2, \vec e_3\}$ is called the coordinate or holonomic basis, and the above notations $\vec e_i$ and $\vec e^i$ are very intentional as the above definitions make clear that these bases are reciprocal. It's often easier to find the coordinate basis and then find its reciprocal instead of trying to calculate $\nabla x^i$ directly.

If we want $\nabla$ to be vector-like, we ought to have an expansion like ($*$): $$\begin{aligned} \nabla &= \vec e^1(\vec e_1\cdot\nabla) + \vec e^2(\vec e_2\cdot\nabla) + \vec e^3(\vec e_3\cdot\nabla) \\ &= \vec e^1\PD{}{x^1} + \vec e^2\PD{}{x^2} + \vec e^3\PD{}{x^3}. \end{aligned}$$

Let $\{\vec e_x, \vec e_y, \vec e_z\}$ be the standard basis; this is also the coordinate basis for Cartesian coordinates $(x, y, z)$ at all points and is its own reciprocal, hence why $$ \nabla = \vec e_x\PD{}x + \vec e_y\PD{}y + \vec e_z\PD{}z. $$ Spherical coordinates $(r, \theta, \phi)$ are related to Cartesian coordinates by $$ x = r\sin\theta\cos\phi,\quad y = r\sin\theta\sin\phi,\quad z = r\cos\theta. \tag{Sph} $$ Since the Cartesian position vector is $\vec x(x,y,z) = x\vec e_x + y\vec e_y + z\vec e_z$, the spherical position vector is $$ \vec x(r,\theta,\phi) = r(\vec e_x\sin\theta\cos\phi + \vec e_y\sin\theta\sin\phi + \vec e_z\cos\theta). $$ We can now find the spherical coordinate basis $$ \vec e_r(r,\theta,\phi) = \PD{\vec x}r = \vec e_x\sin\theta\cos\phi + \vec e_y\sin\theta\sin\phi + \vec e_z\cos\theta, $$ $$ \vec e_\theta(r,\theta,\phi) = \PD{\vec x}\theta = r(\vec e_x\cos\theta\cos\phi + \vec e_y\cos\theta\sin\phi - \vec e_z\sin\theta), $$ $$ \vec e_\phi(r,\theta,\phi) = \PD{\vec x}\phi = r\sin\theta\,(-\vec e_x\sin\phi + \vec e_y\cos\phi). $$ You can compute that this basis is orthogonal; this means that its reciprocal is found just by taking the reciprocal of the magnitudes: $$ \vec e^r(r,\theta,\phi) = \frac{\vec e_r}{|\vec e_r|^2} = \vec e_r(r,\theta,\phi), $$ $$ \vec e^\theta(r,\theta,\phi) = \frac{\vec e_\theta}{|\vec e_\theta|^2} = \frac1{r^2}\vec e_\theta(r,\theta,\phi), $$ $$ \vec e^\phi(r,\theta,\phi) = \frac{\vec e_\phi}{|\vec e_\phi|^2} = \frac1{r^2\sin^2\theta}\vec e_\phi(r,\theta,\phi). $$ Especially when working with orthonormal bases like this, people will use the unit coordinate basis: $$ \hat{\vec e}_r = \frac{\vec e_r}{|\vec e_r|} = \vec e_x\sin\theta\cos\phi + \vec e_y\sin\theta\sin\phi + \vec e_z\cos\theta, $$ $$ \hat{\vec e}_\theta = \frac{\vec e_\theta}{|\vec e_\theta|} = \vec e_x\cos\theta\cos\phi + \vec e_y\cos\theta\sin\phi - \vec e_z\sin\theta, $$ $$ \hat{\vec e}_\phi = \frac{\vec e_\phi}{|\vec e_\phi|} = -\vec e_x\sin\phi + \vec e_y\cos\phi, $$ and in terms of this we have $$ \vec e^r = \hat{\vec e}_r,\quad \vec e^\theta = \frac1r\hat{\vec e}_\theta,\quad \vec e^\phi = \frac1{r\sin\theta}\hat{\vec e}_\phi. $$ Finally, we may write $\nabla$ in spherical coordinates: $$ \nabla = \vec e^r\PD{}r + \vec e^\theta\PD{}\theta + \vec e^\phi\PD{}\phi = \hat{\vec e}_r\PD{}r + \hat{\vec e}_\theta\frac1r\PD{}\theta + \hat{\vec e}_\phi\frac1{r\sin\theta}\PD{}\phi. $$

For fun, lets calculate the reciprocal coordinate basis directly.

First, start with the equation $$ r^2 = x^2 + y^2 + z^2 $$ and apply $\nabla$ to each side in order to get $$ 2r\nabla r = 2(x\nabla x + y\nabla y + z\nabla z). $$ $\{\nabla x, \nabla y, \nabla z\}$ is the reciprocal Cartesian coordinate basis, which is just the standard basis; hence $$ \nabla r = \frac1r(x\vec e_x + y\vec e_y + z\vec e_z) $$ and applying (Sph) we get the desired result.

Next, consider that $$ r\cos\theta = z, $$ and apply $\nabla$ to both sides to get $$ \cos\theta\,\nabla r - r\sin\theta\,\nabla\theta = \nabla z $$ $$\begin{aligned} \implies \nabla\theta &= \frac1{r\sin\theta}(\cos\theta\,\vec e^r - \vec e_z) \\ &= \frac1{r\sin\theta}( \vec e_x\cos\theta\sin\theta\cos\phi + \vec e_y\cos\theta\sin\theta\sin\phi - \vec e_z\sin^2\theta ) \\ &= \frac1r( \vec e_x\cos\theta\cos\phi + \vec e_y\cos\theta\sin\phi - \vec e_z\sin\theta ). \end{aligned}$$

Finally, from the equation $$ \tan\phi = \frac yx $$ and applying $\nabla$ we get $$ \sec^2\phi\,\nabla\phi = \frac1x\nabla y - \frac y{x^2}\nabla x = \frac1r\csc\theta\sec\phi\,(\vec e_y - \vec e_x\tan\phi) $$ $$ \implies \nabla\phi = \frac1{r\sin\theta}(\vec e_y\cos\phi - \vec e_x\sin\phi). $$

Deriving the gradient operator in spherical coordinates

2 Answers2

Linked