3

I'm TA'ing multivariable this semester, and I just noticed that we always tend to normalize all our basis vectors when using polar coordinates. This is in stark contrast to what I'm used to in differential geometry, where we'd prefer that our coordinate basis transform by the law \begin{align*} \frac{\partial}{\partial x}&=\frac{\partial r}{\partial x}\frac{\partial}{\partial r}+\frac{\partial \theta}{\partial x}\frac{\partial}{\partial \theta}\\ &=\cos\theta\frac{\partial}{\partial r}-\frac{\sin\theta}{r}\frac{\partial}{\partial \theta}, \end{align*} and similarly for $\frac{\partial}{\partial y}$. This $\frac{1}{r}$ factor makes up for the fact that if we travel in the angular direction, we cover more ground the further we are from the origin. So, for example, the gradient in these "geometric" polar coordinates would take the form $$\nabla f |_{(r,\theta)}=\left(\frac{\partial f}{\partial r},\frac{1}{r^2}\frac{\partial f}{\partial \theta}\right)$$ which agrees with the usual way of defining gradients locally by $\nabla f=g^{ij}(\partial_if)\partial_j$. This is in opposition to the more common $\frac{1}{r}$ factor, which comes from using the normalized polar coordinate system. So why are we normalizing these coordinates? If you insist on working in an orthonormal frame, why not call it a polar frame instead of polar coordinates to avoid bad practices in the future?

Edit: Let me add an explicit computation with the "geometric" basis (which I learned is called holonomic). Consider $$f(x,y)=\frac{x}{x^2+y^2},$$ so that in polar coordinates, $$f(r,\theta)=\frac{\cos\theta}{r}.$$ One sees: \begin{align*} \nabla f&=\frac{\partial f}{\partial x}\bigg\vert_{(r,\theta)}\frac{\partial}{\partial x}+\frac{\partial f}{\partial y}\bigg\vert_{(r,\theta)}\frac{\partial}{\partial y}\\ &=\frac{\sin^2\theta-\cos^2\theta}{r^2}\left(\cos\theta\frac{\partial}{\partial r}-\frac{\sin\theta}{r}\frac{\partial}{\partial \theta}\right)-\frac{2\cos\theta\sin\theta}{r^2}\left(\sin\theta\frac{\partial}{\partial r}+\frac{\cos\theta}{r}\frac{\partial}{\partial \theta}\right)\\ &=\frac{\sin^2\theta\cos\theta-\cos^3\theta-2\cos\theta\sin^2\theta}{r^2}\frac{\partial}{\partial r}+\frac{-\sin^3\theta+\cos^2\theta\sin\theta-2\cos^2\theta\sin\theta}{r^3}\frac{\partial}{\partial \theta}\\ &=-\frac{\cos\theta}{r^2}\frac{\partial}{\partial r}-\frac{\sin\theta}{r^3}\frac{\partial}{\partial \theta}\\ &=\frac{\partial f}{\partial r}\frac{\partial }{\partial r}+\frac{1}{r^2}\frac{\partial f}{\partial \theta}\frac{\partial}{\partial \theta}. \end{align*}
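The computation above can also be checked numerically. The following is only a minimal sketch, assuming central finite differences are accurate enough at the sample point; the helper names (`f_cart`, `f_polar`, `central_diff`, `grad_via_cartesian`, `grad_via_metric`) are made up for this illustration. It computes the holonomic components of $\nabla f$ once by pushing the Cartesian gradient through $dr$ and $d\theta$, and once from $g^{ij}\partial_i f$ with $g^{rr}=1$, $g^{\theta\theta}=1/r^2$.

```python
import math


def f_cart(x, y):
    """The example function f(x, y) = x / (x^2 + y^2) from the post."""
    return x / (x * x + y * y)


def f_polar(r, theta):
    """The same function expressed in polar coordinates."""
    return f_cart(r * math.cos(theta), r * math.sin(theta))


def central_diff(g, t, h=1e-6):
    """Central finite-difference approximation of g'(t) (step h is an assumption)."""
    return (g(t + h) - g(t - h)) / (2.0 * h)


def grad_via_cartesian(r, theta):
    """Holonomic components obtained from the Cartesian gradient.

    A vector v^x d/dx + v^y d/dy has components
    v^r = cos(theta) v^x + sin(theta) v^y and
    v^theta = (-sin(theta) v^x + cos(theta) v^y) / r
    in the basis {d/dr, d/dtheta}.
    """
    c, s = math.cos(theta), math.sin(theta)
    x, y = r * c, r * s
    fx = central_diff(lambda u: f_cart(u, y), x)
    fy = central_diff(lambda u: f_cart(x, u), y)
    return c * fx + s * fy, (-s * fx + c * fy) / r


def grad_via_metric(r, theta):
    """Holonomic components from g^{ij} d_i f with g^{rr} = 1, g^{theta theta} = 1/r^2."""
    fr = central_diff(lambda u: f_polar(u, theta), r)
    ft = central_diff(lambda u: f_polar(r, u), theta)
    return fr, ft / r**2


if __name__ == "__main__":
    # The two tuples should agree up to finite-difference error.
    print(grad_via_cartesian(2.0, 0.7))
    print(grad_via_metric(2.0, 0.7))
```

Both routes should agree with the closed form $\left(-\frac{\cos\theta}{r^2},\,-\frac{\sin\theta}{r^3}\right)$ up to discretization error; dividing $\partial_\theta f$ by $r$ instead of $r^2$ would give the component along the unit vector $\frac{1}{r}\partial_\theta$ rather than along $\partial_\theta$.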

Mr. Brown
  • 1
    The $\theta$-component of the gradient in polar coordinates has a $\frac1r$ scale factor, not a $\frac1{r^2}$ scale factor. – Mark Viola Sep 26 '22 at 14:46
  • @MarkViola yes, if you use the normalized basis vector fields instead of the holonomic ones. OP addresses this. – Jackozee Hakkiuz Sep 26 '22 at 20:30
  • @JackozeeHakkiuz Why on earth are you addressing my comment when Kurt G. had two comments that effectively align with mine? Moreover, define "geometric" ones. Are you suggesting that the basis vectors are $\hat r$ and $r \hat \theta$ in these "geometric" ones? If this is the case, then so what? The question is in regards to semantics (i.e., nomenclature) only. – Mark Viola Sep 26 '22 at 20:37
  • Hi, @MarkViola. The holonomic basis vector fields are more "geometric" in the sense that they come naturally as the pushforward of the Euclidean basis vector fields along the inverse of the chart map. You might find more about it here. – Jackozee Hakkiuz Sep 26 '22 at 20:46
  • Also, the question is not about nomenclature, that is just a final note. The core of the question is why do a lot of people prefer the normalized basis vector fields instead of the holonomic basis vector fields, which are naturally given by the coordinates themselves. – Jackozee Hakkiuz Sep 26 '22 at 20:51
  • @JackozeeHakkiuz The QUESTIONS are "So why are we normalizing these coordinates? If you insist on working in an orthonormal frame, why not call it a polar frame instead of polar coordinates to avoid bad practices in the future?" These are about semantics. And you never answer my question. Why are you singling my comment out? – Mark Viola Sep 26 '22 at 20:58
  • @JackozeeHakkiuz Normalized basis vectors (i.e., Unit Vectors) are used pervasively in physics and engineering. I contend that they "come more naturally" in vector calculus in curvilinear coordinates. – Mark Viola Sep 26 '22 at 21:01
  • @MarkViola I mean most of the original question is devoted to the choice of basis vectors, and the final line about semantics just suggests a way to clearly distinguish between the choices of basis vectors. Regarding your question (which I didn't answer at the beginning because I don't think it is about maths, but you insist that I answer, so I do), I simply haven't gone through Kurt's comments in detail to see whether I agree or not with his statements. Your comment is about something that is addressed in the OP. I suggest you don't take it personally. – Jackozee Hakkiuz Sep 26 '22 at 21:10
  • Hi Mark, Jackozee has a more accurate interpretation of the spirit of my original question. My last sentence honestly is to blow off steam at bad naming practices that have personally bitten me as a student of DG. I'm really interested in seeing examples of exactly where and why normalized basis vectors come up more naturally in physics or engineering. I'd wager 9 out of 10 differential geometers would recommend that $\frac{\partial}{\partial \theta}$ and $\frac{\partial}{\partial r}$ remain unnormalized. – Mr. Brown Sep 26 '22 at 23:34
  • @KurtG. Regarding your second comment, the vector $(1,1/r)$ with respect to the holonomic basis has length $\sqrt 2$. I guess you meant to say that $(0,1/r)$ has length $1$, which is correct and also says that $(0,1)$ has length $r$, so the polar basis vector is not unit length. OP's question correctly points out that formulas such as $ds^2=dr^2+r^2d\theta^2$ only work if one uses the holonomic basis instead of the normalized one. – Jackozee Hakkiuz Sep 27 '22 at 19:11
  • 2
    @KurtG. OP is using the holonomic basis $\partial_r, \partial_\theta$. What he is saying is that when you express the gradient vector $$\vec\nabla f=(\partial_x f) \partial_x + (\partial_y f) \partial_y$$ in the holonomic basis $\partial_r, \partial_\theta$ you obtain $$\vec\nabla f=(\partial_r f) \partial_r+ \frac{1}{r^2}(\partial_\theta f) \partial_\theta$$. – Jackozee Hakkiuz Sep 27 '22 at 19:59
  • 1
    Re-expressing this with respect to the orthonormal basis $e_r=\partial_r, e_\theta=\frac{1}{r}\partial_\theta$ we of course get the gradient formula from the wiki $$\vec\nabla f=(\partial_r f) e_r+ \frac{1}{r}(\partial_\theta f) e_\theta$$ – Jackozee Hakkiuz Sep 27 '22 at 20:09
  • 1
    @JackozeeHakkiuz . I believe the comment section is getting too small for a discussion. Please see my answer. – Kurt G. Sep 28 '22 at 07:40
  • @ZackFox Here is another recent question from a user who didn't know they needed to use the holonomic basis in order for the formulas to work out correctly. – Jackozee Hakkiuz Oct 03 '22 at 15:05
  • 1
    @JackozeeHakkiuz yet another comrade who got bit by bad pedagogy – Mr. Brown Oct 06 '22 at 02:46

2 Answers

4

I am grateful to @JackozeeHakkiuz who made the comment that your question has to do with holonomic / non holonomic coordinates. This PSE post also treats the subject with the example of 2D polar coordinates:

  • $\{\partial_r,\partial_\theta\}=\{\hat{\boldsymbol{e}}_r,r\hat{\boldsymbol{e}}_\theta\}$ is the holonomic basis which is orthogonal but not orthonormal.

  • In contrast, $\{\partial_r,\frac{1}{r}\partial_\theta\}= \{\hat{\boldsymbol{e}}_r,\hat{\boldsymbol{e}}_\theta\}$ is not holonomic but orthonormal.

  • Clearly, the holonomic operators commute while the non holonomic ones don't.

  • The metric that makes the above-mentioned orthonormality / non-orthonormality happen is (in both cases) the familiar $ds^2=dr^2+r^2\,d\theta^2$ which we can write as the $2$-tensor $$ g=\begin{pmatrix}1&0\\0&r^2\end{pmatrix}\,. $$ Note that $$ g(\hat{\boldsymbol{e}}_\theta,\hat{\boldsymbol{e}}_\theta)= g\Big(\frac{1}{r}\partial_\theta,\frac{1}{r}\partial_\theta\Big)= r^2\frac{1}{r^2}=1\,. $$

  • If we change to a metric that looks Euclidean $$ \eta=\begin{pmatrix}1&0\\0&1\end{pmatrix}\, $$ then the holonomic basis becomes (miraculously) orthonormal. However, the scaling is now built into the basis vector $r\hat{\boldsymbol{e}}_\theta$ which, as we know, is not normalized in the traditional sense using $g\,.$ To me this seems like an easy way to understand what tetrads (moving coordinate systems) are.

  • The factor $\displaystyle \frac{1}{r^2}$ caused a lot of confusion in the comments (including mine) but @JackozeeHakkiuz finally cleared this up: For a differentiable function $f$, the $\mathbb R^2$-valued function $$ \Big(\partial_rf,\frac{1}{r}\partial_\theta f\Big) $$ gives the components of its gradient with respect to the orthonormal polar basis. Using the basis vector fields $\{\partial_r,\frac{1}{r}\partial_\theta\}$ we get the gradient vector field as $$ (\partial_rf)\,\partial_r + \Big(\frac{1}{r}\partial_\theta f\Big)\frac{1}{r}\partial_\theta=(\partial_rf)\,\partial_r + \Big(\frac{1}{r^2}\partial_\theta f\Big)\,\partial_\theta\,. $$ In other words, in the holonomic basis $\{\partial_r,\partial_\theta\}$ the second component carries the factor $\displaystyle\frac{1}{r^2}\,.$

  • The linked PSE post also gives the example of a closed loop out of which I created the following picture. It shows that if we perform a "coordinate loop" starting in $(r,\theta)=(1,0)$ and incrementing these coordinates by the same amounts $\Delta r=\pm 1$ resp. $\Delta\theta=\pm\frac{\pi}{2}$ the loop will be closed provided at $r=2$ the basis vector is scaled up to the holonomic $r\hat{\boldsymbol{e}}_\theta$. The closedness of the loop is clearly related to the vanishing of the commutator $$ [\hat{\boldsymbol{e}}_r,r\hat{\boldsymbol{e}}_\theta]=[\partial_r,\partial_\theta]\,. $$ That order independence (closedness of the loop) must also be the reason why the holonomic basis is called a coordinate basis.

[Figure: the closed "coordinate loop" in polar coordinates starting at $(r,\theta)=(1,0)$]
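The commutator claims in the list above can be checked numerically. This is a rough sketch only: the test function `f`, the step size `H`, and the helper names (`d_r`, `d_theta`, `e_theta`, `commutator`) are assumptions made for illustration, and derivatives are approximated by central differences. Applying the product rule by hand gives $[\partial_r,\tfrac{1}{r}\partial_\theta]f=-\tfrac{1}{r^2}\partial_\theta f$, i.e. $[\hat{\boldsymbol{e}}_r,\hat{\boldsymbol{e}}_\theta]=-\tfrac{1}{r}\hat{\boldsymbol{e}}_\theta$, which is what the code compares against.

```python
import math

H = 1e-4  # finite-difference step (an assumption; trades truncation vs. round-off)


def f(r, theta):
    """Arbitrary smooth test function, chosen only for illustration."""
    return r**2 * math.sin(theta) + r * math.cos(2.0 * theta)


def d_r(g):
    """Holonomic field d/dr acting on a function g(r, theta)."""
    return lambda r, t: (g(r + H, t) - g(r - H, t)) / (2.0 * H)


def d_theta(g):
    """Holonomic field d/dtheta acting on a function g(r, theta)."""
    return lambda r, t: (g(r, t + H) - g(r, t - H)) / (2.0 * H)


def e_theta(g):
    """Normalized (non-holonomic) field (1/r) d/dtheta acting on g."""
    dg = d_theta(g)
    return lambda r, t: dg(r, t) / r


def commutator(X, Y, g):
    """[X, Y] g = X(Y g) - Y(X g), returned as a function of (r, theta)."""
    return lambda r, t: X(Y(g))(r, t) - Y(X(g))(r, t)


if __name__ == "__main__":
    r0, t0 = 2.0, 0.7
    # Expected to be close to zero: the coordinate fields commute.
    print(commutator(d_r, d_theta, f)(r0, t0))
    # Expected to be close to -(1/r^2) * df/dtheta, which is nonzero here.
    print(commutator(d_r, e_theta, f)(r0, t0))
```

The first commutator vanishes up to round-off because partial derivatives in different coordinates commute, while the second one picks up the extra term from differentiating the $\frac{1}{r}$ prefactor.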

Kurt G.
  • Hi Kurt, I've updated the original post with an explicit computation which makes the factor of $1/r^2$ pop out. Hope it helps! – Mr. Brown Sep 28 '22 at 21:56
  • @KurtG. Yes, the term $1/r^2$ does appear, and the proof is Zack's calculation. He applied the chain rule correctly, although he could have done it for an arbitrary function $f$. You can check it yourself. It is a well-known fact from differential geometry that the gradient vector of a function $f$ is $\vec\nabla f = \sum_{ij} g^{ij}(\partial_if)\partial_j$, and this coincides with $\vec\nabla f = (\partial_r f) \partial_r + \frac{1}{r^2}(\partial_\theta f)\partial_\theta$ in the case of polar coordinates since $g^{11}=1$ and $g^{22}=\frac{1}{r^2}$. – Jackozee Hakkiuz Sep 29 '22 at 05:03
  • 1
    You can also deduce it going backwards from my last two comments in the main post. From the expression in the orthonormal basis $$\vec\nabla f=(\partial_r f) e_r+ \frac{1}{r}(\partial_\theta f) e_\theta$$ (which I think we all agree with), substitute $e_r=\partial_r$ and $e_\theta=\frac{1}{r}\partial_\theta$, so that you obtain $$\vec\nabla f=(\partial_r f) \partial_r+ \frac{1}{r^2}(\partial_\theta f) \partial_\theta.$$ – Jackozee Hakkiuz Sep 29 '22 at 05:22
  • @Kurt G. Here is another recent question from a user who didn't know they needed to use the holonomic basis in order for the formulas to work out correctly. – Jackozee Hakkiuz Oct 03 '22 at 15:05
0

My naïve guess is simply that people like orthonormal bases so that they can apply a Pythagoras-like formula to get lengths $$\|a e_r + b e_\theta \| =\sqrt{a^2+b^2}$$ and so they avoid introducing the components of the metric tensor. Projection formulas are also easier, like the component of $v$ in the angular direction is $(v\cdot e_\theta)e_\theta$. This last point can be remedied by using the dual basis of the dual space, but again most people avoid talking about linear functionals by using their Riesz representatives.
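To make the contrast concrete, here is a small sketch (the function names `norm_orthonormal`, `norm_holonomic`, and `to_orthonormal` are hypothetical) comparing the Pythagoras-like length formula in the orthonormal frame with the length of a vector given by its components in the holonomic basis $\{\partial_r,\partial_\theta\}$, where the metric $g=\operatorname{diag}(1,r^2)$ has to be used.

```python
import math


def norm_orthonormal(a, b):
    """Length of a*e_r + b*e_theta, with e_r and e_theta orthonormal."""
    return math.hypot(a, b)


def norm_holonomic(a, b, r):
    """Length of a*d/dr + b*d/dtheta using the metric g = diag(1, r^2)."""
    return math.sqrt(a * a + (r * b) ** 2)


def to_orthonormal(a, b, r):
    """Convert holonomic components (a, b) to orthonormal ones via d/dtheta = r*e_theta."""
    return a, r * b


if __name__ == "__main__":
    r = 3.0
    # Components (1, 1/r) in the holonomic basis, as discussed in the comments.
    print(norm_holonomic(1.0, 1.0 / r, r))
    # The same vector after converting to orthonormal components.
    print(norm_orthonormal(*to_orthonormal(1.0, 1.0 / r, r)))
```

With holonomic components the radius $r$ must be carried along to compute a length, whereas in the orthonormal frame the components alone suffice; that is exactly the convenience described above.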

Finally, the vector calculus formalism is not designed to work nicely in arbitrary coordinates (an example of this is how the vector Laplacian has to be defined using the curl of the curl). People using vector calculus tend to not care about coordinate invariance of their expressions, so they are happy (or at least won't complain as much as a differential geometer would) having different expressions in different coordinate systems.

Jackozee Hakkiuz