According to the question "Geodesic on Stiefel manifold", a geodesic (under the canonical metric) on the manifold of orthogonal matrices can be expressed as

$$Y(t) = Q e^{Xt} I$$

for some matrices $Q$ and $X$.

  1. Does it follow that the geodesic between two orthogonal matrices $Y_1$ and $Y_2$ is given by $Y(t) = Y_1 \left( Y^\top_2 Y_1 \right)^{-t}$, for $t \in [0,1]$?

  2. Does this curve have constant speed?

1 Answer

First, you have to restrict your attention to $\operatorname{SO}(n)$ (that is, the group of orthogonal matrices with determinant $+1$), because otherwise not all pairs of matrices in the group can be connected by a path.

With that understood, there are two interrelated problems with your proposed formula.

First, in general, "the geodesic between two orthogonal matrices $Y_1$ and $Y_2$" does not make sense, because there will in general be many different geodesics between the same two matrices. (Think of the case of $\operatorname{SO}(2)$, the circle group, in which any two points are connected by a sequence of geodesics with increasing lengths.)

Even if you restrict attention to minimizing geodesics (ones that are the shortest paths between their endpoints), there is not always a unique such geodesic. Again the circle group provides an example: between any two points that are diametrically opposite each other, there are two distinct minimizing geodesics.
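The circle-group picture can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `rot` and `geodesic_lengths` and the generator $J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ are illustrative assumptions. It relies on the fact that every skew-symmetric matrix $X_k = (\theta + 2\pi k) J$ exponentiates to the same rotation, so each integer $k$ gives a different geodesic $t \mapsto \exp(t X_k)$ from $I$ to that rotation, of length $|X_k| = \sqrt{2}\,|\theta + 2\pi k|$.

```python
import numpy as np

def rot(theta):
    """2x2 rotation by theta; equals exp(theta * J) with J = [[0, -1], [1, 0]]."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def geodesic_lengths(theta, ks=(-2, -1, 0, 1, 2)):
    """Frobenius lengths of the geodesics t -> exp(t * X_k), t in [0, 1],
    from I to rot(theta), where X_k = (theta + 2*pi*k) * J."""
    return {k: np.sqrt(2.0) * abs(theta + 2.0 * np.pi * k) for k in ks}

# Generic endpoint: one shortest geodesic (k = 0) and longer ones for other k.
generic = geodesic_lengths(0.5)

# Diametrically opposite endpoint: k = 0 and k = -1 tie for shortest.
antipodal = geodesic_lengths(np.pi)
```

For a generic angle the lengths are all different, while at $\theta = \pi$ the two shortest candidates have equal length, which is the non-uniqueness described above.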

The second problem is how to interpret the expression $(Y^\top_2 Y_1)^{-t}$ when $t$ is an arbitrary real number. In general, such an expression is defined to mean $\exp (-t \log(Y^\top_2 Y_1))$. But the matrix $(Y^\top_2 Y_1)$ will not in general have a unique logarithm, so we have to figure out how to interpret that expression.

What is true is that if $Y_1$ and $Y_2$ are sufficiently close to each other, then there will be a unique minimizing geodesic joining them, and that geodesic will be given by the expression $Y(t) = Y_1 \exp (-tX)$, where $X$ is the smallest skew-symmetric matrix such that $\exp(X) = Y^\top_2 Y_1$. If you interpret $\log(M)$ when $M$ is close to the identity to mean the smallest matrix $X$ such that $\exp(X)= M$, and $M^{-t}$ to mean $\exp(-t\log(M))$, then your formula is valid.
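As an illustration, here is a minimal sketch of this formula for $\operatorname{SO}(3)$ only, where the exponential and the smallest logarithm have closed forms (Rodrigues' formula). The restriction to $n = 3$, the helper names `hat`, `so3_exp`, `so3_log`, and `geodesic`, and the sample matrices are all assumptions made for the example; the logarithm used here is only valid when the rotation angle of $Y_2^\top Y_1$ is strictly less than $\pi$.

```python
import numpy as np

def hat(w):
    """Map a 3-vector w to the skew-symmetric matrix X with X @ v = w x v."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(X):
    """Matrix exponential of a 3x3 skew-symmetric X via Rodrigues' formula."""
    theta = np.sqrt(0.5 * np.trace(X.T @ X))  # rotation angle, |X| / sqrt(2)
    if theta < 1e-12:
        return np.eye(3) + X  # first-order approximation near zero
    return (np.eye(3)
            + (np.sin(theta) / theta) * X
            + ((1.0 - np.cos(theta)) / theta**2) * (X @ X))

def so3_log(M):
    """Smallest skew-symmetric X with exp(X) = M (rotation angle < pi)."""
    cos_theta = np.clip((np.trace(M) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-12:
        return 0.5 * (M - M.T)
    return (theta / (2.0 * np.sin(theta))) * (M - M.T)

def geodesic(Y1, Y2, t):
    """Point Y(t) = Y1 exp(-t X), where exp(X) = Y2^T Y1."""
    X = so3_log(Y2.T @ Y1)
    return Y1 @ so3_exp(-t * X)

# Two sample rotations that are reasonably close to each other.
Y1 = so3_exp(hat([0.3, -0.2, 0.5]))
Y2 = so3_exp(hat([-0.1, 0.4, 0.2]))
midpoint = geodesic(Y1, Y2, 0.5)
```

Evaluating `geodesic(Y1, Y2, 0.0)` and `geodesic(Y1, Y2, 1.0)` should recover `Y1` and `Y2` up to rounding, and every intermediate point should again be a rotation matrix.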

EDIT:

Let me address the questions you raised in your comments. You can find justifications for most of these claims in my book Introduction to Riemannian Manifolds (2nd ed.).

(1) is it true that any geodesic in the manifold of orthogonal matrices can be written in the form $Y(t) = Q \exp(-tX)$ for some skew-symmetric matrix $X$ and some orthogonal matrix $Q$?

Yes. Because the canonical metric is bi-invariant, the geodesics starting at the identity are exactly the one-parameter subgroups, which are the curves of the form $Y(t) = \exp(tX)$ for $X$ in the Lie algebra of $O(n)$. Then, because the metric is left-invariant, the geodesics starting at $Q$ are exactly the left translates of those starting at $I$.

(2) is there some simple formula for ONE possible minimum distance geodesic between two orthogonal matrices $Y_1$ and $Y_2$ that have determinant $+1$?

Probably not in the form you're hoping for, because such a formula cannot be continuous everywhere -- for example, in $\operatorname{SO}(2)$, if you fix $Y_1$ and let $Y_2$ move across the antipodal point, the minimizing geodesic suddenly switches from clockwise to counterclockwise (or vice versa).
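The jump can be seen in a small numerical sketch. This is an illustration only: it takes $Y_1 = I$, and the helper names `rot` and `principal_angle` are assumptions. The signed angle in $(-\pi, \pi]$ plays the role of the smallest logarithm, and its sign gives the direction of the minimizing geodesic.

```python
import numpy as np

def rot(theta):
    """2x2 rotation by theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def principal_angle(M):
    """Signed angle in (-pi, pi] of a 2x2 rotation matrix M.
    The smallest skew-symmetric logarithm of M is this angle times J."""
    return np.arctan2(M[1, 0], M[0, 0])

eps = 1e-3
# Y2 just short of the antipodal point: minimizing geodesic goes counterclockwise.
just_before = principal_angle(rot(np.pi - eps))
# Y2 just past the antipodal point: minimizing geodesic goes clockwise.
just_after = principal_angle(rot(np.pi + eps))
```

The two endpoints are very close to each other, yet the two angles have opposite signs and differ by almost $2\pi$, so no formula for the minimizing geodesic can depend continuously on $Y_2$ there.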

However, there are sets $\operatorname{C}(I)\subset \operatorname{SO}(n)$, called the cut locus of $I$, and $\operatorname{ID}(I)\subset T_I \operatorname{SO}(n)$, called the injectivity domain of $I$, with the following properties:

  1. $\operatorname{ID}(I)$ is open and star-shaped with respect to $0$;
  2. $\operatorname{SO}(n) \smallsetminus \operatorname{C}(I)$ is an open dense subset containing $I$;
  3. $\exp: \operatorname{ID}(I)\to \operatorname{SO}(n) \smallsetminus \operatorname{C}(I)$ is a diffeomorphism;
  4. for each $M\in \operatorname{SO}(n) \smallsetminus \operatorname{C}(I)$ there is a unique $X\in \operatorname{ID}(I)$ such that $\exp X = M$ (which we can write as $X = \log M$);
  5. for each such $M$, the curve $Y(t) = \exp (t \log M)$ for $t\in [0,1]$ is the unique minimizing geodesic from $I$ to $M$.

It then follows that as long as $Y_1^\top Y_2\in \operatorname{SO}(n) \smallsetminus \operatorname{C}(I)$, the curve $Y(t) = Y_1 \exp (t \log (Y_1^\top Y_2))$ is the unique minimizing geodesic from $Y_1$ to $Y_2$.
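As a consistency check (a short derivation added for illustration, with $\log$ read as the smallest skew-symmetric logarithm), this agrees with the earlier expression $Y_1 \exp(-tX)$, where $\exp(X) = Y_2^\top Y_1$. Since $Y_1$ and $Y_2$ are orthogonal,

$$Y_1^\top Y_2 = \left( Y_2^\top Y_1 \right)^{-1} = \exp(X)^{-1} = \exp(-X).$$

The map $X' \mapsto -X'$ is a norm-preserving bijection between the logarithms of $Y_2^\top Y_1$ and those of $Y_1^\top Y_2$, so the smallest logarithm of $Y_1^\top Y_2$ is $-X$, and therefore

$$Y_1 \exp\left( t \log (Y_1^\top Y_2) \right) = Y_1 \exp(-tX) = Y_1 \left( Y_2^\top Y_1 \right)^{-t}.$$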

(3) are the expressions above (and the one I suggested) for constant speed geodesics?

Yes. When I say "geodesic," I mean a curve that satisfies the geodesic equation, and these are all automatically constant-speed curves.

(4) what is the reason behind the "smallest skew-symmetric matrix"? Why does X need to be small for the geodesic to minimize distance? Is it because we want the integral of the velocity over time (which is proportional to X) to be small?

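The speed of a curve $t \mapsto \exp (tX)$ is exactly $|X|$ (where the norm is defined by $|X|^2 = \operatorname{trace}(X^\top X)$), and thus the length of the curve with parameter interval $[0,1]$ is exactly $|X|$. Every geodesic from $I$ to $M$ parametrized on $[0,1]$ has the form $t \mapsto \exp(tX)$ with $\exp(X) = M$, so the shortest one is the one whose $X$ has the smallest norm.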
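Spelled out, as a short derivation under the norm stated above and with $Q$ any fixed orthogonal matrix as in (1): for $Y(t) = Q \exp(tX)$ we have $Y'(t) = Q \exp(tX) X$, and since $Q$ and $\exp(tX)$ are orthogonal,

$$|Y'(t)|^2 = \operatorname{trace}\left( X^\top \exp(tX)^\top Q^\top Q \exp(tX) X \right) = \operatorname{trace}(X^\top X) = |X|^2.$$

The speed is therefore independent of $t$, and the length over $[0,1]$ is

$$\int_0^1 |Y'(t)| \, dt = |X|.$$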

Jack Lee
  • Thanks! Just to make sure I understand. (1) is it true that any geodesic in the manifold of orthogonal matrices can be written in the form Y(t) = Q exp(-tX) for some skew-symmetric matrix and some orthogonal matrix Q? (2) is there some simple formula for ONE possible minimum distance geodesic between two orthogonal matrices Y1 and Y2 that have determinant +1? (3) are the expressions above (and the one I suggested) for constant speed geodesics? – opt_learn Jun 18 '19 at 20:22
  • (4) what is the reason behind the "smallest skew-symmetric matrix"? Why does X need to be small for the geodesic to minimize distance? Is it because we want the integral of the velocity over time (which is proportional to X) to be small? Thanks again and appreciate your patience with my many questions ! – opt_learn Jun 18 '19 at 20:31
  • Hello again, thanks for the extra explanations and the book, which looks really nice. Regarding "not all pairs of matrices in the group can be connected by a path", the obvious example that comes to mind is Y1 = 1 and Y2 = -1. Now, does this mean that the set of orthogonal matrices does not form a smooth manifold? If instead we work with unitary matrices, do things get simpler, in the sense that now all matrices can be connected by a smooth path? – opt_learn Jun 19 '19 at 19:15
  • @opt_learn: No, it doesn't mean that the set of orthogonal matrices isn't a smooth manifold -- it just means that it's not a connected smooth manifold. And yes, the group of unitary $n\times n$ matrices is connected, so this problem doesn't arise. – Jack Lee Jun 20 '19 at 19:38
  • Very interesting (+1). Does this generalize naturally to $U(n)$ (or $SU(n)$ perhaps)? – lcv May 03 '22 at 15:45
  • @lcv: Yes, it generalizes to any connected Lie group with a bi-invariant metric (with suitable modifications, such as "element of the Lie algebra" in place of "skew-symmetric matrix"). Since every compact Lie group has a bi-invariant metric, it applies to $U(n)$ and $SU(n)$. – Jack Lee May 03 '22 at 17:08
  • Thank you! (and I have to add some characters) – lcv May 03 '22 at 17:15
  • I just found this answer and realized it is addressing a very similar question to the one I have. However, I consider a symmetric indefinite (potentially degenerate) bilinear form since I take $O(p,q,r,\mathbb{R})$. I am hoping you could have a look and suggest ideas or references I can look into, here is the question: https://math.stackexchange.com/questions/4916531/interpolation-in-op-q-r-mathbbr – lightxbulb May 18 '24 at 22:45