Geometry of the Cayley Transform

Question

I'm trying to understand the geometry of the Cayley transform. Suppose I have a $3 \times 3$ rotation matrix $R$ (i.e., an orthogonal matrix with determinant equal to $1$). Let's ignore the corner case where $-1$ is an eigenvalue of $R$ (in other words, we assume that the rotation angle is not $\pi$). Then, according to a result of Cayley, I can find a skew-symmetric matrix $S$ such that $$ R = (I - S)(I + S)^{-1} $$ In other words, I can find two other transformations $A = I - S$ and $B= (I + S)^{-1}$ whose combined effect, when applied one after the other, is the same as the original rotation.

My question is:
Can we find some geometric interpretation of the transforms $S$, $A$, and $B$, so that we can see how they combine to produce a rotation?

I know that a rotation can be written as a product of two reflections. Is that related to the Cayley decomposition $R = AB$? Are $A$ and $B$ reflections?

The 3-dimensional case is the only one that's of interest to me.

Edit: Some Progress

I made some progress on the algebra, but not the geometry. Suppose our matrix $R$ corresponds to a rotation through an angle $\theta$ around the unit vector $\mathbf{n} = (u,v,w)$. Let $t = \tan\tfrac12\theta$. Then I managed to show that the Cayley decomposition is given by $R = A \cdot B$, where $$ S = \left[ \begin{matrix} 0 & t w & -t v \\ -t w & 0 & t u \\ t v & -t u & 0 \end{matrix} \right] $$ $$ A = I - S = \left[ \begin{matrix} 1 & -t w & t v \\ t w & 1 & -t u \\ -t v & t u & 1 \end{matrix} \right] $$ $$ B = (I + S)^{-1} = \frac{1}{1+t^2} \left[ \begin{matrix} t^2 u^2+1 & t (t u v-w) & t (v+t u w) \\ t (t u v+w) & t^2 v^2+1 & t (t v w-u) \\ t (t u w-v) & t (u+t v w) & t^2 w^2+1 \end{matrix} \right] $$ We have $\det(A) = 1+ t^2$ and $\det(B) = 1/(1+t^2)$, so neither $A$ nor $B$ is a rotation or a reflection.
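The formulas above can be sanity-checked numerically. The following is only a sketch under stated assumptions: the helper names (`matmul`, `det3`, `cayley_factors`, `rodrigues`) and the sample angle and axis are hypothetical choices for illustration, and the rotation is taken to follow the usual right-handed Rodrigues convention.

```python
import math

def matmul(X, Y):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cayley_factors(theta, n):
    """A = I - S and B = (I + S)^{-1}, copied from the closed forms in the post."""
    u, v, w = n
    t = math.tan(theta / 2)
    A = [[1, -t * w, t * v],
         [t * w, 1, -t * u],
         [-t * v, t * u, 1]]
    k = 1 / (1 + t * t)
    B = [[k * (t*t*u*u + 1), k * t * (t*u*v - w), k * t * (v + t*u*w)],
         [k * t * (t*u*v + w), k * (t*t*v*v + 1), k * t * (t*v*w - u)],
         [k * t * (t*u*w - v), k * t * (u + t*v*w), k * (t*t*w*w + 1)]]
    return A, B

def rodrigues(theta, n):
    """Rotation by theta about the unit vector n (Rodrigues formula)."""
    u, v, w = n
    c, s = math.cos(theta), math.sin(theta)
    cross = [[0, -w, v], [w, 0, -u], [-v, u, 0]]
    return [[c * (i == j) + (1 - c) * n[i] * n[j] + s * cross[i][j]
             for j in range(3)] for i in range(3)]

theta = 1.1                      # arbitrary sample angle, not pi
n = (1 / 3, 2 / 3, 2 / 3)        # arbitrary unit vector
A, B = cayley_factors(theta, n)
R = matmul(A, B)
R_expected = rodrigues(theta, n)

# Largest entrywise gap between A*B and the Rodrigues rotation matrix.
max_err = max(abs(R[i][j] - R_expected[i][j])
              for i in range(3) for j in range(3))
print(max_err, det3(A), det3(B))
```

If the closed forms are right, `max_err` should be at rounding level, and the two determinants should multiply to $1$.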

I still don't see the geometry of $A$ and $B$, though. That's the puzzle.

The determinant of $I\pm S$, for $3\times 3$ skew-symmetric $S\neq 0$, is strictly greater than $1$, so $A$ and $B$ are definitely not reflections. — Blue, Mar 09 '14 at 19:30
Good. Thanks. I didn't think they were reflections, actually, but your argument gives a nice simple proof of this. — bubba, Mar 10 '14 at 02:38
I think you're very close to what I've noticed: $A$ and $B$ are scaled rotations. The $I + S$ transformation rotates about $\mathbf{n}$ by $\theta/2$, and it scales a point's distance from the axis by $\sqrt{1+t^2}$. Likewise, the $I-S$ transformation rotates about the same axis, with the same scale factor, but in the opposite direction; the inverse of $I-S$, therefore, compounds the $I+S$ rotation (for a full turn of $\theta$), and cancels the $I+S$ scaling. — Blue, Mar 10 '14 at 07:50
Hi Blue. Nice result. Very tidy. But I don't know how you concluded that $I - S$ and $I + S$ are "scaled rotations". Can you elaborate a bit, please. — bubba, Mar 10 '14 at 10:21

tom · Answer 1 · 2014-03-11T08:11:26.693

Actually, we do not need quaternions: because we are working with only one rotation, we can assume that $R$ is a rotation around the $z$-axis, and we can restrict ourselves to the $xy$-plane. Rotations in 2D can be expressed by unit complex numbers, and skew-symmetric matrices correspond to purely imaginary numbers.

The Cayley transformation for skew-symmetric matrices, $$ \phi:S \longmapsto (I-S)(I+S)^{-1} $$ can be understood through the Cayley transformation on the complex plane: $$ \psi:i b \longmapsto \frac{1-ib}{1+ib} $$

Thus, if you want to know what $I-S$ does, you only need to know what multiplication by $1-ib$ does to the complex plane.
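To make the complex-plane picture concrete, here is a small sketch using Python's standard `cmath` module. The function name `cayley_complex` and the sample value of `b` are assumptions made for illustration. The sign of the angle depends on how the skew-symmetric block is identified with $ib$, so only the pattern (two equal half-angles, reciprocal moduli) should be read into it.

```python
import cmath
import math

def cayley_complex(b):
    """Complex Cayley transform psi(ib) = (1 - ib) / (1 + ib) for real b."""
    return (1 - 1j * b) / (1 + 1j * b)

b = 0.75                        # arbitrary sample value
first = 1 - 1j * b              # plays the role of I - S
second = 1 / (1 + 1j * b)       # plays the role of (I + S)^{-1}
product = first * second        # plays the role of the rotation R

r1, phi1 = cmath.polar(first)    # modulus sqrt(1 + b^2), angle -atan(b)
r2, phi2 = cmath.polar(second)   # modulus 1/sqrt(1 + b^2), angle -atan(b)
r, phi = cmath.polar(product)    # modulus 1, angle -2*atan(b)

print(r1, phi1)
print(r2, phi2)
print(r, phi)
```

Each factor turns the plane by the same half-angle, and the moduli $\sqrt{1+b^2}$ and $1/\sqrt{1+b^2}$ cancel, leaving a unit complex number.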

Edit (reverse answer): I answered your question in a "reversed" way too, but it doesn't matter, because the Cayley transformation $\psi$ from $i\mathbb{R}\cup \{\infty\}$ to $S^1$ is a bijection. So for any rotation $e^{i \theta}$ there exists $\psi^{-1}(e^{i \theta})$.

Actually, my answer scales nicely to arbitrary dimensions. By the spectral theorem for skew-symmetric matrices, you can change to a basis in which your matrix takes the form:

$$ \begin{bmatrix} \begin{matrix}0 & \lambda_1\\ -\lambda_1 & 0\end{matrix} & 0 & \cdots & 0 \\ 0 & \begin{matrix}0 & \lambda_2\\ -\lambda_2 & 0\end{matrix} & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \begin{matrix}0 & \lambda_r\\ -\lambda_r & 0\end{matrix} \\ & & & & \begin{matrix}0 \\ & \ddots \\ & & 0 \end{matrix} \end{bmatrix} $$

Then study each two-dimensional subspace associated with a block $\left[ \begin{matrix}0 & \lambda_k\\ -\lambda_k & 0\end{matrix}\right]$, $k = 1, \dots, r$, using complex numbers.

@tom, let $S=\begin{pmatrix}0&z&-y\\-z&0&x\\y&-x&0\end{pmatrix}$ and let $\Pi$ be the orthogonal projection onto the orthogonal complement $F$ of $[x,y,z]^T$. You use the fact that $\Pi$ and $Rot(\theta,[x,y,z]^T)$ commute. Then you must show that if $w\in F$, then $Sw\in F$, which is easy. — , Mar 11 '14 at 03:32
@loupblanc That is easy because $Sv = v \times [x,y,z]^T$, where $\times$ is the cross product. — tom, Mar 11 '14 at 08:14

bubba · Accepted Answer · 2020-03-18T00:53:26.933

The axis of rotation will be a line through the origin in the direction of a unit vector $N$. Let $R = R(\theta)$ be the transformation that performs rotation by an angle $\theta$ around this axis. By Rodrigues' formula, we know that $$ R(\theta) = (\cos\theta)I + (1 - \cos\theta)NN^T + (\sin\theta)\tilde{N} $$ where $\tilde{N}$ denotes the matrix that performs cross products with $N$, so that $\tilde{N}V = N \times V$ for any vector $V$.

Cayley's result tells us that $S = (I - R)(I + R)^{-1}$, which gives $$ S = -(\tan\tfrac12\theta)\tilde{N} $$ $$ A = I - S = I + (\tan\tfrac12\theta)\tilde{N} $$ $$ B = \tfrac12\big\{ I + NN^T + (\cos\theta)(I - NN^T) + (\sin\theta)\tilde{N} \big\} $$
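One way to see where the expression for $B$ comes from is the following sketch. It writes $t = \tan\tfrac12\theta$ (a shorthand introduced here) and uses only the identities $\tilde{N}N = 0$ and $\tilde{N}^2 = NN^T - I$, which hold for any unit vector $N$:

$$ (I - t\tilde{N})(I + t\tilde{N} + t^2 NN^T) = I + t^2 NN^T - t^2\tilde{N}^2 - t^3\tilde{N}NN^T = I + t^2 NN^T - t^2(NN^T - I) = (1+t^2)I $$

Since $I + S = I - t\tilde{N}$, this gives

$$ B = (I + S)^{-1} = \frac{1}{1+t^2}\big( I + t\tilde{N} + t^2 NN^T \big) $$

and substituting $\frac{1}{1+t^2} = \tfrac12(1 + \cos\theta)$, $\frac{t}{1+t^2} = \tfrac12\sin\theta$, $\frac{t^2}{1+t^2} = \tfrac12(1 - \cos\theta)$ recovers the form of $B$ stated above.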

Let $K(\lambda)$ be the matrix that performs radial scaling around our axis line. In other words multiplying by $K(\lambda)$ scales the distance of a point from the axis by a factor $\lambda$. It is easy to show that $$ K(\lambda) = \lambda I + (1-\lambda)NN^T $$ Then, straightforward calculations show that $$ A = R(\tfrac12\theta)K(\sec\tfrac12\theta) \quad ; \quad B = R(\tfrac12\theta)K(\cos\tfrac12\theta) $$ So, the geometric effects of $A$ and $B$ are:

$A$ performs rotation by $\tfrac12\theta$ together with radial scaling by a factor $\sec\tfrac12\theta$
$B$ performs rotation by $\tfrac12\theta$ together with radial scaling by a factor $\cos\tfrac12\theta$

The combined effect $AB$ is just rotation by $\theta$, since the two scalings cancel each other out.
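The factorizations above can be checked numerically. This is a minimal sketch, not a proof: the helper names (`matmul`, `cross_matrix`, `rotation`, `radial_scaling`, `max_diff`) and the sample angle and axis are assumptions chosen for illustration.

```python
import math

def matmul(X, Y):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def cross_matrix(n):
    """Matrix N~ with N~ V = N x V."""
    u, v, w = n
    return [[0.0, -w, v], [w, 0.0, -u], [-v, u, 0.0]]

def rotation(theta, n):
    """R(theta) from the Rodrigues formula, for a unit vector n."""
    c, s = math.cos(theta), math.sin(theta)
    Nt = cross_matrix(n)
    return [[c * (i == j) + (1 - c) * n[i] * n[j] + s * Nt[i][j]
             for j in range(3)] for i in range(3)]

def radial_scaling(lam, n):
    """K(lambda) = lambda*I + (1 - lambda)*N N^T."""
    return [[lam * (i == j) + (1 - lam) * n[i] * n[j] for j in range(3)]
            for i in range(3)]

def max_diff(X, Y):
    """Largest entrywise absolute difference between two 3x3 matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(3) for j in range(3))

theta = 0.8                      # arbitrary sample angle
n = (2 / 7, 3 / 7, 6 / 7)        # arbitrary unit vector
half = theta / 2
t = math.tan(half)
Nt = cross_matrix(n)
identity = [[float(i == j) for j in range(3)] for i in range(3)]

# A = I - S = I + t*N~ and I + S = I - t*N~, since S = -t*N~.
A = [[identity[i][j] + t * Nt[i][j] for j in range(3)] for i in range(3)]
I_plus_S = [[identity[i][j] - t * Nt[i][j] for j in range(3)] for i in range(3)]

A_geo = matmul(rotation(half, n), radial_scaling(1 / math.cos(half), n))
B_geo = matmul(rotation(half, n), radial_scaling(math.cos(half), n))

err_A = max_diff(A, A_geo)                                   # A vs R(theta/2) K(sec)
err_B = max_diff(matmul(B_geo, I_plus_S), identity)          # B_geo inverts I + S
err_R = max_diff(matmul(A_geo, B_geo), rotation(theta, n))   # A*B vs R(theta)
print(err_A, err_B, err_R)
```

All three errors should come out at rounding level. The check of $B$ avoids a matrix inverse by verifying that the claimed $B$ times $I + S$ is the identity.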

Nice answer, +1. However, the scaling is not axial (i.e. along the axis), but planar. Both $A$ and $B$ leave the axial component invariant. — user1551, Apr 16 '14 at 18:24
Bad use of the word "axial", I guess. I meant scaling away from the axis, rather than scaling along the axis. This usage is mentioned in my definition of $K(\lambda)$. — bubba, Apr 17 '14 at 06:25

score 2 · Answer 3 · answered Mar 10 '14 at 19:18

Expanding on my comment ...

Write $$S = \left[\begin{matrix} 0 & r & -q \\ -r & 0 & p \\ q & -p & 0 \end{matrix}\right] \qquad M = I + S = \left[\begin{matrix} 1 & r & - q \\ -r & 1 & p \\ q & -p & 1 \end{matrix}\right] = ( I - S )^\top = N^\top$$

Note that $M$ (and $N$) fix the unit vector $\mathbf{p} := \frac{1}{s}(p,q,r)$, where $s^2 = p^2 + q^2 + r^2$.

Let $R$ be the reflection, through a plane containing the origin, that exchanges $\mathbf{z} := [0,0,1]^\top$ and $\mathbf{p}$. The normal to the plane is $\mathbf{z} - \mathbf{p}$, and we can compute the matrix as $$R = \frac{1}{s(r-s)}\left[\begin{matrix} r s - q^2 - r^2 & p q & p ( r - s ) \\ p q & r s - p^2 - r^2 & q ( r - s ) \\ p(r-s) & q ( r - s ) & r ( r - s ) \end{matrix}\right]$$

Then we have $$\widehat{M} := R^{-1} M R = R M R = \left[\begin{matrix} 1 & -s & 0 \\ s & 1 & 0 \\ 0 & 0 & 1 \end{matrix}\right]$$ such that $$\widehat{M} \left[\begin{matrix} a \cos\alpha \\ a \sin\alpha \\ b \end{matrix}\right] = \left[\begin{matrix} a ( \cos\alpha - s \sin\alpha ) \\ a ( s \cos\alpha + \sin\alpha ) \\ b \end{matrix}\right] = \left[\begin{matrix} a t \; \cos(\alpha+\theta) \\ a t \; \sin(\alpha+\theta) \\ b \end{matrix}\right] \qquad (*)$$ where $t^2 = 1 + s^2 = 1 + p^2 + q^2 + r^2$ and $\tan\theta = \frac{s}{1} = \sqrt{p^2+q^2+r^2}$.

The matrix $\widehat{M}$ represents the transformation that reflects the $z$-axis onto the fixed line of the transformation $M$, then applies transformation $M$, then reflects the fixed line back to the $z$-axis. As $(*)$ indicates, if a point lies on a cylinder (of radius $a$) whose axis aligns with the $z$-axis, then the combined transformation moves that point to a $z$-aligned cylinder of radius $at$, but rotated by angle $\theta$ about the $z$-axis. Thus, $\widehat{M}$ (and therefore also $M$ itself) could be called a "scaled rotation": it rotates points about its axis, and simultaneously scales the distances of points from that axis.
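For readers who want the intermediate step in the last equality of $(*)$, here is a short sketch. It takes $t = \sqrt{1+s^2} > 0$, so that $\tan\theta = s$ gives $\cos\theta = 1/t$ and $\sin\theta = s/t$, and then applies the angle-addition formulas:

$$ \cos\alpha - s\sin\alpha = t\left(\cos\theta\cos\alpha - \sin\theta\sin\alpha\right) = t\cos(\alpha+\theta) $$

$$ s\cos\alpha + \sin\alpha = t\left(\sin\theta\cos\alpha + \cos\theta\sin\alpha\right) = t\sin(\alpha+\theta) $$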

score 1 · Answer 4 · 2014-03-10T22:36:08.723


There is a link with the quaternion skew-field $H=\{q=x+yi+zj+tk \mid x,y,z,t\in\mathbb{R}\}$ where $||q||^2=x^2+y^2+z^2+t^2$. If $||q||=1$, then $q$ is a unit and $q^{-1}=x-yi-zj-tk$. Moreover $e^q=e^x(\cos\sqrt{y^2+z^2+t^2}+\dfrac{\sin\sqrt{y^2+z^2+t^2}}{\sqrt{y^2+z^2+t^2}}(yi+zj+tk))$.

We consider a rotation $Rot(\theta,u)$ where $u=[a,b,c]^T$ is a unit vector. To $Rot$ we associate the quaternion $r=\dfrac{1}{2}\theta(ai+bj+ck)$. Thus we obtain the unit quaternion $q=e^r=\cos(\theta/2)+(ai+bj+ck)\sin(\theta/2)$. We identify $\mathbb{R}^3$ and $span(i,j,k)$. Then it can be proved that, if $v\in\mathbb{R}^3$, then $Rot(v)=qvq^{-1}$ where $q^{-1}=\cos(\theta/2)-(ai+bj+ck)\sin(\theta/2)$.

Let $A=\begin{pmatrix}0&z&-y\\-z&0&x\\y&-x&0\end{pmatrix}$ be a generic skew-symmetric matrix. Its Cayley transform is a rotation $R$, which (after computing $R$) is associated with the unit quaternion $q=\dfrac{1}{\sqrt{1+t^2}}(1+xi+yj+zk)$, where $t^2=x^2+y^2+z^2$; indeed, $(I-A)(I+A)^{-1}[\alpha,\beta,\gamma]^T=\dfrac{1}{1+t^2}(1+xi+yj+zk)(\alpha i+\beta j+\gamma k)(1-xi-yj-zk)$. Therefore $\cos(\theta/2)=1/\sqrt{1+t^2}$ and $t=\tan(\theta/2)$. Finally, the unit vector along the rotation axis is $u=\dfrac{1}{\tan(\theta/2)}[x,y,z]^T$.
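The identity between the Cayley transform and the quaternion sandwich can be spot-checked numerically. The sketch below is illustrative only: `qmul`, `rotate_via_quaternion`, `matvec`, and the sample numbers are hypothetical. To avoid inverting $I + A$, it uses the fact that $I - A$ and $(I + A)^{-1}$ commute, so $w = (I-A)(I+A)^{-1}v$ is equivalent to $(I + A)w = (I - A)v$.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def rotate_via_quaternion(x, y, z, v):
    """(1 + xi + yj + zk) v (1 - xi - yj - zk) / (1 + t^2), v a pure quaternion."""
    t2 = x*x + y*y + z*z
    left = (1.0, x, y, z)
    right = (1.0, -x, -y, -z)
    pure = (0.0, v[0], v[1], v[2])
    w = qmul(qmul(left, pure), right)
    return w[0] / (1 + t2), [w[1] / (1 + t2), w[2] / (1 + t2), w[3] / (1 + t2)]

def matvec(M, v):
    """3x3 matrix times 3-vector."""
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]

x, y, z = 0.3, -0.5, 0.2         # arbitrary sample entries of the skew matrix
A = [[0.0, z, -y], [-z, 0.0, x], [y, -x, 0.0]]
I_plus_A = [[(i == j) + A[i][j] for j in range(3)] for i in range(3)]
I_minus_A = [[(i == j) - A[i][j] for j in range(3)] for i in range(3)]

v = [1.0, 2.0, -1.0]             # arbitrary test vector
real_part, w = rotate_via_quaternion(x, y, z, v)

# w equals the Cayley image of v exactly when (I + A) w = (I - A) v.
lhs = matvec(I_plus_A, w)
rhs = matvec(I_minus_A, v)
residual = max(abs(l - r) for l, r in zip(lhs, rhs))
norm_gap = abs(math.sqrt(sum(c*c for c in w)) - math.sqrt(sum(c*c for c in v)))
print(real_part, residual, norm_gap)
```

The real part of the sandwich product should vanish, and both `residual` and `norm_gap` should be at rounding level, consistent with $w$ being a rotated copy of $v$.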

edited Mar 10 '14 at 22:36

answered Mar 10 '14 at 21:48

Shouldn't $A$ be associated with the quaternion $q=\frac1{2\sqrt{t^2}}(xi+yj+zk)$? Then $e^q$ represents the rotation $Rot(1,(x,y,z))$. – tom Mar 10 '14 at 23:22
Thanks, but it looks like you answered a different question from the one I asked. Looks like you did things in reverse. You took a skew symmetric matrix, calculated its Cayley transform (a rotation matrix), and then told me the geometric effect of that rotation. Maybe I'm missing something, but I don't see how this tells me about the geometry of my $A$ and $B$. – bubba Mar 10 '14 at 23:47
@tom, $A$ is not a rotation ; then I cannot associate to $A$ a quaternion. – Mar 11 '14 at 03:36
@bubba, you are right. I did not answer your question. In fact, I thought that I could interpret your $A,B$ using quaternions, but I was wrong. – Mar 11 '14 at 03:40

score 0 · Answer 5 · answered Sep 18 '16 at 15:31

For unit quaternions $S^3$ at least, the interpretation is pretty straightforward. Consider an imaginary quaternion $x \in\mathfrak{s^3}$, i.e. an element of the tangent space at the identity. Its Cayley transform is:

$$Cay(x) = \frac{1 - x}{1 + x} = \frac{ (1 - x)^2 }{1 + ||x||^2}=\left(\frac{1 - x}{||1 - x||} \right)^2$$

So the Cayley transform does this: start from the identity, add $-x$, normalize (so you get a unit quaternion), then square. The part before squaring gives you an open hemisphere (the unit quaternions with positive real part), which gives you the whole sphere except $-1$ after squaring.
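The chain of equalities in the formula for $Cay(x)$ can be unpacked as follows. This is a sketch that relies only on $x^2 = -||x||^2$ for an imaginary quaternion $x$, and on the fact that $1 - x$ and $1 + x$ commute:

$$ (1+x)(1-x) = 1 - x^2 = 1 + ||x||^2 \quad\Longrightarrow\quad (1+x)^{-1} = \frac{1-x}{1+||x||^2} $$

$$ ||1-x||^2 = (1-x)(1+x) = 1 + ||x||^2 \quad\Longrightarrow\quad \frac{(1-x)^2}{1+||x||^2} = \left(\frac{1-x}{||1-x||}\right)^2 $$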

You can probably use this as a starting point to see how it transports to rotation matrices, e.g. using the adjoint representation of $S^3$ over its Lie algebra $\mathfrak{s^3}$: consider $Ad_{Cay(x)}$ and try to factor it as a product of reflections.

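For unit imaginary quaternions $x$, the operators $\frac{1}{2}\left(I \pm Ad_x\right)$ are the orthogonal projections onto the axis spanned by $x$ and onto the plane orthogonal to it, and the reflection through that plane is $-Ad_x$.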
