
I wanted to see what would happen if I applied the Daleckii-Krein Theorem to a quaternion function via its $4\times 4$ matrix representation.


$ \def\h{\odot} \def\k{\otimes} \def\bb{\mathbb} \def\bbC#1{{\bb C}^{#1}} \def\bbH#1{{\bb H}^{#1}} \def\bbR#1{{\bb R}^{#1}} \def\a{\alpha} \def\b{\beta} \def\g{\gamma} \def\l{\lambda} \def\o{{\tt1}} \def\z{{\bf 0}} \def\e{{\bf e}}\def\n{{\bf n}} \def\CMR#1#2{\left\lbrace #1 \; \middle| \; #2 \right\rbrace} \def\CR#1{\left\lbrace #1\right\rbrace} \def\LR#1{\left(#1\right)} \def\lR#1{\Big(#1\Big)} \def\BR#1{\left[#1\right]} \def\bR#1{\Big[#1\Big]} \def\op#1{\operatorname{#1}} \def\xn#1#2{\left\| #2 \right\|_{\small#1}} \def\frob#1{\left\| #1 \right\|_{\small F}} \def\frob#1{\xn F{#1}} \def\Real#1{\op{\sf Real}\LR{#1}} \def\Imag#1{\op{\sf Imag}\LR{#1}} \def\Diag#1{\op{Diag}\LR{#1}} \def\vc#1{\op{vec}\LR{#1}} \def\quat#1{\op{Quat}\LR{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\m#1{\left[\begin{array}{r}#1\end{array}\right]} \def\mmm#1{\left[\begin{array}{rrr|rrr|rrr}#1\end{array}\right]} \def\mmmm#1{\left[\begin{array}{rrrr|rrrr|rrrr|rrrr}#1\end{array}\right]} \def\mc#1{\left[\begin{array}{c|c}#1\end{array}\right]} \def\mq#1{\left[\begin{array}{r|rrr}#1\end{array}\right]} \def\mqq#1{\left[\begin{array}{rr|rr}#1\end{array}\right]} \def\BEvec#1{\begin{Bmatrix}#1\end{Bmatrix}} \def\fracbR#1#2{\bR{\frac{#1}{#2}}} \def\fracLR#1#2{\LR{\frac{#1}{#2}}} \def\q{\quad} \def\qq{\qquad} \def\qiq{\q\implies\q} \def\qif{\q\iff\q} \def\qip{\q\Longleftarrow\q} \def\T{{\sf T}} $It is well known that a quaternion can be represented by a real matrix $$\eqalign{ a = \BEvec{a_0\\ a_1\\ a_2\\ a_3}, \qq A = \quat{a} = \mq{ a_0 & -a_1 & -a_2 & -a_3 \\ \hline a_1 & a_0 & -a_3 & a_2 \\ a_2 & a_3 & a_0 & -a_1 \\ a_3 & -a_2 & a_1 & a_0 } }$$ There are many other ways to map a quaternion to a $4\times 4$ matrix, but I like this particular representation because the first column contains the components of the quaternion in order and with the proper sign.
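As a quick sanity check, here is a minimal NumPy sketch of this representation. The helper names `quat_matrix` and `hamilton_product` are purely illustrative (they are not from any library), and the component ordering (real, i, j, k) is assumed.

```python
import numpy as np

def quat_matrix(a):
    """Real 4x4 matrix Quat(a) for the quaternion a = (a0, a1, a2, a3)."""
    a0, a1, a2, a3 = a
    return np.array([[a0, -a1, -a2, -a3],
                     [a1,  a0, -a3,  a2],
                     [a2,  a3,  a0, -a1],
                     [a3, -a2,  a1,  a0]], dtype=float)

def hamilton_product(a, b):
    """Quaternion product a*b with components ordered (real, i, j, k)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([a0*b0 - a1*b1 - a2*b2 - a3*b3,
                     a0*b1 + a1*b0 + a2*b3 - a3*b2,
                     a0*b2 - a1*b3 + a2*b0 + a3*b1,
                     a0*b3 + a1*b2 - a2*b1 + a3*b0])

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([0.5, -1.0, 2.0, -3.0])

# The matrix acts as left-multiplication by a, so matrix products
# mirror quaternion products.
left_mult_ok = np.allclose(quat_matrix(a) @ b, hamilton_product(a, b))
homomorphism_ok = np.allclose(quat_matrix(a) @ quat_matrix(b),
                              quat_matrix(hamilton_product(a, b)))
```

With this sign convention the matrix is the left-multiplication operator, which is why the first column reproduces $a$ itself.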

I'm not really a fan of quaternions per se. I think they are overhyped and misused. But the set of matrices of this form is closed under addition and multiplication, and has other interesting properties $$\eqalign{ \a^2 &\equiv \frob{a}^2 = \tfrac14\,\frob{A}^2 = {\sqrt{\det(A)}} \\ \g^2 &\equiv a_1^2 + a_2^2 + a_3^2 \\ \l &= a_0 + i\g \qq {\{ {\rm eigenvalue} \}} \\ \l^*\l &= \a^2 \\ A^\T A &= \a^2 I \qiq A^{-1} = \a^{-2}A^\T \\ }$$ The eigenvalue $\l$ is associated with two independent eigenvectors, e.g. $$\eqalign{ v = \mc{ a_1a_2+i\g a_3 \\ a_1a_3-i\g a_2 \\ 0\; \\ a_2^2 + a_3^2\; \\ }, \qq \q w = \mc{ -a_1a_3+i\g a_2 \\ \;\;a_1a_2+i\g a_3 \\ a_2^2 + a_3^2 \\ 0 \\ } }$$ The remaining eigenpairs are simply complex conjugates of these.
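These identities are easy to confirm numerically. The following is a rough sketch for one arbitrarily chosen quaternion; `quat_matrix` is an illustrative helper, and the eigenvector formulas assume $a_2^2+a_3^2\ne 0$ (otherwise both vectors collapse to zero).

```python
import numpy as np

def quat_matrix(a):
    a0, a1, a2, a3 = a
    return np.array([[a0, -a1, -a2, -a3],
                     [a1,  a0, -a3,  a2],
                     [a2,  a3,  a0, -a1],
                     [a3, -a2,  a1,  a0]], dtype=float)

a = np.array([1.0, 2.0, -0.5, 3.0])    # arbitrary test quaternion
a0, a1, a2, a3 = a
A = quat_matrix(a)

alpha2 = a @ a                          # alpha^2 = |a|^2
gamma = np.sqrt(a1**2 + a2**2 + a3**2)  # norm of the vector part
lam = a0 + 1j * gamma                   # eigenvalue

# Eigenvectors as written in the post
v = np.array([a1*a2 + 1j*gamma*a3, a1*a3 - 1j*gamma*a2, 0, a2**2 + a3**2])
w = np.array([-a1*a3 + 1j*gamma*a2, a1*a2 + 1j*gamma*a3, a2**2 + a3**2, 0])
V = np.column_stack([v, w, v.conj(), w.conj()])

checks = {
    "frobenius":  np.isclose(alpha2, np.linalg.norm(A, "fro")**2 / 4),
    "det":        np.isclose(alpha2, np.sqrt(np.linalg.det(A))),
    "modulus":    np.isclose(abs(lam)**2, alpha2),
    "orthogonal": np.allclose(A.T @ A, alpha2 * np.eye(4)),
    "eig_v":      np.allclose(A @ v, lam * v),
    "eig_w":      np.allclose(A @ w, lam * w),
    "full_rank":  np.linalg.matrix_rank(V) == 4,
}
```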

The following matrix $$\small\eqalign{ H^\T = \mmmm{ \o & 0 & 0 & 0 & 0 &\o & 0 & 0 & 0 & 0 &\o & 0 & 0 & 0 & 0 &\o \\ 0 &\o & 0 & 0 &-\o & 0 & 0 & 0 & 0 & 0 & 0 &\o & 0 & 0 &-\o & 0 \\ 0 & 0 &\o & 0 & 0 & 0 & 0 &-\o &-\o & 0 & 0 & 0 & 0 &\o & 0 & 0 \\ 0 & 0 & 0 &\o & 0 & 0 &\o & 0 & 0 &-\o & 0 & 0 &-\o & 0 & 0 & 0 \\ } \\ }$$ can be used to vectorize/devectorize the matrix representation $$\eqalign{ \vc{A} = Ha \qif a = \tfrac14H^\T\vc{A} \\ }$$
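One way to build $H$ in code, sketched under the assumption that $\operatorname{vec}$ stacks columns (Fortran order in NumPy), is to vectorize the matrix representation of each basis quaternion. The name `quat_matrix` is again just an illustrative helper.

```python
import numpy as np

def quat_matrix(a):
    a0, a1, a2, a3 = a
    return np.array([[a0, -a1, -a2, -a3],
                     [a1,  a0, -a3,  a2],
                     [a2,  a3,  a0, -a1],
                     [a3, -a2,  a1,  a0]], dtype=float)

# Column k of H is vec(Quat(e_k)), with column-major stacking.
H = np.column_stack([quat_matrix(e).flatten(order="F") for e in np.eye(4)])

a = np.array([1.0, 2.0, -0.5, 3.0])
vecA = quat_matrix(a).flatten(order="F")

a_back = 0.25 * H.T @ vecA   # devectorize: a = (1/4) H^T vec(A)
```

The factor of $\tfrac14$ works because the columns of $H$ have disjoint supports with four $\pm1$ entries each, so $H^{\sf T}H = 4I$.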


With the preliminaries out of the way, we've arrived at the main topic of this post.

Given a differentiable function of a complex variable $$\eqalign{ \def\f{\phi} \f = \f(\l) \qiq \psi = \frac{d\f}{d\l} \\ }$$ the DK Theorem states that if we apply the function to a square matrix we can calculate its Frechet derivative like so $$\eqalign{ \def\DVR{\Diag{\vc{R}}} \def\R{\;{\large\cal R}\;} \def\V{V^{-1}} L &= \Diag{\l,\l,\l^*,\l^*}, \q V=\m{v\;w\;v^*\;w^*} \\ A &= VL\V \\ F &= \f(A) \\ dF &= V\BR{R\h\LR{\V\:dA\:V}}\V \\ }$$ where $\h$ denotes the Hadamard product and the components of the $R$ matrix are $$\eqalign{ R_{jk} &= \begin{cases} {\large\frac{\f(\l_j)-\f(\l_k)}{\l_j-\l_k}} \qq {\rm if}\;\;\l_j\ne\l_k \\ \\ \q\psi(\l_k) \qq\q\;\; {\rm otherwise} \qq \qq \end{cases} \\ }$$ Because the eigenvalues come in conjugate pairs, and assuming $\f(\l^*)=\f(\l)^*$ (true, for example, when $\f$ has a power series with real coefficients), $R$ is a symmetric block matrix $$\eqalign{ \b &= \frac{\f(\l)-\f(\l^*)}{\l-\l^*} = \frac{\Imag{\f(\l)}}{\g} \\ R &= \mqq{ \psi & \psi & \b & \b \\ \psi & \psi & \b & \b \\\hline \b & \b & \psi^* & \psi^* \\ \b & \b & \psi^* & \psi^* \\ } = R^\T \\ \\ \R\! &= \DVR \\ }$$ The matrix function itself can be evaluated in various ways, for example $$\eqalign{ F &= V\:\f(L)\:\V = \quat{f} \qip f=\f(a)=\BEvec{\Real{\f(\l)}\\ \b\,a_1\\ \b\,a_2\\ \b\,a_3} \\ }$$
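To make the recipe concrete, here is a small NumPy sketch of the Daleckii-Krein formula for this problem. It is only an illustration: the helper names are made up, the eigenvectors assume $a_2^2+a_3^2\ne0$, and I picked $\phi(\lambda)=\lambda^3$ solely because the exact derivative $dA\,A^2 + A\,dA\,A + A^2\,dA$ is then available for comparison. Any other $\phi,\psi$ pair can be passed in instead.

```python
import numpy as np

def quat_matrix(a):
    a0, a1, a2, a3 = a
    return np.array([[a0, -a1, -a2, -a3],
                     [a1,  a0, -a3,  a2],
                     [a2,  a3,  a0, -a1],
                     [a3, -a2,  a1,  a0]], dtype=float)

def eigen_data(a):
    """Eigenvector matrix V = [v w v* w*] and eigenvalues (lam, lam, lam*, lam*)."""
    a0, a1, a2, a3 = a
    g = np.sqrt(a1**2 + a2**2 + a3**2)
    lam = a0 + 1j * g
    v = np.array([a1*a2 + 1j*g*a3, a1*a3 - 1j*g*a2, 0, a2**2 + a3**2])
    w = np.array([-a1*a3 + 1j*g*a2, a1*a2 + 1j*g*a3, a2**2 + a3**2, 0])
    V = np.column_stack([v, w, v.conj(), w.conj()])
    eigs = np.array([lam, lam, np.conj(lam), np.conj(lam)])
    return V, eigs

def divided_differences(eigs, phi, psi):
    """R[j,k] = (phi(l_j)-phi(l_k))/(l_j-l_k), or psi(l_k) when l_j == l_k."""
    n = len(eigs)
    R = np.empty((n, n), dtype=complex)
    for j in range(n):
        for k in range(n):
            if eigs[j] == eigs[k]:
                R[j, k] = psi(eigs[k])
            else:
                R[j, k] = (phi(eigs[j]) - phi(eigs[k])) / (eigs[j] - eigs[k])
    return R

def dk_frechet(a, dA, phi, psi):
    """dF = V [R o (V^-1 dA V)] V^-1, with o the Hadamard product."""
    V, eigs = eigen_data(a)
    Vinv = np.linalg.inv(V)
    R = divided_differences(eigs, phi, psi)
    return V @ (R * (Vinv @ dA @ V)) @ Vinv

phi = lambda z: z**3        # test function with a known exact derivative
psi = lambda z: 3 * z**2

a = np.array([1.0, 2.0, -0.5, 3.0])
A = quat_matrix(a)
dA = np.random.default_rng(0).standard_normal((4, 4))  # arbitrary direction

dF = dk_frechet(a, dA, phi, psi)
dF_exact = dA @ A @ A + A @ dA @ A + A @ A @ dA

# Component formula for f = phi(a)
g = np.linalg.norm(a[1:])
val = phi(a[0] + 1j * g)
f = np.concatenate(([val.real], (val.imag / g) * a[1:]))
```

Here `dF` should agree with `dF_exact` up to rounding, and `quat_matrix(f)` should reproduce `A @ A @ A`.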

By vectorizing the Frechet derivative and utilizing the $H$ matrix, we can easily derive an expression for the Jacobian of the quaternion function $\;f=\f(a)$ $$\eqalign{ \vc{dF} &= \vc{V\BR{R\h\LR{\V\:dA\:V}}\V} \\ H\,df &= \LR{\V\k V^\T}^\T\R \LR{V^\T\k\V}\,\vc{dA} \\ df &= \frac14\,H^\T\LR{\V\k V^\T}^\T\R \LR{V^\T\k\V} H\,da \qq \qq \\ J\,\equiv\,\grad fa &= \frac14\,H^\T\LR{\V\k V^\T}^\T\R \LR{V^\T\k\V} H \\ }$$ The interesting thing is that, after testing with a variety of functions and quaternions, the Jacobian always takes the form $$\eqalign{ \def\s{\sigma} J = \mq{ \Real\psi & \s a_1 & \s a_2 & \s a_3 \\ \hline -\s a_1 & S_{11} & S_{12} & S_{13} \\ -\s a_2 & S_{21} & S_{22} & S_{23} \\ -\s a_3 & S_{31} & S_{32} & S_{33} \\ } \in \bbR{4\times 4} }$$ The values of the scale factor $\s$ and the elements of the $3\times 3$ matrix $S$ appear to be random, with no simple connection to $\CR{a,f,\l, etc}.$
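For anyone who wants to reproduce the experiment, this is roughly how the Jacobian formula can be evaluated and compared against central finite differences of $f$. It is a sketch under assumptions: the helper names are illustrative, $\operatorname{vec}$ is column-major, $a_2^2+a_3^2\ne0$, and $\phi=\exp$ is used as the test function.

```python
import numpy as np

def quat_matrix(a):
    a0, a1, a2, a3 = a
    return np.array([[a0, -a1, -a2, -a3],
                     [a1,  a0, -a3,  a2],
                     [a2,  a3,  a0, -a1],
                     [a3, -a2,  a1,  a0]], dtype=float)

def dk_jacobian(a, phi, psi):
    """J = (1/4) H^T (V^-1 (x) V^T)^T Diag(vec R) (V^T (x) V^-1) H."""
    a0, a1, a2, a3 = a
    g = np.sqrt(a1**2 + a2**2 + a3**2)
    lam = a0 + 1j * g
    v = np.array([a1*a2 + 1j*g*a3, a1*a3 - 1j*g*a2, 0, a2**2 + a3**2])
    w = np.array([-a1*a3 + 1j*g*a2, a1*a2 + 1j*g*a3, a2**2 + a3**2, 0])
    V = np.column_stack([v, w, v.conj(), w.conj()])
    Vinv = np.linalg.inv(V)
    eigs = np.array([lam, lam, np.conj(lam), np.conj(lam)])
    R = np.empty((4, 4), dtype=complex)
    for j in range(4):
        for k in range(4):
            if eigs[j] == eigs[k]:
                R[j, k] = psi(eigs[k])
            else:
                R[j, k] = (phi(eigs[j]) - phi(eigs[k])) / (eigs[j] - eigs[k])
    H = np.column_stack([quat_matrix(e).flatten(order="F") for e in np.eye(4)])
    left = np.kron(Vinv.T, V)            # equals (V^-1 (x) V^T)^T
    right = np.kron(V.T, Vinv)
    return 0.25 * H.T @ left @ np.diag(R.flatten(order="F")) @ right @ H

def f_quat(a, phi):
    """f = (Re phi(lam), beta*a1, beta*a2, beta*a3) with beta = Im phi(lam) / gamma."""
    g = np.linalg.norm(a[1:])
    val = phi(a[0] + 1j * g)
    return np.concatenate(([val.real], (val.imag / g) * a[1:]))

def fd_jacobian(a, phi, h=1e-6):
    """Central finite-difference Jacobian of f_quat."""
    J = np.zeros((4, 4))
    for k in range(4):
        e = np.zeros(4)
        e[k] = h
        J[:, k] = (f_quat(a + e, phi) - f_quat(a - e, phi)) / (2 * h)
    return J

a = np.array([1.0, 2.0, -0.5, 3.0])
J_complex = dk_jacobian(a, np.exp, np.exp)   # psi = exp since d/dz exp = exp
J = J_complex.real                           # imaginary part is rounding noise
J_fd = fd_jacobian(a, np.exp)
```

The lower-right block `J[1:, 1:]` is the matrix $S$, and `J[0, 1:]` and `J[1:, 0]` are the two $\pm\sigma a_k$ borders.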

However, I have observed that $S$ is symmetric, which means that the Jacobian matrix itself cannot represent a quaternion. Furthermore, the diagonal elements are unequal.

Interestingly, if you use this Daleckii-Krein method on the $2\times 2$ representation of a Complex variable, the resulting Jacobian is a $2\times 2$ matrix representing the complex number $\psi$ (as expected).
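The complex case is small enough to sketch in a few lines. This assumes the usual representation $z=x+iy \mapsto \begin{bmatrix} x & -y \\ y & x\end{bmatrix}$ with eigenvectors $(1,-i)$ and $(1,i)$, requires $y\ne0$, and uses $\phi=\exp$ as the test function; the function name is illustrative.

```python
import numpy as np

def dk_jacobian_complex(z, phi, psi):
    """Daleckii-Krein Jacobian for the 2x2 real representation of z = x + iy."""
    lam = np.array([z, np.conj(z)])
    V = np.array([[1, 1], [-1j, 1j]])       # eigenvectors for z and z*
    Vinv = np.linalg.inv(V)
    beta = (phi(lam[0]) - phi(lam[1])) / (lam[0] - lam[1])
    R = np.array([[psi(lam[0]), beta],
                  [beta, psi(lam[1])]])
    # vec([[x, -y], [y, x]]) = H @ (x, y), column-major
    H = np.array([[1, 0], [0, 1], [0, -1], [1, 0]], dtype=float)
    left = np.kron(Vinv.T, V)
    right = np.kron(V.T, Vinv)
    return 0.5 * H.T @ left @ np.diag(R.flatten(order="F")) @ right @ H

z = 0.3 + 0.7j
J2 = dk_jacobian_complex(z, np.exp, np.exp)

p = np.exp(z)                               # psi(z) for phi = exp
expected = np.array([[p.real, -p.imag],
                     [p.imag,  p.real]])    # 2x2 representation of psi
```

Here `J2` matches `expected`, which is just the Cauchy-Riemann structure of a holomorphic function.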

I assume that analysts who regularly deal with quaternions need this Jacobian, e.g. to invoke the Chain Rule. While $J$ is a $4\times 4$ matrix, it does not represent a quaternion, therefore it cannot be calculated using standard quaternion operations. So how does one calculate it without resorting to matrix methods?

Also, is there a formula for the $\CR{\s,S_{ij}}$ values?


I've turned up a few articles related to this topic. These papers attempt to construct a quaternionic version of Complex Analysis, but are stymied by the non-commuting nature of the quaternion product and the Cauchy-Riemann Equation.

Disappointingly, the gradient formulas in the papers produce incorrect results when I test them numerically.

greg
  • a quaternion is isomorphic to a real matrix - FYI this does not make technical sense. You can say it can be realized or represented as a real matrix. – Kimball Feb 11 '25 at 14:07
  • If I'm understanding correctly, the scalar function $\phi$ has to be a function of complex numbers? And it's applied to the matrix $\mathrm{Quat}(a)$ by applying it to the eigenvalues in the diagonalization? That's probably something you should include. If you want to connect this to quaternions proper, then you should probably interpret the eigenvalues/eigenvectors in terms of quaternions. I believe we have $\mathrm{Quat}(a)b = a(b+\mathop{\mathrm{Re}}b)$ where on the LHS the quaternion $b$ is considered as a column vector, and on the RHS we use the quaternion product. – Nicholas Todoroff Feb 11 '25 at 20:28
  • So the eigenvalue equation in terms of quaternions is $a(b+\mathop{\mathrm{Re}}b) = \lambda b$. Let $a_0,b_0$ be the real parts of $a,b$ and $a',b'$ the imaginary parts. We can assume $|b|=1$, and then this condition is equivalent to $a(1 + b_0\bar b)$ being a scalar, which is equivalent to $$a' - a_0b_0b' + b_0^2a' - b_0a'\times b' = 0.$$ – Nicholas Todoroff Feb 11 '25 at 20:38
  • @Greg, can you explain how you deduce $f=[Real(\phi(\lambda));\beta a_1, \beta a_2, \beta a_3]$? The Jacobian is directly deduced from this expression. – Steph Feb 24 '25 at 18:09
  • @Steph Here is a sketch of the derivation. And you're right, component-wise differentiation of $f$ leads to the same result as my matrix manipulations, and is much simpler. – greg Feb 25 '25 at 14:18

1 Answer


$ \def\a{\alpha} \def\b{\beta} \def\g{\gamma} \def\l{\lambda} \def\CR#1{\left\lbrace #1\right\rbrace} \def\LR#1{\left(#1\right)} \def\lR#1{\Big(#1\Big)} \def\BR#1{\left[#1\right]} \def\bR#1{\Big[#1\Big]} \def\op#1{\operatorname{#1}} \def\Real#1{\op{\sf Real}\LR{#1}} \def\Imag#1{\op{\sf Imag}\LR{#1}} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\mc#1{\left[\begin{array}{c|c}#1\end{array}\right]} \def\BEvec#1{\begin{Bmatrix}#1\end{Bmatrix}} \def\q{\quad} \def\qq{\qquad} \def\T{{\sf T}} \def\f{\phi} \def\s{\sigma} $In terms of the variable definitions $$\eqalign{ a &= \BEvec{a_0\\ a_1\\ a_2\\ a_3}, \qq v = \BEvec{a_1\\ a_2\\ a_3},\qq &\g=\sqrt{a_1^2 + a_2^2 + a_3^2} \\ \l &= a_0 + i\g, \qq \f =\f(\l),\qq &\psi = \psi(\l) = \frac{d\f}{d\l} \\ \b &= \frac{\Imag{\f}}{\g},\q \s = \frac{\Imag{\psi}}{\g},\q &\xi = \frac{\Real{\g\psi}-\Imag{\f}}{\g^3} \\ }$$ The Jacobian of the quaternion function $\,f=\f(a)\,$ is $$\eqalign{ J &= \grad fa = \mc{ \Real\psi & -\s v^\T \\ \hline \s v & \xi vv^\T+\b I \\ } \qq \qq \qq \qq \qq }$$ Still not sure if a formula exists which uses purely quaternion operations.
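Each entry follows from differentiating $f=\big(\operatorname{Re}\phi,\ \beta a_1,\ \beta a_2,\ \beta a_3\big)$ component-wise with $\partial\gamma/\partial a_k = a_k/\gamma$. As a rough numerical check, the sketch below compares the closed form with central finite differences; the function names are illustrative, $\gamma\ne0$ is assumed, and $\phi$ must satisfy $\phi(\lambda^*)=\phi(\lambda)^*$ (here $\exp$ and $\sin$).

```python
import numpy as np

def f_quat(a, phi):
    """f = (Re phi(lam), beta*v) with lam = a0 + i*gamma, beta = Im phi / gamma."""
    g = np.linalg.norm(a[1:])
    val = phi(a[0] + 1j * g)
    return np.concatenate(([val.real], (val.imag / g) * a[1:]))

def closed_form_jacobian(a, phi, psi):
    """Block formula: [[Re psi, -sigma v^T], [sigma v, xi v v^T + beta I]]."""
    v = a[1:]
    g = np.linalg.norm(v)
    lam = a[0] + 1j * g
    p, s = phi(lam), psi(lam)
    beta = p.imag / g
    sigma = s.imag / g
    xi = (g * s.real - p.imag) / g**3
    J = np.empty((4, 4))
    J[0, 0] = s.real
    J[0, 1:] = -sigma * v
    J[1:, 0] = sigma * v
    J[1:, 1:] = xi * np.outer(v, v) + beta * np.eye(3)
    return J

def fd_jacobian(a, phi, h=1e-6):
    """Central finite-difference Jacobian of f_quat."""
    J = np.zeros((4, 4))
    for k in range(4):
        e = np.zeros(4)
        e[k] = h
        J[:, k] = (f_quat(a + e, phi) - f_quat(a - e, phi)) / (2 * h)
    return J

a = np.array([0.5, 1.0, -2.0, 0.7])          # arbitrary test quaternion

J_exp = closed_form_jacobian(a, np.exp, np.exp)
J_sin = closed_form_jacobian(a, np.sin, np.cos)
err_exp = np.max(np.abs(J_exp - fd_jacobian(a, np.exp)))
err_sin = np.max(np.abs(J_sin - fd_jacobian(a, np.sin)))
```

Both errors should sit at the level of finite-difference noise, and the formula needs only a handful of scalar evaluations of $\phi$ and $\psi$ rather than any eigenvector algebra.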

greg