
tl;dr: How do I construct the symplectic matrix in Williamson's theorem?


I am interested in a constructive proof/version of Williamson's theorem in symplectic linear algebra. Maybe I'm just missing a simple step, so here is what I know:

Let us fix the symplectic form $J=\begin{pmatrix} 0_n & 1_n \\ -1_n & 0_n\end{pmatrix}$. Recall:

Theorem: Let $M\in\mathbb{R}^{2n\times 2n}$ be positive definite. Then there exist a symplectic matrix $S\in Sp(2n,\mathbb{R})$ and a diagonal matrix $D\in\mathbb{R}^{n\times n}$ with positive diagonal entries such that $$M=S^T\tilde{D}S,$$ where $\tilde{D}=\operatorname{diag}(D,D)$.

My basic interest is in how to construct $S$ (i.e. write a MATLAB program that does this). This theorem is cited and used in various places, yet few proofs besides the original seem to be available. Here is the one proof I found; the step most important to me is left unjustified.

Let us denote by $\langle \cdot,\cdot \rangle_M:=\langle \cdot, M\cdot \rangle$ the symmetric bilinear form defined by $M$ and by $\sigma(\cdot,\cdot)=\langle \cdot, J\cdot \rangle$ the symplectic form. The theorem asserts that there is an $M$-orthogonal and symplectic basis. The failure of the basis to be $M$-orthonormal is then given by $D$.

In order to find such a basis, it seems a good idea to look at $JM$. As this is a real matrix that is antisymmetric with respect to $\langle\cdot,\cdot\rangle_M$, its eigenvalues come in pairs $\pm i\lambda_j$ ($j=1,\ldots,n$) and the corresponding eigenvectors in conjugate pairs $e_j\pm if_j$ with real $e_j,f_j$ (the proof in the link considers the eigenvalues and eigenvectors of $M^{-1}J$ instead).

The claim in the proof is now that $\{e_j,f_j\}_j$ forms an $M$-orthonormal basis.

If this is the case, one can easily see that $\delta_{jk}=\langle e_j,Me_k\rangle=-\lambda_k\sigma(e_j,f_k)$ and $0=\langle e_j,Mf_k\rangle=\lambda_k\sigma(e_j,e_k)$ and similarly for $f_k$, hence the basis is indeed $M$-orthogonal and symplectic after normalization by $\sqrt{\lambda_j}$. The matrix $S$ should then consist of the normalized $f_j$ and $e_j$ as columns.

Q1: Why is $\{e_j,f_j\}_j$ an orthonormal basis with respect to $M$? MATLAB suggests that most of the time it is, but I seem to be missing something fundamental, because I don't see why. And why only most of the time? Eigenvalue multiplicities seem to play a role here:

Q2: If the $\lambda_j$ are pairwise distinct, then the claim in Q1 seems to hold. If there are eigenvalue multiplicities, it seems to fail, probably because of the non-uniqueness of the eigenvectors. How can I fix this? I suspect that some Gram–Schmidt procedure with respect to $M$ is necessary, but since I can't answer Q1, I can't tell whether this does the trick. (A minimal version of my numerical experiment is below.)
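For reference, here is a minimal version of the numerical experiment (in Python/NumPy rather than MATLAB; the translation is direct). It builds the candidate $e_j,f_j$ from the eigenvectors of $JM$ exactly as described above and then inspects the Gram matrix with respect to $\langle\cdot,\cdot\rangle_M$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
X = rng.standard_normal((2 * n, 2 * n))
M = X @ X.T + np.eye(2 * n)            # generic positive-definite M
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])

vals, vecs = np.linalg.eig(J @ M)      # eigenvalues come in pairs +/- i*lambda_j
pos = vals.imag > 0                    # keep one eigenvector per conjugate pair
E, F = vecs[:, pos].real, vecs[:, pos].imag   # candidate e_j, f_j as columns
B = np.hstack([E, F])

G = B.T @ M @ B                        # Gram matrix w.r.t. <.,.>_M
print(np.max(np.abs(G - np.diag(np.diag(G)))))  # small for generic M (distinct lambda_j)

# With a degenerate symplectic spectrum the same construction can fail:
# for e.g. M = np.eye(2 * n) (all lambda_j equal), the off-diagonal part of
# the Gram matrix built from whatever eigenvectors eig returns need not vanish.
```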

  • You might want to have a look at my answer here: https://math.stackexchange.com/questions/4214236/how-to-find-s-given-x-such-that-x-sx-dst/4214357#4214357, where I kind of complete your answer below. – amsmath Oct 09 '21 at 12:32

2 Answers


I still can't answer question 1, but I can offer a fix for question 2 by giving a completely different proof of the theorem (this is essentially the same as here, which was the only reference I could find).

Without loss of generality we can write $S=M^{-1/2}K\overline{D}^{1/2}$ with $K\in O(2n)$; then $S^TMS=\overline{D}$ holds by construction. We already know how to find $\overline{D}$ (from the spectrum of $JM$), so if we can choose $K$ such that $S$ is also symplectic, we are done.

Write: $$ S^TJS=J \quad \Leftrightarrow \quad K^TM^{-1/2}JM^{-1/2}K=\overline{D}^{-1/2}J\overline{D}^{-1/2} $$

The right hand side is of the form $$ \overline{D}^{-1/2}J\overline{D}^{-1/2}=\begin{pmatrix} 0_n & D^{-1} \\ -D^{-1} & 0_n \end{pmatrix}$$ and $M^{-1/2}JM^{-1/2}$ is antisymmetric, too. Up to a change of basis, the right hand side is the real diagonalization of the antisymmetric left hand side (as in the spectral theory of skew-symmetric matrices).

This means we can find $K$ by computing the real diagonalization of $M^{-1/2}JM^{-1/2}$ and then permuting the basis into the block structure above. The diagonalization can be done e.g. via the (real) Schur decomposition, which incidentally computes $D$ as well. A sketch of the whole procedure is below.
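Here is such a sketch in Python/NumPy (the function name `williamson` and the sign/permutation bookkeeping are my own choices; SciPy's real Schur form plays the role of the real diagonalization, and MATLAB's `schur(A,'real')` can be used the same way):

```python
import numpy as np
from scipy.linalg import schur

def williamson(M):
    """Return (S, D) with S symplectic and S.T @ M @ S = diag(D, D)."""
    n = M.shape[0] // 2
    J = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-np.eye(n), np.zeros((n, n))]])
    # real symmetric M^{-1/2} via the spectral decomposition of M
    w, V = np.linalg.eigh(M)
    Minvh = V @ np.diag(w ** -0.5) @ V.T
    A = Minvh @ J @ Minvh                 # antisymmetric
    # real Schur form A = Z T Z^T: T is block diagonal (up to rounding) with
    # 2x2 blocks [[0, t_k], [-t_k, 0]]; no 1x1 blocks occur, since A has no
    # real eigenvalues
    T, Z = schur(A, output='real')
    t = np.array([T[2 * k, 2 * k + 1] for k in range(n)])
    for k in np.flatnonzero(t < 0):       # flip one column per block
        Z[:, 2 * k + 1] *= -1.0           # so that every t_k > 0
    t = np.abs(t)
    # permute (e_1, f_1, ..., e_n, f_n) -> (e_1, ..., e_n, f_1, ..., f_n),
    # giving K^T A K = [[0, D^{-1}], [-D^{-1}, 0]] in n x n blocks
    perm = np.r_[0:2 * n:2, 1:2 * n:2]
    K = Z[:, perm]
    D = 1.0 / t                           # the symplectic eigenvalues of M
    S = Minvh @ K @ np.diag(np.sqrt(np.r_[D, D]))
    return S, D
```

For any positive-definite `M`, including one with a degenerate symplectic spectrum, both `S.T @ M @ S` ≈ `diag(D, D)` and `S.T @ J @ S` ≈ `J` should then hold up to rounding.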


Here's another approach to proving Williamson's theorem (same idea as the other answer, different phrasing).

Problem statement

Let $M>0$ be a $2n\times 2n$ real symmetric (strictly) positive definite matrix. Our goal is to show that there's a symplectic matrix $S\in\mathbf{Sp}(2n,\mathbb{R})$ such that $SMS^T = I_2\otimes D$ for some positive diagonal $n\times n$ matrix $D>0$. Here $I_2$ denotes the $2\times2$ identity, and $$I_2\otimes D\equiv D\oplus D \equiv \begin{pmatrix}D&0\\0&D \end{pmatrix}.$$ We also remember that $S$ being symplectic means it satisfies $$S(J\otimes I_n)S^T = J\otimes I_n, \qquad J \equiv \begin{pmatrix}0&1\\-1&0\end{pmatrix}.$$

Proof

To prove this, observe the following:

  1. Given any orthogonal $O\in\mathbf{O}(2n)$, the matrix $S=(I_2\otimes \sqrt D) O M^{-1/2}$ is such that $SMS^T=I_2\otimes D$.

  2. If we furthermore want $S$ to be symplectic, then we want $S(J\otimes I_n)S^T=(J\otimes I_n)$, where $J\equiv \begin{pmatrix}0&1\\-1&0\end{pmatrix}$. Using the expression above for $S$, this constraint amounts to $$(I_2\otimes \sqrt D) OM^{-1/2} (J\otimes I_n)M^{-1/2} O^T (I_2\otimes \sqrt D) = J\otimes I_n.$$

  3. To find the orthogonal matrix $O$ satisfying the above relation we just need to observe that $M^{-1/2} (J\otimes I_n)M^{-1/2}$ is a skew-symmetric real matrix. For any real skew-symmetric matrix $A$ there's a real orthogonal $O$ such that $OAO^T = J\otimes \Lambda$ for some diagonal real matrix $\Lambda$. It follows that we can always find $O$ such that $$OM^{-1/2} (J\otimes I_n)M^{-1/2} O^T= J\otimes D^{-1}.$$ To be more precise, we here denoted by $D^{-1}$ the matrix of eigenvalues of $M^{-1/2}(J\otimes I_n)M^{-1/2}$. In other words, the $D$ at the beginning of the statement contains the reciprocals of the eigenvalues of this matrix.

  4. With the above choice of $O$ we have $$S(J\otimes I_n)S^T= (I_2\otimes\sqrt D)(J\otimes D^{-1})(I_2\otimes \sqrt D) = J\otimes I_n,$$ that is to say, $S$ is symplectic.


In summary, we just showed that to find the symplectic $S$ such that $SMS^T=I_2\otimes D$, we must

  1. Find the orthogonal matrix $O$ that "real diagonalises" the skew-symmetric matrix $A \equiv M^{-1/2} (J\otimes I_n)M^{-1/2}$. In practice, this can be done for example by computing the singular value decomposition of $A$ and grouping together the principal components corresponding to each singular value (we are ensured that for each singular value $s_k$ there must be exactly two orthonormal vectors $v_k,w_k$ such that $Av_k=s_k w_k$ and $Aw_k=-s_k v_k$). Building a matrix out of these vectors will produce $O$.
  2. From $O$ just compute $S=(I_2\otimes \sqrt D)O M^{-1/2}$. (A small correctness check for the resulting pair $(S,D)$ is sketched below.)
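As a sanity check, here is a small helper in Python/NumPy (the name `is_williamson_pair` is just illustrative) that tests the two defining properties in the conventions of this answer. Note that a matrix satisfying $S^TMS=\operatorname{diag}(D,D)$, as in the other answer, is converted to this convention simply by transposing it, since $(S^T)M(S^T)^T=S^TMS$.

```python
import numpy as np

def is_williamson_pair(S, D, M, atol=1e-10):
    """Test S M S^T = I_2 (x) D and S (J (x) I_n) S^T = J (x) I_n,
    where D is the 1-D array of symplectic eigenvalues."""
    n = len(D)
    JI = np.kron(np.array([[0., 1.], [-1., 0.]]), np.eye(n))  # J (x) I_n
    ok_diag = np.allclose(S @ M @ S.T, np.kron(np.eye(2), np.diag(D)), atol=atol)
    ok_symp = np.allclose(S @ JI @ S.T, JI, atol=atol)
    return ok_diag and ok_symp
```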

Examples

Diagonal 2x2 matrix

For a simple example suppose $$M = \begin{pmatrix}\alpha & 0 \\ 0 & \beta\end{pmatrix}$$ for some real $\alpha,\beta>0$. Then $$M^{-1/2} J M^{-1/2} = (\alpha\beta)^{-1/2} J,$$ and thus $O = I$, $D = \sqrt{\alpha\beta}$, and $$S = (\alpha\beta)^{1/4} M^{-1/2} = \begin{pmatrix}\gamma&0\\0&1/\gamma\end{pmatrix}, \quad \gamma\equiv (\beta/\alpha)^{1/4}.$$ It's easy to then verify that $S$ is indeed symplectic, and satisfies $SMS^T=\sqrt{\alpha\beta}I$.
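A quick numerical check of this example (NumPy, with arbitrary values standing in for $\alpha,\beta$):

```python
import numpy as np

a, b = 2.0, 5.0                       # arbitrary alpha, beta > 0
M = np.diag([a, b])
g = (b / a) ** 0.25
S = np.diag([g, 1.0 / g])
J = np.array([[0., 1.], [-1., 0.]])
print(np.allclose(S @ M @ S.T, np.sqrt(a * b) * np.eye(2)))  # True
print(np.allclose(S @ J @ S.T, J))                           # True
```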

Diagonal 4x4 matrix

For a slightly less trivial toy example consider $$M = \operatorname{diag}(\alpha,\beta,\gamma,\delta)$$ for some real $\alpha,\beta,\gamma,\delta>0$. Then we have $$M^{-1/2}(J\otimes I) M^{-1/2} = J \otimes D^{-1}, \qquad D\equiv \begin{pmatrix}\sqrt{\alpha\gamma}&0\\0&\sqrt{\beta\delta}\end{pmatrix}.$$ In this case $O=J\otimes I$ works, $D$ is the one above, and therefore $$S = \begin{pmatrix}0&0&(\alpha/\gamma)^{1/4}&0\\ 0&0&0&(\beta/\delta)^{1/4} \\ -(\gamma/\alpha)^{1/4} & 0 & 0 & 0 \\ 0 & -(\delta/\beta)^{1/4} & 0 &0 \end{pmatrix}. $$
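And the analogous check for this example (again with arbitrary positive values):

```python
import numpy as np

al, be, ga, de = 1.0, 2.0, 3.0, 4.0   # arbitrary alpha, beta, gamma, delta > 0
M = np.diag([al, be, ga, de])
S = np.zeros((4, 4))
S[0, 2] = (al / ga) ** 0.25
S[1, 3] = (be / de) ** 0.25
S[2, 0] = -(ga / al) ** 0.25
S[3, 1] = -(de / be) ** 0.25
JI = np.kron(np.array([[0., 1.], [-1., 0.]]), np.eye(2))      # J (x) I_2
D = np.diag([np.sqrt(al * ga), np.sqrt(be * de)])
print(np.allclose(S @ M @ S.T, np.kron(np.eye(2), D)))  # True
print(np.allclose(S @ JI @ S.T, JI))                    # True
```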
