
I'm trying to show that in the polar decomposition $A= RL$ of $A\in O(p,q), \; p,q\geq 1$, the orthogonal factor satisfies $R\in O(p)\times O(q)$. Here I define $O(p,q) =\{A\in M(n,n,\mathbb{R})\mid A^TI_{p,q}A=I_{p,q}\}$ with $n=p+q$, where $$I_{p,q} = \begin{pmatrix}I_p & 0 \\ 0 & -I_q \end{pmatrix}.$$ Also, $L$ is assumed to be symmetric positive definite.

For a proof, I start with the fact that since $A^T A$ is symmetric and positive definite, it has a unique positive definite square root, call it $L$. That is, $L = \sqrt{A^TA}$. Set $R=AL^{-1}$. From here, we have $$R^TI_{p,q}R = L^{-1}A^TI_{p,q}AL^{-1} =L^{-1}I_{p,q}L^{-1}.$$ It seems like I should use the fact that $A^TA$ is necessarily in $O(p,q)$ since $A\in O(p,q)$. However, this doesn't seem to get me closer to showing that $R$ is a block diagonal matrix with first block in $O(p)$ and second block in $O(q)$.

Any help would be greatly appreciated!


You have $A \in H:=O(p,q)$, the group of matrices $A\in GL_n(\mathbb R)$ satisfying the stabilizer equation $A^TI_{p,q}A=I_{p,q}$ (for some fixed choice of $p,q$ with $p+q=n$).

If you consider the special case $U\in H\cap O_n(\mathbb R)$, then $U^T I_{p,q}U =I_{p,q}\implies I_{p,q}U =UI_{p,q}$, and by direct calculation it follows that $U$ is a block diagonal matrix with a leading $p\times p$ orthogonal block followed by a $q\times q$ orthogonal block. Thus the core of the proof is showing that $(A^TA)^\frac{1}{2}\in H$; everything else follows.
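Here is a sketch of that direct calculation. Writing $U$ in blocks conformal with $I_{p,q}$,
$$U = \begin{pmatrix} U_{11} & U_{12} \\ U_{21} & U_{22}\end{pmatrix},\qquad I_{p,q}U = UI_{p,q} \implies \begin{pmatrix} U_{11} & U_{12} \\ -U_{21} & -U_{22}\end{pmatrix} = \begin{pmatrix} U_{11} & -U_{12} \\ U_{21} & -U_{22}\end{pmatrix},$$
so $U_{12} = -U_{12}$ and $U_{21}=-U_{21}$, i.e. $U_{12}=0$ and $U_{21}=0$; the orthogonality of $U$ then forces $U_{11}\in O(p)$ and $U_{22}\in O(q)$.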

In general we know that the polar decomposition $A=UP$ of an invertible matrix exists (e.g. via the SVD) and is unique, with $P = \big(A^TA\big)^\frac{1}{2}$. You've already inferred that $A\in H \implies A^T \in H \implies (A^TA) \in H$. So it remains to prove that
$P\in H$, since $P\in H\implies U = AP^{-1}\in H$, hence $U\in H\cap O_n(\mathbb R)$, and paragraph 2 gives the result.
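Explicitly, the one-line check that $P\in H\implies U\in H$: using $A\in H$, $P=P^T$, and $PI_{p,q}P=I_{p,q}$ (which rearranges to $P^{-1}I_{p,q}P^{-1}=I_{p,q}$),
$$U^TI_{p,q}U = P^{-1}A^TI_{p,q}AP^{-1} = P^{-1}I_{p,q}P^{-1} = I_{p,q}.$$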


shorter proof
Let $B:=A^TA$, with $m$ distinct eigenvalues $\lambda_1\gt \lambda_2\gt \cdots \gt \lambda_{m}\gt 0$. Since $B\in H$ and $B=B^T$, our stabilizer equation tells us
$BI_{p,q}B =I_{p,q}$, i.e. $BI_{p,q} =I_{p,q}B^{-1}$, and in general $B^k I_{p,q} =I_{p,q}B^{-k}$, so for any polynomial $Q$

$Q\big(B\big)I_{p,q}=I_{p,q}Q\big(B^{-1}\big)$
now using e.g. a Vandermonde matrix, select $Q$ to be a polynomial of degree at most $2m-1$ such that
$Q(\lambda_j) = \lambda_j^\frac{1}{2}$ and $Q(\lambda_j^{-1}) = (\lambda_j^{-1})^\frac{1}{2}$ for every $j$ (if some of the $2m$ interpolation points coincide, e.g. $\lambda_k=\lambda_j^{-1}$, the corresponding conditions agree, so such a $Q$ still exists; a small worked example follows the proof)

$\implies B^\frac{1}{2}I_{p,q}= Q\big(B\big)I_{p,q}=I_{p,q}Q\big(B^{-1}\big)=I_{p,q}\big(B^{-1}\big)^\frac{1}{2}=I_{p,q}\big(B^\frac{1}{2}\big)^{-1}$
$\implies B^\frac{1}{2}I_{p,q}B^\frac{1}{2}=I_{p,q}$
i.e. $P\in H$ as desired.
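As a concrete illustration of the interpolation step (just a sanity check, not part of the proof): take $m=2$ with $\lambda_1 = 4$ and $\lambda_2 = \tfrac14$ (the eigenvalues of $B$ necessarily come in reciprocal pairs, since $B^{-1}=I_{p,q}BI_{p,q}$ is similar to $B$). The conditions $Q(\lambda_j)=\lambda_j^\frac{1}{2}$ and $Q(\lambda_j^{-1})=(\lambda_j^{-1})^\frac{1}{2}$ reduce to the two requirements $Q(4)=2$ and $Q(\tfrac14)=\tfrac12$, which are met by
$$Q(x)=\tfrac{2}{5}x+\tfrac{2}{5},$$
so for this spectrum $Q(B)=B^\frac{1}{2}$ and $Q(B^{-1})=\big(B^{-1}\big)^\frac{1}{2}$, as used above.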


original proof
Define the symmetric non-degenerate bilinear form $\langle\,,\rangle : \mathbb R^n \times \mathbb R^n\longrightarrow \mathbb R$ given by $\langle \mathbf x, \mathbf y\rangle = \mathbf x^T I_{p,q}\mathbf y$.

Let $A^TA$ have $m\geq 2$ distinct eigenvalues $\lambda_1\gt \lambda_2\gt \cdots \gt \lambda_{m}\gt 0$ with associated eigenspaces $W_1, \dots, W_m$ (if $m=1$ then $A^TA=I$ and there is nothing to prove). Consider how they behave under the form by selecting non-zero vectors $\mathbf w_i\in W_i$ and $\mathbf w_k \in W_k$.

$\langle \mathbf w_i,\mathbf w_k \rangle = \langle (A^TA) \mathbf w_i,(A^TA)\mathbf w_k \rangle= \lambda_i\cdot \lambda_k\cdot \langle \mathbf w_i,\mathbf w_k \rangle \implies (1-\lambda_i\cdot \lambda_k)\cdot\langle \mathbf w_i,\mathbf w_k \rangle = 0$ (the first equality holds because $A^TA\in H$ preserves the form)

That is, $\lambda_k \neq \lambda_i^{-1} \implies \langle \mathbf w_i,\mathbf w_k \rangle =0$, or equivalently
$\langle \mathbf w_i,\mathbf w_k \rangle =0$ if $k\neq m-i+1$.
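The pairing used here, $\lambda_{m-i+1}=\lambda_i^{-1}$, deserves a one-line justification: from $(A^TA)I_{p,q}(A^TA)=I_{p,q}$ we get
$$(A^TA)^{-1} = I_{p,q}(A^TA)I_{p,q},$$
so $A^TA$ and $(A^TA)^{-1}$ are similar and share their spectrum; the eigenvalues are therefore closed under $\lambda\mapsto\lambda^{-1}$, and listing them in decreasing order gives $\lambda_i^{-1}=\lambda_{m-i+1}$.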
Notice that this exact same orthogonality relationship holds when we compute $ \langle P \mathbf w_i,P\mathbf w_k \rangle$,
and more generally
$\langle (A^TA) \mathbf w_i,(A^TA)\mathbf w_k \rangle=\langle \mathbf w_i,\mathbf w_k \rangle = \langle P \mathbf w_i,P\mathbf w_k \rangle$, since $\langle P \mathbf w_i,P\mathbf w_k \rangle = \lambda_i^\frac{1}{2}\lambda_k^\frac{1}{2}\langle \mathbf w_i,\mathbf w_k \rangle$ and either $k=m-i+1$ (so $\lambda_i\lambda_k=1$) or $\langle \mathbf w_i,\mathbf w_k \rangle=0$. This essentially gives us our answer. For an explicit finish using bilinear forms:

With $\dim W_k = r_k$, build a basis for each eigenspace; e.g. $\big\{\mathbf w_1^{(1)},\dots,\mathbf w_1^{(r_1)}\big\}$ is a basis for $W_1$. Now
$P \in H \iff \langle \mathbf x, \mathbf y\rangle =\langle P\mathbf x, P\mathbf y\rangle$ for all $\mathbf x, \mathbf y \in \mathbb R^n$

using the basis vectors for the eigenspaces of $A^TA$, write, for arbitrary $\mathbf x,\mathbf y$,
$\mathbf x=\sum_{i=1}^m \sum_{j=1}^{r_i} \alpha_{i,j} \mathbf w_i^{(j)}$
$\mathbf y=\sum_{k=1}^m \sum_{j'=1}^{r_k} \eta_{k,j'} \mathbf w_k^{(j')}$

$\langle \mathbf x, \mathbf y\rangle$
$=\langle \sum_{i=1}^m\sum_{j=1}^{r_i} \alpha_{i,j} \mathbf w_i^{(j)}, \sum_{k=1}^m \sum_{j'=1}^{r_k} \eta_{k,j'} \mathbf w_k^{(j')}\rangle$
$=\sum_{i=1}^m \sum_{j=1}^{r_i}\alpha_{i,j}\cdot\langle \mathbf w_i^{(j)}, \sum_{k=1}^m\sum_{j'=1}^{r_k}\eta_{k,j'}\mathbf w_k^{(j')}\rangle$
$=\sum_{i=1}^m \sum_{j=1}^{r_i}\sum_{k=1}^m\sum_{j'=1}^{r_k}\alpha_{i,j}\cdot\eta_{k,j'}\cdot\langle \mathbf w_i^{(j)}, \mathbf w_k^{(j')}\rangle$
$=\sum_{i=1}^m \sum_{j=1}^{r_i}\sum_{k=1}^m\sum_{j'=1}^{r_k}\alpha_{i,j}\cdot\eta_{k,j'}\cdot\langle P\mathbf w_i^{(j)}, P\mathbf w_k^{(j')}\rangle$
$=\sum_{i=1}^m \sum_{j=1}^{r_i}\alpha_{i,j}\cdot\langle P\mathbf w_i^{(j)}, \sum_{k=1}^m\sum_{j'=1}^{r_k}\eta_{k,j'}P\mathbf w_k^{(j')}\rangle$
$=\langle \sum_{i=1}^m\sum_{j=1}^{r_i} \alpha_{i,j} P\mathbf w_i^{(j)}, \sum_{k=1}^m \sum_{j'=1}^{r_k} \eta_{k,j'} P\mathbf w_k^{(j')}\rangle$
$=\langle P\sum_{i=1}^m\sum_{j=1}^{r_i} \alpha_{i,j} \mathbf w_i^{(j)}, P\sum_{k=1}^m \sum_{j'=1}^{r_k} \eta_{k,j'} \mathbf w_k^{(j')}\rangle$
$=\langle P\mathbf x, P\mathbf y\rangle$
$\implies P\in H$

which completes the proof.

alternative matrix-based finish
This is conceptually simple though notationally heavy, so I give an outline.
Let the basis vectors for the eigenspaces $W_k$ be orthonormal under the dot product and collect them as the columns of an orthogonal matrix $V$.

$V^T (A^TA)V = \left[\begin{matrix}\lambda_1 I_{r_1} & \mathbf 0&\cdots &\mathbf 0\\ \mathbf 0& \lambda_2 I_{r_2} &\cdots &\mathbf 0\\ \vdots & \vdots & \ddots & \vdots\\ \mathbf 0&\mathbf 0&\cdots &\lambda_mI_{r_m} \end{matrix}\right]=D$

$V^T I_{p,q}V$
is a symmetric block anti-diagonal matrix (block $(i,k)$ vanishes unless $k=m-i+1$, by the orthogonality of the eigenspaces under the form established above), e.g. with $C_1 \in \mathbb R^{r_m\times r_1}$ in its bottom left corner and $C_1^T$ in the top right corner, then $C_2$ on the next anti-diagonal block, and so on.

$(V^T I_{p,q}V)^2=I_n \implies r_m=r_1$ (because $C_1^T C_1 = I_{r_1}$ and $C_1C_1^T = I_{r_m}$) i.e. each block on the anti-diagonal is square. Examining the stabilizer equation
$(A^TA)I_{p,q}(A^TA)= I_{p,q}$
$\implies D (V^T I_{p,q}V) D= V^T(A^TA)I_{p,q}(A^TA)V= V^TI_{p,q}V$
and confirm that the symmetric multiplication by $D$ on the LHS multiplies the block of $\big(V^T I_{p,q}V\big)$ in anti-diagonal block position $(i,\,m-i+1)$ by $\lambda_i \cdot \lambda_{m-i+1}=\lambda_i \cdot \lambda_i^{-1}=1$; this survives even if we take square roots of all eigenvalues (i.e. replace $D$ by $D^\frac{1}{2}$), giving the proof.
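Spelling out that last step (a sketch, using the notation above): write $M:=V^TI_{p,q}V$ and $M_{[i][k]}$ for its block in block position $(i,k)$. Then
$$\big(D^\frac{1}{2} M D^\frac{1}{2}\big)_{[i][k]} = \lambda_i^\frac{1}{2}\lambda_k^\frac{1}{2}\, M_{[i][k]} = M_{[i][k]},$$
since $M_{[i][k]}=0$ unless $k=m-i+1$, in which case $\lambda_i\lambda_k=1$. Hence $D^\frac{1}{2} M D^\frac{1}{2}=M$; undoing the conjugation by $V$ (note $V^TPV=D^\frac{1}{2}$) this says $PI_{p,q}P=I_{p,q}$, i.e. $P\in H$.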
