26

Let $A$ be an $m \times n$ matrix with a singular value decomposition $A=U\Sigma V^T$. Show that the matrix $Q=UV^T$ is the nearest orthogonal matrix to $A$ in the Frobenius norm, i.e., that it attains

$$\min_{Q^TQ=I_{n \times n}} \|A-Q\|_F.$$

BadAtMath
  • 439
  • 3
    If $A$ is $m\times n$, then also $UV^T$ is $m\times n$. So, I guess it is assumed that $m = n$? – Friedrich Philipp Apr 03 '17 at 02:09
  • 2
    @FriedrichPhilipp I think he meant to say that $m>n$ and that $Q$ has orthonormal columns (such that $Q^TQ$ is the identity). – Car Loz Dec 07 '21 at 16:11
  • 1
    The solution when $n=m$ is given in the answer below. For the most general case, where $A$ does not even have full column rank, this paper proves the statement: https://doi.org/10.1007%2FBF02289451 – Car Loz Dec 07 '21 at 16:42

1 Answer

17

Note that, as stated, the question only makes sense if $n=m$, because in the singular value decomposition of $A$, $U$ will be $m\times m$ and $V^T$ will be $n\times n$.
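
The shape remark can be checked numerically. The following is a minimal sketch, assuming NumPy and an arbitrary random $5\times 3$ matrix chosen only for illustration; it prints the factor shapes for the full SVD and, for comparison, for the reduced SVD that the comments mention.

```python
import numpy as np

# Illustrative sizes only; any m > n shows the same pattern.
rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))

# Full SVD: U is m x m, V^T is n x n, so U @ V^T is not even defined when m != n.
U_full, s_full, Vt_full = np.linalg.svd(A, full_matrices=True)

# Reduced SVD: U is m x n (orthonormal columns), V^T is n x n, so U @ V^T is m x n.
U_red, s_red, Vt_red = np.linalg.svd(A, full_matrices=False)

full_shapes = (U_full.shape, Vt_full.shape)
reduced_shapes = (U_red.shape, Vt_red.shape)

print("full SVD shapes:   ", full_shapes)
print("reduced SVD shapes:", reduced_shapes)
```

With the reduced factors the product $UV^T$ has the same shape as $A$, which is the reading suggested in the comments for $m>n$.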

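Because the Frobenius norm is unitarily invariant, you have $$ \|A-Q\|_F=\|U\Sigma V^T-Q\|_F=\|\Sigma-U^TQV\|_F. $$

The orthogonal (or unitary) matrices form a group, so $U^TQV$ runs over all orthogonal matrices as $Q$ does. Relabelling $U^TQV$ as $Q$, you want to minimize $$ \|\Sigma-Q\|_F $$ over all orthogonal matrices $Q$. Since $\Sigma$ is diagonal, you have \begin{align} \|\Sigma-Q\|_F^2&=\sum_k(\Sigma_{kk}-Q_{kk})^2+\sum_{j\ne k}Q_{kj}^2\\ \ \\ &=\sum_k(\Sigma_{kk}^2+Q_{kk}^2-2\Sigma_{kk}Q_{kk})+\sum_{j\ne k}Q_{kj}^2\\ \ \\ &=\sum_k(\Sigma_{kk}^2-2\Sigma_{kk}Q_{kk})+\sum_{j,k}Q_{kj}^2\\ \ \\ &=\text{Tr}(\Sigma^2)+\text{Tr}(Q^TQ)-2\sum_k\Sigma_{kk}Q_{kk}\\ \ \\ &=\text{Tr}(\Sigma^2)+n-2\sum_k\Sigma_{kk}Q_{kk} \end{align}

To minimize this quantity over $Q$, note that the entries of $\Sigma$ are non-negative and that $Q_{kk}\in[-1,1]$, because each column of $Q$ has unit norm. The sum $\sum_k\Sigma_{kk}Q_{kk}$ is therefore at most $\sum_k\Sigma_{kk}$, with equality when $Q_{kk}=1$ for all $k$. A unit-norm column whose diagonal entry is $1$ has all other entries equal to zero, so this choice is $Q=I$. (If some $\Sigma_{kk}=0$, other minimizers exist as well, but $Q=I$ is always one of them.)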

In terms of the original variable, the minimizer satisfies $U^TQV=I$, i.e. $Q=UV^T$, so the minimum is $$ \|\Sigma-I\|_F=\|U\Sigma V^T-UV^T\|_F=\|A-UV^T\|_F. $$
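
As a sanity check on the square case, here is a minimal NumPy sketch. The random $4\times 4$ test matrix and the helper `random_orthogonal` are assumptions made for illustration and are not part of the answer. It compares $\|A-UV^T\|_F$ with the predicted value $\|\Sigma-I\|_F$ and with the distance from $A$ to many randomly drawn orthogonal matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))

# Candidate minimizer from the SVD.
U, s, Vt = np.linalg.svd(A)
Q_star = U @ Vt

best = np.linalg.norm(A - Q_star, "fro")
# ||Sigma - I||_F: both matrices are diagonal, so only the diagonal entries matter.
predicted = np.linalg.norm(s - 1.0)

def random_orthogonal(n, rng):
    """Hypothetical helper: Q factor of a Gaussian matrix.

    The result is orthogonal, though not exactly uniformly distributed.
    """
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q

# Distances from A to many other orthogonal matrices.
others = [
    np.linalg.norm(A - random_orthogonal(n, rng), "fro") for _ in range(1000)
]

print("||A - U V^T||_F      :", best)
print("||Sigma - I||_F      :", predicted)
print("best random candidate:", min(others))
```

A random search like this cannot prove optimality; it only illustrates that no sampled orthogonal matrix comes closer to $A$ than $UV^T$ does.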

Martin Argerami
  • 217,281
  • It is $\text{Tr}(\Sigma^2)$, not $\text{Tr}(\Sigma)$. It should also be $\|\Sigma-Q\|_F^2$, with the square. – Friedrich Philipp Apr 03 '17 at 01:55
  • Indeed! Editing. – Martin Argerami Apr 03 '17 at 02:26
  • 1
    Some SVD formulas are based on $U$ being $m\times q$, $\Lambda$ being $q\times q$, where $q$ is the number of positive singular values, and $V^T$ being $q\times n$, so the original question can perhaps go through if we just assume orthonormal columns for $UV^T$. – FinanceGuyThatCantCode Jul 01 '19 at 20:20
  • 2
    The original question does make sense when $m>n$. – Car Loz Dec 07 '21 at 16:27
  • 1
    I'd just like to reiterate Carlos' statement that the original question does make sense when $m > n$ (using a reduced SVD). Theorem 4 of Sparse Principal Component Analysis by Zou, Hastie and Tibshirani gives the required result when $m > n$. The proof on the Wikipedia page for the Orthogonal Procrustes problem also works. – Student Mar 16 '22 at 05:41
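
The $m>n$ reading raised in the comments can be explored numerically in the same way. This is a minimal sketch, assuming NumPy, a random $6\times 3$ matrix, and a hypothetical helper `random_orthonormal_columns`. It uses the reduced SVD, so that $Q=UV^T$ is $m\times n$ with $Q^TQ=I_n$, and compares it against random matrices with orthonormal columns.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 6, 3
A = rng.standard_normal((m, n))

# Reduced SVD: U is m x n with orthonormal columns, V^T is n x n.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Q_star = U @ Vt  # m x n, satisfies Q^T Q = I_n

best = np.linalg.norm(A - Q_star, "fro")
# A - U V^T = U (Sigma - I) V^T, so the distance is again ||Sigma - I||_F.
predicted = np.linalg.norm(s - 1.0)

def random_orthonormal_columns(m, n, rng):
    """Hypothetical helper: reduced QR of a Gaussian matrix gives an m x n Q."""
    Q, _ = np.linalg.qr(rng.standard_normal((m, n)))
    return Q

# Distances from A to many other matrices with orthonormal columns.
others = [
    np.linalg.norm(A - random_orthonormal_columns(m, n, rng), "fro")
    for _ in range(1000)
]

print("||A - U V^T||_F      :", best)
print("||Sigma - I||_F      :", predicted)
print("best random candidate:", min(others))
```

As before, this is only an empirical check; the references cited in the comments supply the actual proof for $m>n$.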