
I'm trying to solve a homework question but I got stuck.

Let $A$ be an $m \times n$ matrix with the SVD $A = U \Sigma V^*$, and let $A^+ = (A^* A)^{-1} A^*$ be its pseudoinverse.

I'm trying to get $A^+ = V \Sigma^{-1} U^*$, but I'm missing something.

Can anyone help me with this please?

Thanks!

paulochf

4 Answers


$$ \begin{align} A^+ &= (A^*A)^{-1}A^* \\ &=(V\Sigma U^*U\Sigma V^*)^{-1} V\Sigma U^* \\ &=(V\Sigma^2 V^*)^{-1} V\Sigma U^* \\ &=(V^*)^{-1} \Sigma^{-2} V^{-1} V\Sigma U^* \\ &= V \Sigma^{-2}\Sigma U^* \\ &= V\Sigma^{-1}U^* \end{align} $$ using the properties of the matrices $U,V,\Sigma$ in the singular value decomposition (this takes the reduced SVD of a full-column-rank $A$, so that $\Sigma$ is square and invertible).
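If you want to sanity-check the final identity numerically, here is a short sketch using NumPy (the matrix $A$ below is an arbitrary full-column-rank example, not anything from the question):

```python
import numpy as np

# Arbitrary tall matrix with full column rank (m = 5, n = 3).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

# Reduced SVD: U is 5x3, S holds the 3 singular values, Vt is 3x3.
U, S, Vt = np.linalg.svd(A, full_matrices=False)

lhs = np.linalg.inv(A.T @ A) @ A.T    # (A* A)^{-1} A*
rhs = Vt.T @ np.diag(1 / S) @ U.T     # V Sigma^{-1} U*

assert np.allclose(lhs, rhs)
```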

Rasmus
  • Hi Rasmus! I could get to the 3rd line by myself; the 4th one was my point of doubt. I forgot to invert the $\left( \cdot \right)^{-1}$ sequence! Thanks for pointing that out! =) – paulochf Feb 02 '11 at 15:12
  • It could happen that $\Sigma^{-1}$ does not exist, if $\sigma_i=0$ for some $i$. – Herman Jaramillo Aug 06 '19 at 20:44
  • If $\Sigma$ is not square (and thus not invertible), the result still holds! Just show directly that $(V \Sigma' \Sigma V')^{-1} = V(\Sigma' \Sigma )^+V'$ and that $(\Sigma' \Sigma )^+ \Sigma' = \Sigma^+$. The result will follow by adjusting the derivation in the answer. I'm not super familiar with the pseudoinverse $+$ notation, but hopefully I'm using it correctly. – Smithey Jun 27 '20 at 02:10

First you need to assume that the matrix $A^*A$ is invertible, which requires $n \leq m$ and $\operatorname{rank}(A) = n$.

So when $n \leq m$ and $\operatorname{rank}(A) = n$, the reduced SVD of $A$ is $A = U \Sigma V^*$, where $U \in \mathbb{R}^{m \times n}$, $\Sigma \in \mathbb{R}^{n \times n}$ and $V \in \mathbb{R}^{n \times n}$ are such that $U^* U = I_{n \times n}$, $V^* V = I_{n \times n}$, $V V^* = I_{n \times n}$, and $\Sigma$ is a square diagonal matrix with only positive real entries.

Note that $V^{-1}=V^*$.

Also note that $A^* = V \Sigma^* U^* = V \Sigma U^*$ since $\Sigma^* = \Sigma$.

Further note that if $M_1, M_2$ and $M_3$ are invertible matrices, then $(M_1 M_2 M_3)^{-1} = M_3^{-1} M_2^{-1} M_1^{-1}$.

Use these to get the final answer.
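The hints above can be checked numerically; a sketch using NumPy (the matrices here are arbitrary examples, not from the problem):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hint 3: inversion reverses the order of the factors.
M1, M2, M3 = (rng.standard_normal((4, 4)) for _ in range(3))
lhs = np.linalg.inv(M1 @ M2 @ M3)
rhs = np.linalg.inv(M3) @ np.linalg.inv(M2) @ np.linalg.inv(M1)
assert np.allclose(lhs, rhs)

# Hints 1-2: in the reduced SVD of a tall full-rank matrix,
# U^* U = I and V is square unitary, so V^{-1} = V^*.
A = rng.standard_normal((6, 4))
U, S, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(U.T @ U, np.eye(4))
assert np.allclose(np.linalg.inv(Vt.T), Vt)
```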

jncraton
  • Given that the question is homework - which I realise only now - it would have probably been better to restrict to hints like you did. Sorry. – Rasmus Feb 01 '11 at 22:37
  • Hey Sivaram, I was in a hurry when I posted, so it was a (bad one) typo. Thanks for fixing the title! – paulochf Feb 02 '11 at 15:05
  • I knew all these hints but I missed your "further note" and did not invert the sequence inside those parentheses. Thanks for your patience! – paulochf Feb 02 '11 at 15:14

If the dimensions of $A$ are $m \times n$ with $m \neq n$, then there is no way to derive $A^+ = V\Sigma^{-1}U^*$, because $\Sigma$ has the same dimensions as $A$ and is therefore not invertible. Any source on the SVD writes the full decomposition as $A = U_{m \times m} \Sigma_{m \times n} V^T_{n \times n}$. If $A$ is rectangular, the derivation you're looking for is instead $$\begin{align} A^+ &=(A^TA)^{-1}A^T\\ &=(V\Sigma^TU^T U\Sigma V^T)^{-1}V\Sigma^TU^T\\ &=(V\Sigma^T\Sigma V^T)^{-1}V\Sigma^TU^T \\ &=(V^T)^{-1}(\Sigma^T \Sigma)^{-1}V^{-1}V\Sigma^TU^T \\ &=V(\Sigma^T \Sigma)^{-1}\Sigma^TU^T \end{align}$$
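The rectangular-case formula can be verified against NumPy's built-in pseudoinverse; a sketch (the example matrix is arbitrary, assumed to have full column rank):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))   # rectangular, m != n, full column rank

# Full SVD: U is 5x5, Vt is 3x3, Sigma is 5x3 with S on its diagonal.
U, S, Vt = np.linalg.svd(A, full_matrices=True)
Sigma = np.zeros((5, 3))
Sigma[:3, :3] = np.diag(S)

# V (Sigma^T Sigma)^{-1} Sigma^T U^T from the derivation above.
pinv = Vt.T @ np.linalg.inv(Sigma.T @ Sigma) @ Sigma.T @ U.T
assert np.allclose(pinv, np.linalg.pinv(A))
```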

  • The result still holds for non-square matrices (unless I made a mistake); see my comment above. – Smithey Jun 27 '20 at 02:12

Let $\mathbb{K} \in \{\mathbb{R},\mathbb{C}\}$ be either the field of real numbers $\mathbb{R}$ or the field of complex numbers $\mathbb{C}$. There are multiple ways of defining the Moore-Penrose pseudo-inverse. We choose the following.

Definition (Singular value decomposition). A singular value decomposition of a matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ is a matrix decomposition \begin{equation} \mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*, \end{equation} for two unitary matrices $\mathbf{U} \in \mathbb{K}^{m \times m}$ and $\mathbf{V} \in \mathbb{K}^{n \times n}$, and a rectangular diagonal matrix \begin{equation} \boldsymbol{\Sigma} = \mathbf{diag}(\sigma_1,\sigma_2,\ldots,\sigma_p), \qquad p = \min\{m,n\}, \end{equation} such that \begin{equation} \sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_p \geq 0. \end{equation} We call diagonal entries $(\sigma_i)_{i=1}^p$ of $\boldsymbol{\Sigma}$ the singular values, the columns $(\mathbf{u}_i)_{i=1}^m$ of $\mathbf{U}$ the left-singular vectors and the columns $(\mathbf{v}_i)_{i=1}^n$ of $\mathbf{V}$ the right-singular vectors of the singular value decomposition.

Remark. The singular value decomposition always exists, but is not unique.

Definition. (Moore-Penrose pseudo-inverse). For a matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ with $\mathrm{rank}(\mathbf{A}) = r$ and a singular value decomposition $\mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*$, the Moore-Penrose pseudo-inverse of $\mathbf{A}$ is the matrix \begin{equation} \mathbf{A}^\dagger := \mathbf{V} \boldsymbol{\Sigma}^\dagger \mathbf{U}^* \in \mathbb{K}^{n \times m}, \end{equation} where \begin{equation} \boldsymbol{\Sigma}^\dagger = \mathbf{diag}(1/\sigma_1,1/\sigma_2,\ldots,1/\sigma_r,0,\ldots,0) \in \mathbb{K}^{n \times m}. \end{equation} Note that the pseudo-inverse of a tall matrix is wide, and of a wide matrix is tall.
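This definition is straightforward to check numerically even in the rank-deficient case; a sketch in NumPy (the $4 \times 3$ rank-2 matrix below is a hypothetical example chosen so that its third column is the sum of the first two):

```python
import numpy as np

# Hypothetical rank-deficient example: 4x3 with column 3 = column 1 + column 2.
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [2., 1., 3.]])
r = np.linalg.matrix_rank(A)          # r = 2

U, S, Vt = np.linalg.svd(A)           # full SVD: U is 4x4, Vt is 3x3

# Sigma^dagger is n x m: reciprocals of the r nonzero singular values, rest 0.
Sigma_dag = np.zeros((3, 4))
Sigma_dag[:r, :r] = np.diag(1 / S[:r])

A_dag = Vt.T @ Sigma_dag @ U.T        # V Sigma^dagger U^*
assert np.allclose(A_dag, np.linalg.pinv(A))
```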

Proposition. For any tall matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ ($m \geq n$) with full column rank ($\mathrm{rank}(\mathbf{A}) = r = n$), the matrix $\mathbf{A}^*\mathbf{A}$ is invertible.

Proof. Firstly, notice that \begin{equation} \mathbf{x} \in \ker(\mathbf{A}) \implies \mathbf{A} \mathbf{x} = \mathbf{0} \implies \mathbf{A}^*\mathbf{A} \mathbf{x} = \mathbf{0} \implies \mathbf{x} \in \ker(\mathbf{A}^* \mathbf{A}), \end{equation} and that \begin{align*} \mathbf{x} \in \ker(\mathbf{A}^* \mathbf{A}) &\implies \mathbf{A}^* \mathbf{A} \mathbf{x} = \mathbf{0} \implies \mathbf{x}^* \mathbf{A}^* \mathbf{A} \mathbf{x} = 0 \implies \Vert \mathbf{A} \mathbf{x} \Vert_2^2 = 0 \\ &\implies \mathbf{A} \mathbf{x} = \mathbf{0} \implies \mathbf{x} \in \ker(\mathbf{A}). \end{align*} Hence, $\ker(\mathbf{A}) = \ker(\mathbf{A}^* \mathbf{A})$, and by the rank-nullity theorem, \begin{equation} \mathrm{rank}(\mathbf{A}) = n - \dim(\ker(\mathbf{A})) = n - \dim(\ker(\mathbf{A}^* \mathbf{A})) = \mathrm{rank}(\mathbf{A}^*\mathbf{A}). \end{equation} In particular, as $\mathbf{A}^* \mathbf{A} \in \mathbb{K}^{n \times n}$ is a square full-rank matrix, it is invertible. $\qquad \square$

Proposition. For a tall matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ ($m \geq n$) with full column rank ($\mathrm{rank}(\mathbf{A}) = r = n$), the Moore-Penrose pseudo-inverse can be represented as \begin{equation} \mathbf{A}^\dagger = (\mathbf{A}^* \mathbf{A})^{-1} \mathbf{A}^*. \end{equation}

Proof. Consider a tall matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ ($m \geq n$) with full column rank ($\mathrm{rank}(\mathbf{A}) = r = n$), and let $\mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*$ be any singular value decomposition of $\mathbf{A}$. Notice that the rank is preserved under the application of unitary matrices, so $\mathrm{rank}(\mathbf{A}) = \mathrm{rank}(\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*) = \mathrm{rank}(\boldsymbol{\Sigma})$. Hence, all singular values $(\sigma_i)_{i=1}^n$ on the diagonal of $\boldsymbol{\Sigma}$ are non-zero. Firstly, we show that the representation holds for $\mathbf{A} = \boldsymbol{\Sigma}$. We have \begin{equation} \boldsymbol{\Sigma}^* \boldsymbol{\Sigma} = \begin{bmatrix} \sigma_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n & \cdots & 0 \end{bmatrix} \begin{bmatrix} \sigma_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 0 \end{bmatrix} = \begin{bmatrix} \sigma_1^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n^2 \end{bmatrix}, \end{equation} so \begin{equation} (\boldsymbol{\Sigma}^* \boldsymbol{\Sigma}) \boldsymbol{\Sigma}^\dagger = \begin{bmatrix} \sigma_1^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n^2 \end{bmatrix} \begin{bmatrix} 1/\sigma_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & 1/\sigma_n & \cdots & 0 \end{bmatrix} = \begin{bmatrix} \sigma_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n & \cdots & 0 \end{bmatrix} = \boldsymbol{\Sigma}^*. \end{equation} Clearly, $\boldsymbol{\Sigma}^* \boldsymbol{\Sigma}$ is invertible, and \begin{equation} \boldsymbol{\Sigma}^\dagger = (\boldsymbol{\Sigma}^* \boldsymbol{\Sigma})^{-1}\boldsymbol{\Sigma}^*. 
\end{equation} Therefore, \begin{align*} (\mathbf{A}^* \mathbf{A}) \mathbf{A}^\dagger &= (\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*)^* (\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*) (\mathbf{V} \boldsymbol{\Sigma}^\dagger \mathbf{U}^*) \\ &= \mathbf{V} \boldsymbol{\Sigma}^* \underbrace{\mathbf{U}^* \mathbf{U}}_{=\mathbf{I}} \boldsymbol{\Sigma} \underbrace{\mathbf{V}^* \mathbf{V}}_{=\mathbf{I}} \boldsymbol{\Sigma}^\dagger \mathbf{U}^* \\ &= \mathbf{V} (\boldsymbol{\Sigma}^* \boldsymbol{\Sigma}) \boldsymbol{\Sigma}^\dagger \mathbf{U}^* \\ &= \mathbf{V} \boldsymbol{\Sigma}^* \mathbf{U}^* \\ &= (\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*)^* \\ &= \mathbf{A}^*. \end{align*} By the proposition above, $\mathbf{A}^* \mathbf{A}$ is also invertible, and we conclude that \begin{equation} \mathbf{A}^\dagger = (\mathbf{A}^* \mathbf{A})^{-1} \mathbf{A}^*. \qquad \square \end{equation}