
I'm trying to solve a homework question but I got stuck.

Let $A$ be an $m \times n$ matrix with the SVD $A = U \Sigma V^*$, and let $A^+ = (A^* A)^{-1} A^*$ be its pseudoinverse.

I'm trying to get $A^+ = V \Sigma^{-1} U^*$, but I'm missing something.

Can anyone help me with this please?

Thanks!

paulochf

4 Answers


$$ \begin{align} A^+ &= (A^*A)^{-1}A^* \\ &=(V\Sigma U^*U\Sigma V^*)^{-1} V\Sigma U^* \\ &=(V\Sigma^2 V^*)^{-1} V\Sigma U^* \\ &=(V^*)^{-1} \Sigma^{-2} V^{-1} V\Sigma U^* \\ &= V \Sigma^{-2}\Sigma U^* \\ &= V\Sigma^{-1}U^* \end{align} $$ using the properties of the matrices $U,V,\Sigma$ in the singular value decomposition (this takes the reduced SVD of a full-column-rank $A$, so that $\Sigma$ is square and invertible).
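If you want to sanity-check the final identity numerically, here is a short sketch using NumPy (the matrix $A$ below is an arbitrary full-column-rank example, not anything from the question):

```python
import numpy as np

# Arbitrary tall matrix with full column rank (m = 5, n = 3).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

# Reduced SVD: U is 5x3, S holds the 3 singular values, Vt is 3x3.
U, S, Vt = np.linalg.svd(A, full_matrices=False)

lhs = np.linalg.inv(A.T @ A) @ A.T    # (A* A)^{-1} A*
rhs = Vt.T @ np.diag(1 / S) @ U.T     # V Sigma^{-1} U*

assert np.allclose(lhs, rhs)
```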

Rasmus
  • Hi Rasmus! I could get to the 3rd line by myself; the 4th one was my point of doubt. I forgot to invert the $\left( \cdot \right)^{-1}$ sequence! Thanks for pointing that out! =) – paulochf Feb 02 '11 at 15:12
  • It could happen that $\Sigma^{-1}$ does not exist, if $\sigma_i=0$ for some $i$. – Herman Jaramillo Aug 06 '19 at 20:44
  • If $\Sigma$ is not square (and thus not invertible), the result still holds! Just show directly that $(V \Sigma' \Sigma V')^{-1} = V(\Sigma' \Sigma )^+V'$ and that $(\Sigma' \Sigma )^+ \Sigma' = \Sigma^+$. The result will follow by adjusting the derivation in the answer. I'm not super familiar with the pseudoinverse $+$ notation, but hopefully I'm using it correctly. – Smithey Jun 27 '20 at 02:10

First you need to assume that the matrix $A^*A$ is invertible, which requires $n \leq m$ and $\operatorname{rank}(A) = n$.

So when $n \leq m$ and $\operatorname{rank}(A) = n$, the reduced SVD of $A$ is $A = U \Sigma V^*$, where $U \in \mathbb{R}^{m \times n}$, $\Sigma \in \mathbb{R}^{n \times n}$ and $V \in \mathbb{R}^{n \times n}$ are such that $U^* U = I_{n \times n}$, $V^* V = I_{n \times n}$, $V V^* = I_{n \times n}$, and $\Sigma$ is a square diagonal matrix with only positive real entries.

Note that $V^{-1}=V^*$.

Also note that $A^* = V \Sigma^* U^* = V \Sigma U^*$ since $\Sigma^* = \Sigma$.

Further note that if $M_1, M_2$ and $M_3$ are invertible matrices, then $(M_1 M_2 M_3)^{-1} = M_3^{-1} M_2^{-1} M_1^{-1}$.

Use these to get the final answer.
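The hints above can be checked numerically; a sketch using NumPy (the matrices here are arbitrary examples, not from the problem):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hint 3: inversion reverses the order of the factors.
M1, M2, M3 = (rng.standard_normal((4, 4)) for _ in range(3))
lhs = np.linalg.inv(M1 @ M2 @ M3)
rhs = np.linalg.inv(M3) @ np.linalg.inv(M2) @ np.linalg.inv(M1)
assert np.allclose(lhs, rhs)

# Hints 1-2: in the reduced SVD of a tall full-rank matrix,
# U^* U = I and V is square unitary, so V^{-1} = V^*.
A = rng.standard_normal((6, 4))
U, S, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(U.T @ U, np.eye(4))
assert np.allclose(np.linalg.inv(Vt.T), Vt)
```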

jncraton
  • Given that the question is homework - which I realise only now - it would have probably been better to restrict to hints like you did. Sorry. – Rasmus Feb 01 '11 at 22:37
  • Hey Sivaram, I was in a hurry when I posted, so it was a (bad one) typo. Thanks for fixing the title! – paulochf Feb 02 '11 at 15:05
  • I knew all these hints but I missed your "further note" and did not invert the sequence inside those parentheses. Thanks for your patience! – paulochf Feb 02 '11 at 15:14

If the dimensions of $A$ are $m \times n$ with $m \neq n$, then there is no way to derive $A^+ = V\Sigma^{-1}U^*$, because $\Sigma$ has the same dimensions as $A$ and is therefore not invertible. Any source on the SVD writes the full decomposition as $A = U_{m \times m} \Sigma_{m \times n} V^T_{n \times n}$. If $A$ is rectangular, the derivation you're looking for is instead $$\begin{align} A^+ &=(A^TA)^{-1}A^T\\ &=(V\Sigma^TU^T U\Sigma V^T)^{-1}V\Sigma^TU^T\\ &=(V\Sigma^T\Sigma V^T)^{-1}V\Sigma^TU^T \\ &=(V^T)^{-1}(\Sigma^T \Sigma)^{-1}V^{-1}V\Sigma^TU^T \\ &=V(\Sigma^T \Sigma)^{-1}\Sigma^TU^T \end{align}$$
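The rectangular-case formula can be verified against NumPy's built-in pseudoinverse; a sketch (the example matrix is arbitrary, assumed to have full column rank):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))   # rectangular, m != n, full column rank

# Full SVD: U is 5x5, Vt is 3x3, Sigma is 5x3 with S on its diagonal.
U, S, Vt = np.linalg.svd(A, full_matrices=True)
Sigma = np.zeros((5, 3))
Sigma[:3, :3] = np.diag(S)

# V (Sigma^T Sigma)^{-1} Sigma^T U^T from the derivation above.
pinv = Vt.T @ np.linalg.inv(Sigma.T @ Sigma) @ Sigma.T @ U.T
assert np.allclose(pinv, np.linalg.pinv(A))
```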

  • The result still holds for non-square matrices (unless I made a mistake); see my comment above. – Smithey Jun 27 '20 at 02:12

Let $\mathbb{K} \in \{\mathbb{R},\mathbb{C}\}$ be either the field of real numbers $\mathbb{R}$ or the field of complex numbers $\mathbb{C}$. There are multiple ways of defining the Moore-Penrose pseudo-inverse. We choose the following.

Definition (Singular value decomposition). A singular value decomposition of a matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ is a matrix decomposition \begin{equation} \mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*, \end{equation} for two unitary matrices $\mathbf{U} \in \mathbb{K}^{m \times m}$ and $\mathbf{V} \in \mathbb{K}^{n \times n}$, and a rectangular diagonal matrix \begin{equation} \boldsymbol{\Sigma} = \mathbf{diag}(\sigma_1,\sigma_2,\ldots,\sigma_p), \qquad p = \min\{m,n\}, \end{equation} such that \begin{equation} \sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_p \geq 0. \end{equation} We call diagonal entries $(\sigma_i)_{i=1}^p$ of $\boldsymbol{\Sigma}$ the singular values, the columns $(\mathbf{u}_i)_{i=1}^m$ of $\mathbf{U}$ the left-singular vectors and the columns $(\mathbf{v}_i)_{i=1}^n$ of $\mathbf{V}$ the right-singular vectors of the singular value decomposition.

Remark. The singular value decomposition always exists, but is not unique.

Definition. (Moore-Penrose pseudo-inverse). For a matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ with $\mathrm{rank}(\mathbf{A}) = r$ and a singular value decomposition $\mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*$, the Moore-Penrose pseudo-inverse of $\mathbf{A}$ is the matrix \begin{equation} \mathbf{A}^\dagger := \mathbf{V} \boldsymbol{\Sigma}^\dagger \mathbf{U}^* \in \mathbb{K}^{n \times m}, \end{equation} where \begin{equation} \boldsymbol{\Sigma}^\dagger = \mathbf{diag}(1/\sigma_1,1/\sigma_2,\ldots,1/\sigma_r,0,\ldots,0) \in \mathbb{K}^{n \times m}. \end{equation} Note that the pseudo-inverse of a tall matrix is wide, and of a wide matrix is tall.
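This definition is straightforward to check numerically even in the rank-deficient case; a sketch in NumPy (the $4 \times 3$ rank-2 matrix below is a hypothetical example chosen so that its third column is the sum of the first two):

```python
import numpy as np

# Hypothetical rank-deficient example: 4x3 with column 3 = column 1 + column 2.
A = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [1., 1., 2.],
              [2., 1., 3.]])
r = np.linalg.matrix_rank(A)          # r = 2

U, S, Vt = np.linalg.svd(A)           # full SVD: U is 4x4, Vt is 3x3

# Sigma^dagger is n x m: reciprocals of the r nonzero singular values, rest 0.
Sigma_dag = np.zeros((3, 4))
Sigma_dag[:r, :r] = np.diag(1 / S[:r])

A_dag = Vt.T @ Sigma_dag @ U.T        # V Sigma^dagger U^*
assert np.allclose(A_dag, np.linalg.pinv(A))
```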

Proposition. For any tall matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ ($m \geq n$) with full column rank ($\mathrm{rank}(\mathbf{A}) = r = n$), the matrix $\mathbf{A}^*\mathbf{A}$ is invertible.

Proof. Firstly, notice that \begin{equation} \mathbf{x} \in \ker(\mathbf{A}) \implies \mathbf{A} \mathbf{x} = \mathbf{0} \implies \mathbf{A}^*\mathbf{A} \mathbf{x} = \mathbf{0} \implies \mathbf{x} \in \ker(\mathbf{A}^* \mathbf{A}), \end{equation} and that \begin{align*} \mathbf{x} \in \ker(\mathbf{A}^* \mathbf{A}) &\implies \mathbf{A}^* \mathbf{A} \mathbf{x} = \mathbf{0} \implies \mathbf{x}^* \mathbf{A}^* \mathbf{A} \mathbf{x} = 0 \implies \Vert \mathbf{A} \mathbf{x} \Vert_2^2 = 0 \\ &\implies \mathbf{A} \mathbf{x} = \mathbf{0} \implies \mathbf{x} \in \ker(\mathbf{A}). \end{align*} Hence, $\ker(\mathbf{A}) = \ker(\mathbf{A}^* \mathbf{A})$, and by the rank-nullity theorem, \begin{equation} \mathrm{rank}(\mathbf{A}) = n - \dim(\ker(\mathbf{A})) = n - \dim(\ker(\mathbf{A}^* \mathbf{A})) = \mathrm{rank}(\mathbf{A}^*\mathbf{A}). \end{equation} In particular, as $\mathbf{A}^* \mathbf{A} \in \mathbb{K}^{n \times n}$ is a square full-rank matrix, it is invertible. $\qquad \square$

Proposition. For a tall matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ ($m \geq n$) with full column rank ($\mathrm{rank}(\mathbf{A}) = r = n$), the Moore-Penrose pseudo-inverse can be represented as \begin{equation} \mathbf{A}^\dagger = (\mathbf{A}^* \mathbf{A})^{-1} \mathbf{A}^*. \end{equation}

Proof. Consider a tall matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ ($m \geq n$) with full column rank ($\mathrm{rank}(\mathbf{A}) = r = n$), and let $\mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*$ be any singular value decomposition of $\mathbf{A}$. Notice that the rank is preserved under the application of unitary matrices, so $\mathrm{rank}(\mathbf{A}) = \mathrm{rank}(\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*) = \mathrm{rank}(\boldsymbol{\Sigma})$. Hence, all singular values $(\sigma_i)_{i=1}^n$ on the diagonal of $\boldsymbol{\Sigma}$ are non-zero. Firstly, we show that the representation holds for $\mathbf{A} = \boldsymbol{\Sigma}$. We have \begin{equation} \boldsymbol{\Sigma}^* \boldsymbol{\Sigma} = \begin{bmatrix} \sigma_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n & \cdots & 0 \end{bmatrix} \begin{bmatrix} \sigma_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 0 \end{bmatrix} = \begin{bmatrix} \sigma_1^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n^2 \end{bmatrix}, \end{equation} so \begin{equation} (\boldsymbol{\Sigma}^* \boldsymbol{\Sigma}) \boldsymbol{\Sigma}^\dagger = \begin{bmatrix} \sigma_1^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n^2 \end{bmatrix} \begin{bmatrix} 1/\sigma_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & 1/\sigma_n & \cdots & 0 \end{bmatrix} = \begin{bmatrix} \sigma_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n & \cdots & 0 \end{bmatrix} = \boldsymbol{\Sigma}^*. \end{equation} Clearly, $\boldsymbol{\Sigma}^* \boldsymbol{\Sigma}$ is invertible, and \begin{equation} \boldsymbol{\Sigma}^\dagger = (\boldsymbol{\Sigma}^* \boldsymbol{\Sigma})^{-1}\boldsymbol{\Sigma}^*. 
\end{equation} Therefore, \begin{align*} (\mathbf{A}^* \mathbf{A}) \mathbf{A}^\dagger &= (\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*)^* (\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*) (\mathbf{V} \boldsymbol{\Sigma}^\dagger \mathbf{U}^*) \\ &= \mathbf{V} \boldsymbol{\Sigma}^* \underbrace{\mathbf{U}^* \mathbf{U}}_{=\mathbf{I}} \boldsymbol{\Sigma} \underbrace{\mathbf{V}^* \mathbf{V}}_{=\mathbf{I}} \boldsymbol{\Sigma}^\dagger \mathbf{U}^* \\ &= \mathbf{V} (\boldsymbol{\Sigma}^* \boldsymbol{\Sigma}) \boldsymbol{\Sigma}^\dagger \mathbf{U}^* \\ &= \mathbf{V} \boldsymbol{\Sigma}^* \mathbf{U}^* \\ &= (\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*)^* \\ &= \mathbf{A}^*. \end{align*} By the proposition above, $\mathbf{A}^* \mathbf{A}$ is also invertible, and we conclude that \begin{equation} \mathbf{A}^\dagger = (\mathbf{A}^* \mathbf{A})^{-1} \mathbf{A}^*. \qquad \square \end{equation}