10

I am confused about the uniqueness of the Moore-Penrose inverse constructed from the SVD. For any matrix $A$, if $X$ satisfies $$AXA=A, \quad XAX=X, \quad (AX)^\mathrm{T}=AX, \quad (XA)^\mathrm{T}=XA$$ then $X$ is called a Moore-Penrose inverse of $A$.

If $A$ has the SVD (singular value decomposition) $$A=P\left[\begin{matrix}\Lambda_r&0\\0&0\end{matrix}\right]Q^\mathrm{T}$$

then it is easy to prove that $$A^+ = Q\left[\begin{matrix}\Lambda_r^{-1}&0\\0&0\end{matrix}\right]P^\mathrm{T}$$ is a Moore-Penrose inverse of $A$.
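As a numerical sanity check of this claim, here is a minimal sketch in Python with NumPy. The test matrix, the rank threshold `1e-10`, and the helper name `penrose_ok` are illustrative assumptions, not part of the question. It builds $A^+$ from one SVD and tests the four defining conditions.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 4x3 matrix of rank 2 (product of Gaussian 4x2 and 2x3 factors).
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))

# Full SVD: A = P @ Sigma @ Qt, with P 4x4 and Qt 3x3.
P, s, Qt = np.linalg.svd(A)

# Numerical rank; the threshold 1e-10 is an assumption for this example.
r = int(np.sum(s > 1e-10))

# Invert the nonzero singular values, keep the zero blocks, and transpose the shape.
Sigma_plus = np.zeros((A.shape[1], A.shape[0]))
Sigma_plus[:r, :r] = np.diag(1.0 / s[:r])

# A^+ = Q [Lambda_r^{-1} 0; 0 0] P^T
X = Qt.T @ Sigma_plus @ P.T


def penrose_ok(A, X, tol=1e-10):
    """Return True if X satisfies the four Moore-Penrose conditions for A."""
    return (
        np.allclose(A @ X @ A, A, atol=tol)
        and np.allclose(X @ A @ X, X, atol=tol)
        and np.allclose((A @ X).T, A @ X, atol=tol)
        and np.allclose((X @ A).T, X @ A, atol=tol)
    )


print(penrose_ok(A, X))
```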

If $X$ and $Y$ are both Moore-Penrose inverses of $A$, then from the chain of equalities $$X=XAX=X(AX)^\mathrm{T}=XX^\mathrm{T}A^\mathrm{T}=XX^\mathrm{T}(AYA)^\mathrm{T}=X(AX)^\mathrm{T}(AY)^\mathrm{T}=(XAX)AY=XAY=(XA)^\mathrm{T}YAY=A^\mathrm{T}X^\mathrm{T}A^\mathrm{T}Y^\mathrm{T}Y=A^\mathrm{T}Y^\mathrm{T}Y=(YA)^\mathrm{T}Y=YAY=Y$$ we can see that the Moore-Penrose inverse is unique.

However, the formula for $A^+$ above depends on the SVD, and the SVD is not unique. How can this be reconciled with the uniqueness of the Moore-Penrose inverse?

bregg
  • 2
    A nice conceptual definition of the pseudoinverse of a matrix (or operator) $A$ is that it is the linear transformation $L$ that takes a vector $b$ as input and returns as output the vector $x$ of least norm which satisfies $Ax = \hat b$, where $\hat b$ is the projection of $b$ onto the range of $A$. This linear transformation $L$ is perfectly well defined. If $A$ is a matrix, then the term "pseudoinverse" usually refers to the matrix representation of $L$ (with respect to the standard bases), rather than $L$ itself. Clearly the matrix representation of $L$ is unique. – littleO Dec 31 '18 at 12:14
  • 1
    @littleO I can understand why the pseudoinverse is unique. But I don’t know how to explain the uniqueness when the inverse is constructed from the SVD, since the SVD is not unique. – bregg Dec 31 '18 at 12:28
  • 3
    Just because $Q,\Lambda_r,P$ are not unique does not imply that the product $A^+ = Q\left[\begin{matrix}\Lambda_r^{-1}&0\\0&0\end{matrix}\right]P^\mathrm{T}$ is not unique – user619894 Dec 31 '18 at 14:04
  • The reason is similar to the fact that $1=2\cdot2^{-1}=3\cdot3^{-1}$. By the way, there are other methods for computing the pseudoinverse. If $A=BC$ is a full-rank decomposition, with $B$ left invertible and $C$ right invertible, then $A^+=C^T(CC^T)^{-1}(BB^T)^{-1}B^T$. – egreg Dec 31 '18 at 15:38
  • @egreg if $B$ is left but not right-invertible, then $BB^T$ fails to be invertible. Something is wrong with your formula – Ben Grossmann Dec 31 '18 at 19:18
  • @Omnomnomnom Sorry, typo for $B^TB$ – egreg Dec 31 '18 at 19:28
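The full-rank factorization formula mentioned in the comments, with the corrected factor $(B^TB)^{-1}$, can be checked numerically. This is only a sketch: the random factors `B` and `C` and their sizes are assumptions chosen for illustration, and they have full column rank and full row rank respectively only with probability one.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed example factors: B is 5x2 (left invertible), C is 2x4 (right invertible).
B = rng.standard_normal((5, 2))
C = rng.standard_normal((2, 4))
A = B @ C  # rank-2 matrix with full-rank factorization A = BC

# A^+ = C^T (C C^T)^{-1} (B^T B)^{-1} B^T
A_plus = C.T @ np.linalg.inv(C @ C.T) @ np.linalg.inv(B.T @ B) @ B.T

# Check the four Moore-Penrose conditions.
checks = [
    np.allclose(A @ A_plus @ A, A),
    np.allclose(A_plus @ A @ A_plus, A_plus),
    np.allclose((A @ A_plus).T, A @ A_plus),
    np.allclose((A_plus @ A).T, A_plus @ A),
]
print(all(checks))
```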

1 Answer

5

The non-uniqueness of the SVD can be characterized as follows: suppose that $A$ is $n \times n$ and that $A = P_0 \Sigma Q_0^T$ is one SVD of $A$. Moreover, suppose that the distinct singular values of $A$ are $s_1$ with multiplicity $k_1$, $s_2$ with multiplicity $k_2$, and so forth, with $s_m = 0$ having multiplicity $k_m = n - r$. That is, we have $$ \Lambda_r = \pmatrix{s_1 I_{k_1} \\ & \ddots \\ && s_{m-1} I_{k_{m-1}}}, \quad \Sigma = \pmatrix{\Lambda_r \\ & 0_{k_m}} $$ Then $A = P\Sigma Q^T$ will be a singular value decomposition of $A$ if and only if there exists an orthogonal matrix $U$ such that $P = P_0 U$, $Q = Q_0U$, and $U$ is a block-diagonal matrix of the form $$ U = \pmatrix{U^{(1)}\\ & \ddots \\ && U^{(m)}} $$ where $U^{(j)}$ is orthogonal and of size $k_j \times k_j$. (Strictly speaking, the last block $U^{(m)}$, which corresponds to the zero singular value, may be chosen independently for $P$ and for $Q$; since that block only ever multiplies the zero block of $\Sigma$, this does not affect the argument.)
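To make the characterization concrete, the following sketch builds a second SVD from a first one. The particular singular values, the rotation angle, and the helper `random_orthogonal` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)


def random_orthogonal(n, rng):
    """Orthogonal factor from the QR decomposition of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


# Singular values: 3 (multiplicity 2), 1 (multiplicity 1), 0 (multiplicity 1).
Sigma = np.diag([3.0, 3.0, 1.0, 0.0])
P0 = random_orthogonal(4, rng)
Q0 = random_orthogonal(4, rng)
A = P0 @ Sigma @ Q0.T

# Block-diagonal orthogonal U with blocks of sizes 2, 1, 1.
theta = 0.7
U = np.zeros((4, 4))
U[:2, :2] = [[np.cos(theta), -np.sin(theta)],
             [np.sin(theta), np.cos(theta)]]
U[2, 2] = -1.0
U[3, 3] = 1.0

# A different pair of orthogonal factors for the same Sigma.
P = P0 @ U
Q = Q0 @ U

print(np.allclose(P @ Sigma @ Q.T, A))  # same matrix A
print(np.allclose(P, P0))               # but different factors
```

The factors change while the product does not, because each block of `U` acts on a subspace where `Sigma` is a scalar multiple of the identity.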


With that in mind: if you'd like to prove that the pseudoinverse as constructed from the SVD is well-defined (that is, uniquely defined regardless of one's choice of SVD), then it suffices to show that for any choice of $U$ of the form prescribed above, we have $$ [Q_0U] \pmatrix{\Lambda_r^{-1} \\ & 0} [P_0U]^T = Q_0 \pmatrix{\Lambda_r^{-1} \\ & 0} P_0^T $$ It is straightforward (but in my opinion tedious) to show that this holds if we use the block structure of $\Lambda_r^{-1}$ and block-matrix multiplication: each diagonal block of $\pmatrix{\Lambda_r^{-1} \\ & 0}$ is a scalar multiple of an identity matrix, so this matrix commutes with $U$, and the identity then follows from $UU^T = I$.
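The identity can also be checked numerically. The sketch below is an illustration under assumed data: the singular values, the block sizes 3, 1, 2, and the helper `random_orthogonal` are made up for the example. It compares the pseudoinverses built from two different SVDs of the same matrix.

```python
import numpy as np

rng = np.random.default_rng(2)


def random_orthogonal(n):
    """Orthogonal factor from the QR decomposition of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


# Singular values 2 (x3), 0.5 (x1), 0 (x2), and the matching Sigma^+.
Sigma = np.diag([2.0, 2.0, 2.0, 0.5, 0.0, 0.0])
Sigma_plus = np.diag([0.5, 0.5, 0.5, 2.0, 0.0, 0.0])

P0, Q0 = random_orthogonal(6), random_orthogonal(6)

# Block-diagonal orthogonal U with blocks of sizes 3, 1, 2.
U = np.zeros((6, 6))
U[:3, :3] = random_orthogonal(3)
U[3, 3] = -1.0
U[4:, 4:] = random_orthogonal(2)

P, Q = P0 @ U, Q0 @ U

# Both factorizations give the same matrix ...
same_matrix = np.allclose(P @ Sigma @ Q.T, P0 @ Sigma @ Q0.T)

# ... and the same pseudoinverse.
pinv0 = Q0 @ Sigma_plus @ P0.T
pinv1 = Q @ Sigma_plus @ P.T
same_pinv = np.allclose(pinv0, pinv1)

print(same_matrix, same_pinv)
```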

Ben Grossmann
  • 1
    If $k_m > 0$, then setting $U^{(i)}$ to the identity for all $i \in [1,m-1]$, $P=P_0U$ and $Q = Q_0$ (i.e. $Q$ is the same as $Q_0$) gives another SVD of $A$ (because $U\Sigma = \Sigma$). This does not change the pseudoinverse because $\Sigma^\dagger U^T = \Sigma^\dagger$. – Dhruv Kohli Dec 31 '22 at 15:45