10

For an $m \times n$ matrix $A$, the nuclear norm of $A$ is defined as $\sum_{i}\sigma_{i}(A)$, where $\sigma_{i}(A)$ is the $i^{th}$ singular value of $A$. I've read that the nuclear norm is convex on the set of $m \times n$ matrices. I don't see why this is true and can't find a proof online.

Mykie
  • 7,245

2 Answers

27

It is sufficient to prove that the nuclear norm is, in fact, a norm, since every norm is convex. It's trivial to verify that $\|A\|=0$ if and only if $A=0$, and that $\|tA\|=|t|\|A\|$ for any scalar $t$. The one non-trivial requirement is the triangle inequality; that is, $$\|A+B\|\leq \|A\|+\|B\|.$$ To get it, we're going to prove this: $$\sup_{\sigma_1(Q)\leq 1} \langle Q, A \rangle = \sup_{\sigma_1(Q)\leq 1} \mathop{\textrm{Tr}}(Q^HA) = \sum_i \sigma_i(A) = \|A\|.$$ Since $\sigma_1(\cdot)$ is itself a norm (the spectral norm), what we're actually proving here is that the nuclear norm is dual to the spectral norm.
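If you'd like a quick numerical sanity check of this duality claim before reading the proof (it is only an illustration, not an argument), you can sample random matrices $Q$ with spectral norm at most one and confirm that $\langle Q, A \rangle$ never exceeds the nuclear norm of $A$. The NumPy sketch below does exactly that; the matrix sizes and sample count are arbitrary choices of mine.

```python
import numpy as np

# Sanity check (not a proof): for random Q with spectral norm <= 1,
# the inner product <Q, A> = tr(Q^T A) should never exceed the nuclear norm of A.
rng = np.random.default_rng(0)
m, n = 5, 4                             # arbitrary sizes for illustration
A = rng.standard_normal((m, n))
nuc = np.linalg.norm(A, 'nuc')          # nuclear norm: sum of singular values

for _ in range(10_000):
    Q = rng.standard_normal((m, n))
    Q /= max(np.linalg.norm(Q, 2), 1.0)  # rescale so that sigma_1(Q) <= 1
    assert np.trace(Q.T @ A) <= nuc + 1e-9
print("no feasible Q exceeded the nuclear norm", nuc)
```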

Let $A=U\Sigma V^H=\sum_i \sigma_i u_i v_i^H$ be the singular value decomposition of $A$, and define $Q=UV^H=UIV^H$. Then $\sigma_1(Q)=1$ by construction, and $$\langle Q, A \rangle = \langle UV^H, U\Sigma V^H \rangle = \mathop{\textrm{Tr}}(VU^HU\Sigma V^H) = \mathop{\textrm{Tr}}(V^HVU^HU\Sigma) = \mathop{\textrm{Tr}}(\Sigma) = \sum_i \sigma_i.$$ (Note our use of the cyclic identity $\mathop{\textrm{Tr}}(ABC)=\mathop{\textrm{Tr}}(CAB)$, which holds whenever both products are well-defined.) So we have established that $\sup_{\sigma_1(Q)\leq 1} \langle Q, A \rangle \geq \sum_i \sigma_i(A)$. Now let's prove the other direction: $$\begin{aligned} \sup_{\sigma_1(Q)\leq 1} \langle Q, A \rangle &= \sup_{\sigma_1(Q)\leq 1} \mathop{\textrm{Tr}}(Q^HU\Sigma V^H) = \sup_{\sigma_1(Q)\leq 1} \mathop{\textrm{Tr}}(V^HQ^HU\Sigma) = \sup_{\sigma_1(Q)\leq 1} \langle U^HQV, \Sigma \rangle \\ &= \sup_{\sigma_1(Q)\leq 1} \sum_{i=1}^n \sigma_i (U^HQV)_{ii} = \sup_{\sigma_1(Q)\leq 1} \sum_{i=1}^n \sigma_i u_i^H Q v_i \leq \sup_{\sigma_1(Q)\leq 1} \sum_{i=1}^n \sigma_i \sigma_\max(Q) = \sum_{i=1}^n \sigma_i, \end{aligned}$$ where the inequality holds because $u_i$ and $v_i$ are unit vectors, so $u_i^H Q v_i \leq \sigma_\max(Q) \leq 1$. We have proven both the $\leq$ and $\geq$ directions, so equality is confirmed.
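The construction $Q=UV^H$ is also easy to verify numerically. Here is a short NumPy sketch (again only an illustration, with an arbitrary real test matrix) confirming that $\sigma_1(Q)=1$ and that $\langle Q, A\rangle$ equals the sum of the singular values of $A$.

```python
import numpy as np

# Check that Q = U V^H attains the supremum: <Q, A> equals the sum of singular values.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 4))                    # arbitrary real test matrix

U, s, Vh = np.linalg.svd(A, full_matrices=False)   # A = U @ diag(s) @ Vh
Q = U @ Vh                                         # the maximizer constructed above

print(np.linalg.norm(Q, 2))                        # sigma_1(Q): 1.0 up to rounding
print(np.trace(Q.T @ A), s.sum())                  # both equal the nuclear norm of A
```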

Why did we go through all of this trouble? To make proving the triangle inequality easy: $$\|A+B\|=\sup_{V:\sigma_\max(V)\leq 1} \langle V, A+B \rangle = \sup_{V:\sigma_\max(V)\leq 1} \big( \langle V, A \rangle + \langle V, B \rangle \big) \leq \sup_{V:\sigma_\max(V)\leq 1} \langle V, A \rangle + \sup_{V:\sigma_\max(V)\leq 1} \langle V, B\rangle = \|A\| + \|B\|.$$ The inequality in the middle holds because the supremum of a sum is never larger than the sum of the suprema taken separately.
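If you want to convince yourself empirically as well, here is a minimal NumPy spot-check of the triangle inequality for the nuclear norm on random matrices (the sizes, sample count, and tolerance are arbitrary choices, not anything canonical).

```python
import numpy as np

# Spot-check of the triangle inequality for the nuclear norm on random matrices.
rng = np.random.default_rng(2)
for _ in range(1_000):
    A = rng.standard_normal((5, 4))
    B = rng.standard_normal((5, 4))
    lhs = np.linalg.norm(A + B, 'nuc')
    rhs = np.linalg.norm(A, 'nuc') + np.linalg.norm(B, 'nuc')
    assert lhs <= rhs + 1e-9
print("triangle inequality held on all samples")
```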

Michael Grant
  • 20,110
  • Can you tell me, in which book I can find these proofs? – ponir Oct 07 '18 at 01:43
  • 1
    I could not. I came up with this on my own, without references. That is not to suggest that it is original to me, just that I couldn't tell you where I obtained the necessary knowledge. – Michael Grant Oct 07 '18 at 14:40
  • Thank you for the answer. In that case, could you please explain how you derived this: $\sup_{V:\sigma_\max(V)\leq 1} \langle V, A+B \rangle \leq \sup_{V:\sigma_\max(V)\leq 1} \langle V, A \rangle + \sup_{V:\sigma_\max(V)\leq 1} \langle V, B\rangle$ – ponir Oct 09 '18 at 02:04
  • @Michael Grant : In the first part, I understand that taking the $\sup$ has to be done to get the result, but $Q=UV^H$ is a unique matrix. Over what are you taking the $\sup$? –  Dec 25 '18 at 20:28
  • @Saeed you're right, $Q=UV^H$ is a single value from among all of the possible values of $Q$ satisfying $\sigma(Q)\leq 1$. By computing the value for this single point, it establishes a lower bound for the supremum. After all, it is universally true that $\sup_x f(x) \geq f(x_0)$ for any fixed $x_0$. – Michael Grant Dec 26 '18 at 21:12
3

Any norm is convex: if $0 \leq \theta \leq 1$, then by the triangle inequality and absolute homogeneity, $\| \theta x + (1 - \theta) y \| \leq \|\theta x \| + \| (1-\theta) y \| = \theta \|x\| + (1-\theta) \| y\|$. So once you know the nuclear norm is a norm, its convexity follows immediately.
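For what it's worth, here is a minimal NumPy spot-check of this convexity inequality for the nuclear norm on random matrices (the sizes, sample count, and tolerance are arbitrary choices for illustration only).

```python
import numpy as np

# Spot-check of the convexity inequality for the nuclear norm on random matrices.
rng = np.random.default_rng(3)
for _ in range(1_000):
    X = rng.standard_normal((5, 4))
    Y = rng.standard_normal((5, 4))
    theta = rng.uniform()
    lhs = np.linalg.norm(theta * X + (1 - theta) * Y, 'nuc')
    rhs = theta * np.linalg.norm(X, 'nuc') + (1 - theta) * np.linalg.norm(Y, 'nuc')
    assert lhs <= rhs + 1e-9
print("convexity inequality held on all samples")
```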

littleO
  • 54,048