
Let $D$ be a non-negative diagonal matrix with non-increasing diagonal entries, i.e. $D_{11}\geq D_{22}\geq\dots\geq 0$, and let $X$ be an arbitrary square matrix with singular value decomposition $X=U\Sigma V^{T}$, where $U,V$ are unitary matrices and $\Sigma$ also has non-increasing diagonal entries, i.e. $\Sigma_{11}\geq \Sigma_{22} \geq\dots\geq 0$. Prove that $tr(DX)\leq tr(D\Sigma)$.

Could someone give me some hints on how to prove that?

Thank you!

cbyh

1 Answer


A hint would be to write everything down: write $U = (u_{i, j})$, $V = (v_{i, j})$, then write down $tr(DX)$ and $tr(D\Sigma)$.

Try to compare the two, and use the fact that the columns/rows of $U$ and $V$ form orthonormal bases.


Detailed solution:

I write the decomposition of $X$ as $U\Sigma V^*$, where $V^*$ denotes the conjugate transpose of $V$. I also write the matrices $D$ and $\Sigma$ as $D = diag(d_1, \dotsc, d_n)$ and $\Sigma = diag(\sigma_1, \dotsc, \sigma_n)$.

We first prove the case $U = V$, that is:

We have $tr(DU\Sigma U^*) \leq tr(D\Sigma)$ for any unitary matrix $U$.

Proof: Writing $U = (u_{i, j})$, the inequality we want to prove is: $$\sum_{i, j = 1}^n d_i \sigma_j |u_{i, j}|^2 \leq \sum_{i = 1}^n d_i \sigma_i.$$

Since $U$ is unitary, we have $\sum_{i = 1}^n |u_{i, j}|^2 = 1$ for all $j$, and $\sum_{j = 1}^n |u_{i, j}|^2 = 1$ for all $i$. The result follows from the

Lemma: Let $(a_i)_{1 \leq i \leq n}, (b_i)_{1 \leq i \leq n}$ be two non-increasing sequences of non-negative real numbers, and let $(c_{i, j})_{1 \leq i, j \leq n}$ be a matrix of non-negative real numbers such that $\sum_{i = 1}^n c_{i, j} = 1$ for all $j$ and $\sum_{j = 1}^n c_{i, j} = 1$ for all $i$. Then we have $\sum_{i, j = 1}^n a_i b_j c_{i, j} \leq \sum_{i = 1}^n a_i b_i$.

The proof of the lemma is combinatorial and will be given at the end.
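As a quick sanity check (not a proof), one can test the lemma numerically. A matrix with non-negative entries whose rows and columns all sum to $1$ is exactly a doubly stochastic matrix, and one easy way to generate such a matrix is as a convex combination of permutation matrices; here is a minimal NumPy sketch along these lines.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# a doubly stochastic matrix as a convex combination of permutation matrices
perms = [np.eye(n)[rng.permutation(n)] for _ in range(8)]
w = rng.random(len(perms))
w /= w.sum()
c = sum(wi * p for wi, p in zip(w, perms))

a = np.sort(rng.random(n))[::-1]   # non-increasing, non-negative a_i
b = np.sort(rng.random(n))[::-1]   # non-increasing, non-negative b_i

lhs = a @ c @ b                    # sum_{i,j} a_i b_j c_{i,j}
rhs = a @ b                        # sum_i a_i b_i
assert lhs <= rhs + 1e-12
```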


Now we may deduce the general case. For two matrices $A, B \in M_n(\mathbb C)$, define $$\langle A, B\rangle = tr(DA\Sigma B^*) = \sum_{i, j = 1}^n d_i \sigma_j a_{i, j} \overline{b_{i, j}}.$$ It is clear that $\langle \cdot, \cdot \rangle$ is a positive semi-definite hermitian form on the $\mathbb C$-vector space $M_n(\mathbb C)$ (it is positive definite when all the $d_i$ and $\sigma_j$ are strictly positive). In particular, we have the Cauchy–Schwarz inequality: $|\langle A, B\rangle|^2 \leq \langle A, A\rangle \cdot \langle B, B\rangle$.

Putting $A = U$ and $B = V$, we get: $$|tr(D U \Sigma V^*)|^2 \leq tr(D U \Sigma U^*) \cdot tr(D V \Sigma V^*) \leq tr(D \Sigma)^2,$$ where the second inequality is the special case proved above, applied once to $U$ and once to $V$. Since $tr(D\Sigma) \geq 0$, this gives $tr(DX) \leq |tr(DX)| \leq tr(D\Sigma)$, whence the result.
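(A minimal NumPy check of the final inequality on a random real matrix; the only assumption is that `numpy.linalg.svd` returns the singular values in non-increasing order, which it does.)

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

X = rng.standard_normal((n, n))
U, s, Vt = np.linalg.svd(X)          # X = U @ diag(s) @ Vt, s non-increasing
d = np.sort(rng.random(n))[::-1]     # non-increasing, non-negative diagonal of D
D = np.diag(d)

# tr(D X) <= tr(D Sigma) = sum_i d_i * sigma_i
assert np.trace(D @ X) <= d @ s + 1e-9
```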


It only remains to prove the lemma. The main ingredient is the rearrangement inequality, which says that, if $I \subset \{1, \dotsc, n\}$ is any (non-empty) subset and $\tau: I\rightarrow I$ is any permutation of $I$, then we have $\sum_{i \in I} a_i b_{\tau(i)} \leq \sum_{i \in I} a_i b_i$.
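(A tiny numerical illustration of this form of the rearrangement inequality, with a subset $I$ and a permutation $\tau$ of $I$ chosen by hand; indices are 0-based here.)

```python
import numpy as np

rng = np.random.default_rng(1)
a = np.sort(rng.random(6))[::-1]   # non-increasing a_i
b = np.sort(rng.random(6))[::-1]   # non-increasing b_i

I = [1, 3, 4]                      # a subset of {0, ..., 5}
tau = {1: 4, 3: 1, 4: 3}           # a permutation of I

lhs = sum(a[i] * b[tau[i]] for i in I)   # sum_{i in I} a_i b_{tau(i)}
rhs = sum(a[i] * b[i] for i in I)        # sum_{i in I} a_i b_i
assert lhs <= rhs + 1e-12
```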

We prove the lemma by doing adjustments on the matrix $(c_{i, j})$. When there are pairwise distinct indices $i_1, i_2, \dotsc, i_k$ such that all the numbers $c_{i_1, i_2}, c_{i_2, i_3}, \dotsc, c_{i_k, i_1}$ are strictly positive, we do the following adjustment: subtract the minimum of these numbers from each of them, and add that minimum to each of the numbers $c_{i_1, i_1}, \dotsc, c_{i_k, i_k}$.

After this adjustment, the matrix still has all the properties stated in the lemma (the entries remain non-negative and the row and column sums are unchanged), while the corresponding value $\sum_{i, j = 1}^n a_i b_j c_{i, j}$ does not decrease, by the rearrangement inequality applied to $I = \{i_1, \dotsc, i_k\}$ and the cyclic permutation $\tau$.

I claim that: after finitely many steps of adjustment, we must arrive at a matrix which does not allow any valid adjustment, and that final matrix must be the identity matrix (i.e. $c_{i, i} = 1$ and $c_{i, j} = 0$ for $i \neq j$). This will finish the proof of the lemma.

Why is this true? We look at the number of non-zero entries $c_{i, j}$ with $i \neq j$. Each step of adjustment decreases that number by at least one (the cycle entries equal to the minimum become zero, and only diagonal entries are ever increased), hence the procedure must terminate after at most $n^2 - n$ steps.

To show that the final matrix is the identity matrix, we define a directed graph $G$ whose vertices are $\{1, \dotsc, n\}$, and there is a directed edge from $i$ to $j$, for $i \neq j$, if and only if $c_{i, j}$ is non-zero in the final matrix.

The graph $G$ then has the following two properties:

  1. There is no directed cycle in $G$. Otherwise, the existence of a cycle means there are $i_1, \dotsc, i_k$ such that all the numbers $c_{i_1, i_2}, \dotsc, c_{i_k, i_1}$ are non-zero, which then allows another step of adjustment.
  2. If a vertex has no outgoing edge, then it also has no incoming edge (and vice versa, by exchanging the roles of rows and columns). Indeed, if there is no edge from a given $i$ to any $j \neq i$, then $c_{i, j} = 0$ for all $j \neq i$, which implies $c_{i, i} = 1$; since the $i$-th column also sums to $1$, all the $c_{j, i}$ with $j \neq i$ must be zero as well.

Finally, it's a simple graph-theoretic fact that any finite directed graph $G$ satisfying the above two properties has no edges at all. Reason: suppose $G$ has an edge $(i_0, i_1)$; then by the second property, there must be an edge $(i_1, i_2)$ (otherwise $i_1$ would have an incoming edge but no outgoing edge), then an edge $(i_2, i_3)$, etc. Since $G$ is finite, the sequence $i_0, i_1, \dotsc$ must eventually repeat an element, creating a directed cycle, which contradicts the first property.
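To illustrate the whole adjustment procedure, here is a small NumPy sketch (the helper names `find_cycle` and `adjust` are mine, for illustration only): it starts from a random doubly stochastic matrix, repeatedly locates a cycle of strictly positive off-diagonal entries, performs the adjustment, checks that the value $\sum_{i, j} a_i b_j c_{i, j}$ never decreases, and confirms that the final matrix is the identity.

```python
import numpy as np

def find_cycle(c, tol=1e-12):
    """Return indices i_1, ..., i_k with c[i_1, i_2], ..., c[i_k, i_1] all > tol
    and the i_t pairwise distinct, or None if no such cycle exists."""
    n = c.shape[0]
    for start in range(n):
        path, seen = [start], {start}
        while True:
            i = path[-1]
            # follow any off-diagonal edge i -> j with c[i, j] > tol
            j = next((j for j in range(n) if j != i and c[i, j] > tol), None)
            if j is None:
                break
            if j in seen:
                return path[path.index(j):]   # closed a cycle
            path.append(j)
            seen.add(j)
    return None

def adjust(c, cycle):
    """One adjustment step: subtract the minimum entry along the cycle from
    each cycle entry and add it to the corresponding diagonal entries."""
    k = len(cycle)
    m = min(c[cycle[t], cycle[(t + 1) % k]] for t in range(k))
    for t in range(k):
        c[cycle[t], cycle[(t + 1) % k]] -= m
        c[cycle[t], cycle[t]] += m
    return c

rng = np.random.default_rng(1)
n = 5

# start from a random doubly stochastic matrix
perms = [np.eye(n)[rng.permutation(n)] for _ in range(8)]
w = rng.random(len(perms))
w /= w.sum()
c = sum(wi * p for wi, p in zip(w, perms))

a = np.sort(rng.random(n))[::-1]
b = np.sort(rng.random(n))[::-1]
value = lambda c: float(a @ c @ b)    # sum_{i,j} a_i b_j c_{i,j}

v = value(c)
while (cyc := find_cycle(c)) is not None:
    c = adjust(c, cyc)
    assert value(c) >= v - 1e-12      # the value never decreases
    v = value(c)

print(np.allclose(c, np.eye(n)))      # final matrix is the identity
print(value(c), float(a @ b))         # and the value equals sum_i a_i b_i
```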

WhatsUp