Proof of Gersgorin Discs Theorem

Question

I have a question about the proof of Gersgorin Theorem from the book Matrix Analysis by Horn & Johnson.

The Theorem states that for any $A\in \mathbb{C}^{n \times n}$

1) all eigenvalues are contained in the union of the following $n$ disks: $$ \bigcup_{i=1}^n \{ z \in \mathbb{C} \mid |z-a_{ii}| \leq r_i(A) \} $$ where $r_i(A)=\sum_{j=1, j\neq i}^n |a_{ij}|$.

2) If one has $k$ connected discs, which are disjoint to the remaining $n-k$ disks, then this area holds exactly $k$ eigenvalues.

The proof of 1) is clear to me. But I struggle with proof of 2).

The proof of 2) goes as follows:

Consider $A_\varepsilon = D +\varepsilon B$ where $D = \mathrm{diag}(a_{11},\dots,a_{nn})$ and $A_1 = A$. Suppose the first $k$ discs $$ \bigcup_{i=1}^k \{ z \in \mathbb{C} \mid |z-a_{ii}| \leq r_i(A) \} $$ are connected. In the book the following is now said:

For each $i=1\dots,k$, consider the eigenvalues $\lambda_i(A_0)=a_{ii}$ and $\lambda_i(A_\varepsilon)$.

Here is my first question: How are the eigenvalues $\lambda_i(A_\varepsilon)$ characterized? Since they could be complex I do not see a natural order. I guess he means the eigenvalue which is closest to $a_{ii}$ but I don't see that this is proper definition, because there could be distinct eigenvalues which have the same distance to $a_{ii}$.

Even worse, these eigenvalues have suddenly the following property

$$\lambda_i(A_\varepsilon) \in G_k(\varepsilon) = \bigcup_{i=1}^k \{ z \in \mathbb{C} \mid |z-a_{ii}| \leq r_i(A) \epsilon \} .$$

Why is that true (Obviously I can't see that since I did not even understood the choice of $\lambda_i(A_\varepsilon)$).

Thanks for any help!

score 3 · Accepted Answer · edited Apr 13 '17 at 12:20

Horn and Johnson’s proof of the second assertion is correct, but perhaps parts of it could have been structured a little more clearly.

Let $A_\epsilon = D+\epsilon B$ as you and Horn and Johnson defined it. Each $\lambda _i \left( {A_\varepsilon} \right)$ represents a continuous function parameterized by the variable $\epsilon$, with initial value $\lambda _i \left( {A_0 } \right) = a_{ii}$ and terminal value $\lambda _i \left( A \right).$ We know that this function is continuous, since the eigenvalues of a matrix are continuous functions of matrix’s characteristic polynomial which is in turn a continuous function of the matrix entries. A rigorous expression for this result is given, for example, in the answer to the question

Eigenvalues are continuous?

So, $\lambda _i \left( {A_\varepsilon} \right)$ traces a continuous path from $\lambda _i \left( {A_0 } \right) = a_{ii}$ to $\lambda _i \left( {A_1 } \right) = \lambda _i \left( A \right)$ as $\epsilon$ varies from $0$ to $1$.

Horn and Johnson define $G_k \left( \varepsilon \right) \equiv \bigcup\limits_{i = 1}^k {\left\{ {z \in C:\left| {z - a_{ii} } \right| \le R_i ^\prime \left( {A_\varepsilon } \right)} \right\}}$ where $R_i ^\prime \left( A \right) \equiv \sum\limits_{j = 1 \atop j \ne i }^n {\left| {a_{ij} } \right|}$. Since $R_i ^\prime \left( {A_\varepsilon } \right) = \varepsilon R_i ^\prime \left( A \right)$, the single disk $\left\{ {z \in C:\left| {z - a_{ii} } \right| \le R_i ^\prime \left( {A_\varepsilon } \right)} \right\}$ can be rewritten as $\left\{ {z \in C:\left| {z - a_{ii} } \right| \le \varepsilon R_i ^\prime \left( A \right)} \right\}$. So we get the expression for $G_k \left( \varepsilon \right)$ that you wrote down, and we see that $G_k \left( \varepsilon \right) \subseteq G_k \left( 1 \right)$ for all $\epsilon \in [0,1]$.

For a fixed $i$, since the center of the discs $\left\{ {z \in C:\left| {z - a_{ii} } \right| \le \varepsilon R_i ^\prime \left( A \right)} \right\}$ (namely $\lambda _i \left( {A_0 } \right) = a_{ii}$) is the same for all $\epsilon\in [0,1]$ and each disc expands as $\epsilon$ increases we have that $\left\{ {\lambda _i \left( {A_0 } \right)|1 \le i \le k} \right\} \in G_k \left( \varepsilon \right) \subseteq G_k \left( 1 \right)$. By hypothesis, the connected set $G_k \left( 1 \right)$ is disjoint from the union of the other $n-k$ discs, and Horn and Johnson refer to this union of the other discs as $G_n^c \left( 1 \right) \equiv G_n \left( 1 \right)\backslash G_k \left( 1 \right)$. Since $G_n^c \left( 1 \right)$ is disjoint from $G_k \left( 1 \right)$ and the discs contract as $\epsilon$ decreases, we have $G_n \left( \varepsilon \right)\backslash G_k \left( \varepsilon \right) \subseteq G_n^c \left( 1 \right)$.

Fix an $i$ between $1$ and $k$ inclusive, and look at the continuous trajectory traced by $\lambda _i \left( {A_\varepsilon} \right)$ as it moves from $\lambda _i \left( {A_0 } \right)$ to $\lambda _i \left(A \right)$. From the first part of the theorem, we know that either $\lambda _i \left( {A_\varepsilon } \right) \in G_k \left( \varepsilon \right) \subseteq G_k \left( 1 \right)$ or $\lambda _i \left( {A_\varepsilon } \right) \in G_n \left( \varepsilon \right)\backslash G_k \left( \varepsilon \right) \subseteq G_n^c \left( 1 \right)$. If the latter was true, the path traced by $\lambda _i \left( {A_\varepsilon} \right)$ as $\epsilon$ increased would have had to have crossed the nonzero metric separation between $G_k \left( 1 \right)$ and $G_n^c \left( 1 \right)$ and this violates the continuity of $\lambda _i \left( {A_\varepsilon} \right)$. So, we've confirmed Horn and Johnson’s statement that $\lambda _i \left( {A_\varepsilon } \right) \in G_k \left( \varepsilon \right)$ for all $\epsilon\in [0,1]$. Moreover, this shows that the $k$ eigenvalues $\left\{ {\lambda _i \left( A \right)|1 \le i \le k} \right\}$ lie within $G_k \left( 1 \right)$ and, as Horn and Johnson note, the remaining eigenvalues must lie in $G_n^c \left( 1 \right)$ by the same sort of argument.

Oh my gosh. Thank you. I actually stopped reading the book because of this two years ago. I might now give it a try again. Thank you! — Adam, Mar 27 '16 at 20:50

mathreadler · Answer 2 · 2016-03-27T19:38:03.663

Here is a sketch of a proof in a very non-technical language. I think this is the essence of the proof in the book, but I don't have it around right now. It is probably related to Mark's answer above which is much more detailed.

Diagonal matrices have Gersgorin disks of radius 0.
We can continuously vary the elements of any matrix to shrink it to become a diagonal matrix preserving identical diagonal but all other elements zero.
When doing this reverted (starting with the radius 0 diagonal matrix) the eigenvalues will vary continuously, and therefore can't escape their disc as it's radius starts increasing unless circles start overlapping.

Proof of Gersgorin Discs Theorem

2 Answers2