I want to numerically compute the smallest eigenvalue of an $n \times n$ positive definite matrix $\bf{A}$.
Why I want to avoid working with $\bf{A^{-1}}$
I know that I can apply power iteration to $\bf{A^{-1}}$, but I wish to avoid working with this inverse matrix.
In my application, $\bf{A}$ is implicitly expressed as $\bf{A} = \bf{K'K}$ with $\bf{K} = \bf{L^{-1}R}$, where
- $\bf{L}$ is an $m \times m$ lower triangular band matrix;
- $\bf{R}$ is an $m \times n$ band matrix of full column rank $(m > n)$.
While $\bf{L}$ and $\bf{R}$ are sparse, $\bf{K}$ and $\bf{A}$ are fully dense. Given that both $m$ and $n$ can be large, I do not want to explicitly form either $\bf{K}$ or $\bf{A}$.
An unfortunate complication is that $m \neq n$: $\bf{R}$ is not square, so there is no convenient factored form $\bf{A^{-1}} = \bf{R^{-1}LL'(R^{-1})'}$.
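For concreteness, here is a minimal sketch of the matrix-vector product $\bf{u} = \bf{Av} = \bf{R'(L')^{-1}L^{-1}Rv}$ using only a sparse product and banded triangular solves. It assumes $\bf{L}$ and $\bf{R}$ are stored as sparse matrices from the Matrix package; the function name Av is mine.

library(Matrix)

## u = Av = R'(L')^{-1}L^{-1}Rv, computed without ever forming K or A
Av <- function (L, R, v) {
  w <- R %*% v         ## w = Rv, a sparse matrix-vector product
  w <- solve(L, w)     ## w = L^{-1}Rv = Kv, a banded forward solve
  w <- solve(t(L), w)  ## w = (L')^{-1}Kv, a banded backward solve
  drop(as.matrix(crossprod(R, w)))  ## u = R'w = K'Kv = Av, as a plain vector
}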
What I have tried
With $\bf{A}$ structured as above, it is computationally efficient to compute $\bf{Av}$ for any vector $\bf{v}$ using sparse linear algebra routines (as sketched above). It is then easy to compute $\bf{A}$'s largest eigenvalue $\mu$ using power iteration:
- $\bf{v_0} = (\frac{1}{n}, \ldots, \frac{1}{n})$;
- $\bf{u_0} = \bf{Av_0}$;
- $\lambda_0 = \bf{v_0'u_0}$;
- for $k = 1, 2, \ldots$ until convergence of $\{\lambda_k\}$:
  - $\bf{v_k} = \bf{u_{k - 1}} / \|\bf{u_{k - 1}}\|$;
  - $\bf{u_k} = \bf{Av_k}$;
  - $\lambda_k = \bf{v_k'u_k}$;
- return $\lambda_k$.
To find the smallest eigenvalue, I apply this algorithm to $\bf{B} = \bf{A} - \mu\bf{I}$, inspired by this thread: https://math.stackexchange.com/a/271876/407465. The eigenvalues of $\bf{B}$ are $\lambda_i - \mu \in [\lambda_{\min} - \mu,\ 0]$, so the eigenvalue of $\bf{B}$ with the largest magnitude is $\lambda_{\min} - \mu$; power iteration on $\bf{B}$ finds it, and adding $\mu$ back recovers $\lambda_{\min}$. Here is an R program implementing this method.
PowerIter <- function (A, mu = 0) {
  n <- nrow(A)
  ## spectral shift
  A <- A - diag(mu, n)
  ## power iteration
  v.old <- rep.int(1 / n, n)
  u.old <- A %*% v.old
  d.old <- sum(v.old * u.old)  ## initial estimate lambda_0 = v_0'u_0
  k <- 0L
  repeat {
    v.new <- u.old * (1 / sqrt(sum(u.old ^ 2)))  ## normalize the iterate
    u.new <- A %*% v.new
    d.new <- sum(v.new * u.new)
    if (abs(d.new - d.old) < abs(d.old) * 1e-8) break  ## test relative error
    d.old <- d.new
    u.old <- u.new
    k <- k + 1L
  }
  cat("convergence after", k, "iterations.\n")
  d.new
}
What goes wrong?
I found that this shifted power iteration does not seem numerically stable. I composed two toy examples: one succeeds, the other is problematic.
A successful example
\begin{equation} \bf{A} = \begin{pmatrix} 2 & -1 & 0\\ -1 & 2 & -1\\ 0 & -1 & 2 \end{pmatrix} \end{equation}
With $|\bf{A} - \lambda\bf{I}| = (2 - \lambda)[(2 - \lambda)^2 - 2]$, the three eigenvalues are $2 + \sqrt{2} \approx 3.4142136$, $2$ and $2 - \sqrt{2} \approx 0.5857864$.
The algorithm is a success for this matrix. Here is the output of an R session.
A <- matrix(c(2, -1, 0, -1, 2, -1, 0, -1, 2), nrow = 3)
(d.max <- PowerIter(A))
convergence after 7 iterations.
[1] 3.414214
(d.min <- PowerIter(A, d.max) + d.max)
convergence after 1 iterations.
[1] 0.5857864
A problematic example
Now let's construct $\bf{A} = \bf{Q'DQ}$, where $\bf{D} = \textrm{diag}(d_1, d_2, d_3)$ contains the eigenvalues and $\bf{Q}$ is a rotation matrix
\begin{equation} \bf{Q} = \begin{pmatrix} 0.36 & 0.48 & -0.8\\ -0.8 & 0.6 & 0\\ 0.48 & 0.64 & 0.6 \end{pmatrix} \end{equation}
I fixed $d_1 = 1$, $d_2 = 0.01$ and tried three choices for $d_3$: $10^{-3}$, $10^{-5}$ and $10^{-7}$. Here is the output of an R session.
test <- function (d) {
  Q <- matrix(c(0.36, -0.8, 0.48, 0.48, 0.6, 0.64, -0.8, 0, 0.6), nrow = 3)
  A <- crossprod(Q, d * Q)  ## A = Q'DQ with D = diag(d)
  d.max <- PowerIter(A)
  d.min <- PowerIter(A, d.max) + d.max
  c(min = d.min, max = d.max)
}
test(c(1, 1e-2, 1e-3))
convergence after 3 iterations.
convergence after 298 iterations.
min max
0.001000543 1.000000000
test(c(1, 1e-2, 1e-5))
convergence after 3 iterations.
convergence after 279 iterations.
min max
1.04883e-05 1.00000e+00
test(c(1, 1e-2, 1e-7))
convergence after 3 iterations.
convergence after 279 iterations.
min max
5.860836e-07 1.000000e+00
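For reference, the eigenvalues of $\bf{A} = \bf{Q'DQ}$ are exactly $d_1, d_2, d_3$ by construction, so the true smallest eigenvalues above should be $10^{-3}$, $10^{-5}$ and $10^{-7}$. At this toy size one can confirm with a dense solver, e.g. base R's eigen (a sanity check only, not a solution, since it requires forming $\bf{A}$):

## dense symmetric eigensolver as a reference at this toy size;
## the returned values should reproduce d up to rounding error
Q <- matrix(c(0.36, -0.8, 0.48, 0.48, 0.6, 0.64, -0.8, 0, 0.6), nrow = 3)
A <- crossprod(Q, c(1, 1e-2, 1e-7) * Q)
eigen(A, symmetric = TRUE)$values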
Computation of the smallest eigenvalue is slow and becomes increasingly inaccurate as $\bf{A}$ gets less well conditioned, even though with condition numbers of only $10^3$, $10^5$ and $10^7$ it is still far from ill-conditioned!
My questions
- Is there a mathematical justification for this observation?
- How can I modify this algorithm for better numerical stability?
- If the algorithm cannot be improved, can someone suggest another efficient algorithm?
- If no better algorithm exists, is there any useful lower bound for the smallest eigenvalue? The Gershgorin circle theorem is no good, as its bound is far too loose (see the snippet below).
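To illustrate how loose Gershgorin is, here is a quick check on the $3 \times 3$ matrix from the successful example: the row-wise lower bound $\min_i \big(a_{ii} - \sum_{j \neq i} |a_{ij}|\big)$ comes out as $0$, which says nothing useful about a positive definite matrix.

## Gershgorin lower bound: min over rows of (diagonal entry minus
## the absolute sum of the off-diagonal entries in that row)
A <- matrix(c(2, -1, 0, -1, 2, -1, 0, -1, 2), nrow = 3)
min(diag(A) - (rowSums(abs(A)) - abs(diag(A))))  ## 0, while the true minimum is 2 - sqrt(2)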