Classical Gram-Schmidt for matrix $A$

Question

Let $$A=\begin{bmatrix}1 & 1 & 1\\ \epsilon & 0 & 0 \\ 0\ & \epsilon & 0 \\ 0 & 0 & \epsilon \end{bmatrix}.$$ On this page, this matrix $A$ is used to show the instability of the classical Gram-Schmidt algorithm, using the criterion that $1+\epsilon =1$. Furthermore, it can be shown that the output vectors from classical GS for $A$ are not orthogonal to each other.

It seems that many websites briefly seem to only talk about the algorithm's shortcomings when running it on a computer. Is there any more "general" reasoning as to why the classical GS algorithm doesn't always produce orthonormal vectors, even "on paper"?

Is it because classical GS (in this case) does not account well for the approximation $\epsilon+1=1$? Would someone be able to explain this a little more in depth?

Thanks

Do the same calculation without the approximation $\epsilon << 1$. You should see that you do get orthogonal vectors. The problem is that you use that approximation repeatedly. For example $o_3$ uses the approximate results $b_1$ and $b_2$ which causes the failure from approximating to build up too much. — AHusain, Sep 29 '18 at 23:15

score 5 · Accepted Answer · edited Jun 12 '20 at 10:38

Classical and Modified Gram Schmidt are both unstable. If you read the text by Trefethen he described the difference between Householder and the first two as the following.

This is Classical and Modified Gram-Schmidt, described Triangular Orthogonalization $A \underbrace{R_{1} , R_{2} \cdots R_{n}}_{\hat{R}^{-1}} = \hat{Q} \tag{1}$

Below we see Householder, Orthogonal Triangularization

$ \underbrace{Q_{1} , Q_{2} \cdots Q_{n}}_{\hat{Q}^{*}}A = R \tag{2}$

Why are these different?

The condition number of a triangular matrix can be anything so if you have a series of them then it can grow very large however orthogonal matrices have condition number $1$.

By changing the $\epsilon$ you change the condition number. If you actually realize $\epsilon$ is related to the singular values. The first one is nearly $1$.

import numpy as np
import math
eps = math.exp(1e-3)-1
A = np.matrix([[1 ,1,1],[eps, 0 ,0 ], [0 ,eps, 0], [0 , 0 ,eps ]])
u, s, vt = np.linalg.svd(A)

s

Out[12]: array([1.73205110e+00, 1.00050017e-03, 1.00050017e-03])
eps
Out[13]: 0.0010005001667083846

Due to orthogonalization it appears to be $\sqrt{3}$

Note that

$$ \kappa(A) = \frac{\sigma_{max}(A)}{\sigma_{min}(A)} = \frac{\sqrt{3}}{\epsilon} \tag{3}$$

Then you'd note that as $\epsilon \to 0$ $\kappa \to \infty$

Classical Gram Schmidt

The process of Gram Schmidt is the following for classical

$$ v_{j} = a_{j} - (q_{1}^{*}a_{j})q_{1} -(q_{2}^{*}a_{j})q_{2} - \cdots - (q_{j-1}^{*}a_{j})q_{j-1} \tag{3} $$

we can write this like this

$$ q_{1} = \frac{a_{1}}{r_{11}} \tag{4} $$

$$ q_{2} = \frac{a_{2} - r_{12}q_{1}}{r_{22}} \tag{5} $$

$$ q_{3} = \frac{ a_{3} - r_{13} q_{1}- r_{23}q_{2} }{r_{33}} \tag{6} $$ $$ q_{n} = \frac{a_{n} - \sum_{i=1}^{n-1} r_{in} q_{i} }{r_{nn} } \tag{7} $$

Now here is Modified Gram Schmidt. To begin we introduce orthogonal projections

Modified Gram Schmidt

$$ q_{1} = \frac{P_{1}a_{1}}{\| P_{1}a_{1}\|}, q_{2} = \frac{P_{2}a_{2}}{\| P_{2}a_{2}\|}, \cdots , q_{n} = \frac{P_{n}a_{n}}{\| P_{n}a_{n}\|} \tag{8}$$

More specifically $P_{j}$ is the an orthogonal projector. $P_{j}$ is the $m \times m$ matrix of rank $m -(j-1)$ that projects $\mathbb{C}^{m}$ onto the space to $\langle q_{1}, \cdots , q_{j-1} \rangle $

The projector $P_{j}$ can be represented explicitly. Here we represent $\hat{Q}_{j-1}$ as the $m \times (j-1)$ matrix containing the columns of the orhtogonal projection. I.e

$$ P_{j} = I - \hat{Q}_{j-1}\hat{Q}_{j-1}^{*} \tag{9}$$

then we get

$$ v_{j} = P_{j}a_{j} \tag{10} $$

So how is this more stable?

One more note

Your matrix is famous. It is called the Lauchli matrix

In Both CGS and MGS

where $ 1 + \epsilon^{2} =1$

$$v_{1} \to (1 , \epsilon, 0, 0) \tag{11} $$

$$ r_{11} = \sqrt{1 + \epsilon^{2} } \approx 1 \tag{12} $$

$$ q_{1} = \frac{v_{1}}{r_{11}} = (1 , \epsilon, 0, 0)\tag{13} $$ $$ v_{2} = (1,0,\epsilon,0) \tag{14} $$ $$ r_{12} = q_{1}^{T}a_{2} = q_{1}^{T}v_{2} = 1 \tag{15} $$ $$ v_{2} = v_{2} - r_{12}q_{1} = (0,-\epsilon, \epsilon,0) \tag{16} $$ $$ r_{22} = \sqrt{2}\epsilon \tag{17} $$ $$ q_{2} = (0,\frac{-1}{\sqrt{2}},\frac{1}{\sqrt{2}},0) \tag{18} $$

$$ v_{3} = (1,0,0,\epsilon) \tag{19} $$ $$ r_{13} = q_{1}^{t}v_{3} = 1 \tag{20} $$ $$ v_{3} = v_{3} - r_{13}q_{1} = (0,-\epsilon,0,\epsilon) \tag{21} $$

For CGS

$$ r_{23} = q_{2}^{T}a_{3} =0 \tag{22} $$ $$ v_{3} = v_{3} - r_{23}q_{2} = (0,-\epsilon,0,\epsilon) \tag{23} $$

$$ r_{33} = \sqrt{2} \epsilon \tag{24} $$ $$ q_{3} = \frac{v_{3}}{r_{33}} = (0,\frac{-1}{\sqrt{2}} ,0\frac{1}{\sqrt{2}} ) \tag{25} $$

For MGS

$$ r_{23} = q_{2}^{T}v_{3} =\frac{\epsilon}{\sqrt{2}} \tag{26} $$ $$ v_{3} = v_{3} - r_{23}q_{2} = (0,\frac{-\epsilon}{2},\frac{-\epsilon}{2}, \epsilon ) \tag{27} $$

$$ r_{33} = \frac{\sqrt{6}}{\epsilon 2} \tag{28} $$ $$ q_{3} = \frac{v_{3}}{r_{33}} = (0,\frac{-1}{\sqrt{6}} ,\frac{1}{\sqrt{6}},\frac{2}{\sqrt{6}} ) \tag{29} $$

So how is this more stable? Is this a (rhetorical) question? Yes, MGS is more stable than CGS. We also know why. — Algebraic Pavel, Oct 01 '18 at 08:35
Not only that I wonder how this post answers the question but I have no idea how your reaction relates to my comment. — Algebraic Pavel, Oct 04 '18 at 22:09
I have no idea why you are so offensive. You probably did not read/understand what my first comment related to. Second, I don’t understand what did you try to achieve with the last edit. — Algebraic Pavel, Oct 04 '18 at 22:51
because you're taking issue with my answer by pointing to a paywalled book when it is your area of expertise after it already was upvoted and you didn't even write anything. All I did it put there as a place holder before I went to bed because I wanted to put some code from my graduate class but I no longer have a matlab license. Surely you know why, but the question asker doesn't. If you want to explain it. go ahead be my guest. But point out the flaw in the answer.
However don't point to a $40 book. — , Oct 04 '18 at 22:56
If you want me to point out a flaw in your answer, I did it in my first comment. — Algebraic Pavel, Oct 04 '18 at 23:01
you pointed to a journal that shows an error analysis of CGS where I have to buy $40 article. I have a link in the top that has the lauchli matrix the questioner actually is referencing ... — , Oct 04 '18 at 23:07
Ok so you still did not get it. So my question is: why do you think MGS is not more stable than CGS? — Algebraic Pavel, Oct 04 '18 at 23:13
I didn't say. I said they are both unstable. If you draw a graph of the normalized error when you performed the QR decomp for back sub it isn't great. only householder is good. — , Oct 04 '18 at 23:19
Sorry I don’t l know what the “normalized error” means without saying what it is supposed to measure and what back sub is. Also, MGS is certainly more stable than MGS. — Algebraic Pavel, Oct 04 '18 at 23:24
$ QRx=b $ when you solve . You have $Rx=Q^{-}b$ you perform back sub for x with R. The relative error is $ \frac{| \hat{x} - x |}{| x|}$ — , Oct 04 '18 at 23:45