Classical and Modified Gram Schmidt are both unstable. If you read the text by Trefethen he described the difference between Householder and the first two as the following.
This is Classical and Modified Gram-Schmidt, described Triangular Orthogonalization
$A \underbrace{R_{1} , R_{2} \cdots R_{n}}_{\hat{R}^{-1}} = \hat{Q} \tag{1}$
Below we see Householder, Orthogonal Triangularization
$ \underbrace{Q_{1} , Q_{2} \cdots Q_{n}}_{\hat{Q}^{*}}A = R \tag{2}$
Why are these different?
The condition number of a triangular matrix can be anything so if you have a series of them then it can grow very large however orthogonal matrices have condition number $1$.
By changing the $\epsilon$ you change the condition number. If you actually realize $\epsilon$ is related to the singular values. The first one is nearly $1$.
import numpy as np
import math
eps = math.exp(1e-3)-1
A = np.matrix([[1 ,1,1],[eps, 0 ,0 ], [0 ,eps, 0], [0 , 0 ,eps ]])
u, s, vt = np.linalg.svd(A)
s
Out[12]: array([1.73205110e+00, 1.00050017e-03, 1.00050017e-03])
eps
Out[13]: 0.0010005001667083846
Due to orthogonalization it appears to be $\sqrt{3}$
Note that
$$ \kappa(A) = \frac{\sigma_{max}(A)}{\sigma_{min}(A)} = \frac{\sqrt{3}}{\epsilon} \tag{3}$$
Then you'd note that as $\epsilon \to 0$ $\kappa \to \infty$
Classical Gram Schmidt
The process of Gram Schmidt is the following for classical
$$ v_{j} = a_{j} - (q_{1}^{*}a_{j})q_{1} -(q_{2}^{*}a_{j})q_{2} - \cdots - (q_{j-1}^{*}a_{j})q_{j-1} \tag{3} $$
we can write this like this
$$ q_{1} = \frac{a_{1}}{r_{11}} \tag{4} $$
$$ q_{2} = \frac{a_{2} - r_{12}q_{1}}{r_{22}} \tag{5} $$
$$ q_{3} = \frac{ a_{3} - r_{13} q_{1}- r_{23}q_{2} }{r_{33}} \tag{6} $$
$$ q_{n} = \frac{a_{n} - \sum_{i=1}^{n-1} r_{in} q_{i} }{r_{nn} } \tag{7} $$
Now here is Modified Gram Schmidt. To begin we introduce orthogonal projections
Modified Gram Schmidt
$$ q_{1} = \frac{P_{1}a_{1}}{\| P_{1}a_{1}\|}, q_{2} = \frac{P_{2}a_{2}}{\| P_{2}a_{2}\|}, \cdots , q_{n} = \frac{P_{n}a_{n}}{\| P_{n}a_{n}\|} \tag{8}$$
More specifically $P_{j}$ is the an orthogonal projector. $P_{j}$ is the $m \times m$ matrix of rank $m -(j-1)$ that projects $\mathbb{C}^{m}$ onto the space to $\langle q_{1}, \cdots , q_{j-1} \rangle $
The projector $P_{j}$ can be represented explicitly. Here we represent $\hat{Q}_{j-1}$ as the $m \times (j-1)$ matrix containing the columns of the orhtogonal projection. I.e
$$ P_{j} = I - \hat{Q}_{j-1}\hat{Q}_{j-1}^{*} \tag{9}$$
then we get
$$ v_{j} = P_{j}a_{j} \tag{10} $$
So how is this more stable?
One more note
Your matrix is famous. It is called the Lauchli matrix
In Both CGS and MGS
where $ 1 + \epsilon^{2} =1$
$$v_{1} \to (1 , \epsilon, 0, 0) \tag{11} $$
$$ r_{11} = \sqrt{1 + \epsilon^{2} } \approx 1 \tag{12} $$
$$ q_{1} = \frac{v_{1}}{r_{11}} = (1 , \epsilon, 0, 0)\tag{13} $$
$$ v_{2} = (1,0,\epsilon,0) \tag{14} $$
$$ r_{12} = q_{1}^{T}a_{2} = q_{1}^{T}v_{2} = 1 \tag{15} $$
$$ v_{2} = v_{2} - r_{12}q_{1} = (0,-\epsilon, \epsilon,0) \tag{16} $$
$$ r_{22} = \sqrt{2}\epsilon \tag{17} $$
$$ q_{2} = (0,\frac{-1}{\sqrt{2}},\frac{1}{\sqrt{2}},0) \tag{18} $$
$$ v_{3} = (1,0,0,\epsilon) \tag{19} $$
$$ r_{13} = q_{1}^{t}v_{3} = 1 \tag{20} $$
$$ v_{3} = v_{3} - r_{13}q_{1} = (0,-\epsilon,0,\epsilon) \tag{21} $$
For CGS
$$ r_{23} = q_{2}^{T}a_{3} =0 \tag{22} $$
$$ v_{3} = v_{3} - r_{23}q_{2} = (0,-\epsilon,0,\epsilon) \tag{23} $$
$$ r_{33} = \sqrt{2} \epsilon \tag{24} $$
$$ q_{3} = \frac{v_{3}}{r_{33}} = (0,\frac{-1}{\sqrt{2}} ,0\frac{1}{\sqrt{2}} ) \tag{25} $$
For MGS
$$ r_{23} = q_{2}^{T}v_{3} =\frac{\epsilon}{\sqrt{2}} \tag{26} $$
$$ v_{3} = v_{3} - r_{23}q_{2} = (0,\frac{-\epsilon}{2},\frac{-\epsilon}{2}, \epsilon ) \tag{27} $$
$$ r_{33} = \frac{\sqrt{6}}{\epsilon 2} \tag{28} $$
$$ q_{3} = \frac{v_{3}}{r_{33}} = (0,\frac{-1}{\sqrt{6}} ,\frac{1}{\sqrt{6}},\frac{2}{\sqrt{6}} ) \tag{29} $$