2

Given two numbers $a$ and $b$ with $a$ very close to $b$, computing $a-b$ loses many significant figures, and the relative error becomes very large.

But why care about the relative error if the absolute error $e(a-b)$ is still no more than $e(a)+e(b)$? Why do we say "the error is large" even though it hasn't changed?

Or we say "We lose many significant figures", but who cares if we're within an acceptable range from that number?

This confuses me, and every article I see repeats the same thing, "We lose significant figures", but why is that important?

To quote Wikipedia: "The effect is that the number of significant digits in the result is reduced unacceptably". Why is it unacceptable?

pluton
  • 1,243

2 Answers

5

Note: The majority of this text is extracted from my answer to a related question.


We consider the general problem of subtracting two real numbers, i.e., $d = a - b$. This problem is ill-conditioned when $a \approx b$. In particular, if $\hat{a}$ and $\hat{b}$ are approximations of $a$ and $b$ respectively, i.e., $$ \hat{a} = a(1+\Theta_a), \quad \hat{b} = b(1 + \Theta_b)$$ and $\hat{d} = \hat{a} - \hat{b}$, then $$ \frac{d - \hat{d}}{d} = \frac{b\Theta_b - a \Theta_a}{a-b}.$$ It follows that the relative error is bounded by $$ \left| \frac{d - \hat{d}}{d} \right| \leq \frac{|a|+|b|}{|a-b|}\max\{|\Theta_a|, |\Theta_b| \}.$$ If $a \approx b$, then the right-hand side can be large and there is no guarantee that $d$ is computed with a small relative error. Frequently, but not universally, $d$ will be computed with a large relative error. This is the phenomenon known as subtractive cancellation or catastrophic cancellation.
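As a minimal sketch of the bound above, the following Python snippet uses exact rational arithmetic so that the only errors present are the input perturbations $\Theta_a$ and $\Theta_b$. The specific values of `a`, `b`, `theta_a` and `theta_b` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def relative_error(exact, approx):
    """Relative error |exact - approx| / |exact|, computed exactly."""
    return abs((exact - approx) / exact)

# Hypothetical inputs: two nearby real numbers, held exactly as rationals.
a = Fraction(1) + Fraction(1, 10**10)   # a = 1 + 1e-10
b = Fraction(1)                         # b = 1

# Hypothetical input perturbations of relative size 1e-16, opposite signs.
theta_a = Fraction(1, 10**16)
theta_b = -Fraction(1, 10**16)
a_hat = a * (1 + theta_a)
b_hat = b * (1 + theta_b)

d = a - b              # exact difference
d_hat = a_hat - b_hat  # difference of the perturbed inputs (no rounding here)

# Amplification factor (|a| + |b|) / |a - b| from the bound.
amplification = (abs(a) + abs(b)) / abs(a - b)
bound = amplification * max(abs(theta_a), abs(theta_b))
rel_err = relative_error(d, d_hat)

print("amplification factor:", float(amplification))
print("relative error of d :", float(rel_err))
print("bound               :", float(bound))
```

With these inputs the relative error of the difference is roughly $2 \cdot 10^{-6}$, about ten orders of magnitude larger than the input perturbations, even though the subtraction itself is carried out exactly.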

If the machine uses floating point arithmetic, then the very best we can hope for is to obtain the floating point representation of $a$ and $b$, i.e., $$\hat{a} = \text{fl}(a), \quad \hat{b} = \text{fl}(b).$$ In this case $$\max\{|\Theta_a|, |\Theta_b| \} \leq u,$$ where $u$ is the unit roundoff. In IEEE single precision arithmetic $u = 2^{-24}$. In IEEE double precision arithmetic $u=2^{-53}$.
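The double precision value of $u$ can be checked with a short sketch, assuming that Python floats are IEEE double precision numbers (true on essentially all current platforms, but an assumption nonetheless). The sample fractions are arbitrary.

```python
import sys
from fractions import Fraction

# Unit roundoff for IEEE double precision.
u = Fraction(1, 2**53)

# sys.float_info.epsilon is the gap between 1.0 and the next float, i.e. 2u.
assert Fraction(sys.float_info.epsilon) == 2 * u

def rounding_error(numerator, denominator):
    """Relative error of fl(x) for the rational x = numerator / denominator."""
    x = Fraction(numerator, denominator)
    # True division of two ints yields the nearest double to the quotient;
    # converting that float back to a Fraction is exact.
    fl_x = Fraction(numerator / denominator)
    return abs((fl_x - x) / x)

# Arbitrary sample values that are not exactly representable in binary.
samples = [(1, 3), (1, 10), (2, 7), (22, 7), (1, 1000)]
errors = [rounding_error(p, q) for p, q in samples]

# Every observed relative rounding error should be at most u.
print(all(e <= u for e in errors))
```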

On the other hand, if $|a| \ge 2|b|$ or if $|b| \ge 2|a|$, then $$ \frac{|a|+|b|}{|a-b|} \leq 3.$$ This is an application of the triangle inequality: if $|a| \ge 2|b|$, then $|a-b| \ge |a| - |b| \ge |a|/2$ and $|a|+|b| \le 3|a|/2$; the other case is symmetric. The bound is attained for $a = 2$, $b = 1$. It follows that any subtraction $d = a - b$ causes at most a modest increase in the relative error if one operand is at least twice as big as the other.

EDIT: In response to a comment: If $a$ and $b$ are IEEE floating point numbers with $b/2 \leq a \leq 2b$ and if underflow is gradual, then the subtraction $d=a-b$ is exact, i.e., there is no rounding error. This is the Sterbenz lemma.
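The Sterbenz lemma can be spot-checked with a small sketch, again assuming Python floats are IEEE doubles. The helper `subtraction_is_exact`, the sampling ranges and the seed are illustrative choices, not part of the lemma.

```python
import random
from fractions import Fraction

def subtraction_is_exact(a, b):
    """True if the floating point result a - b equals the exact difference."""
    return Fraction(a - b) == Fraction(a) - Fraction(b)

random.seed(0)  # arbitrary seed, only for reproducibility

sterbenz_ok = True
for _ in range(10_000):
    b = random.uniform(1.0, 1000.0)
    a = b * random.uniform(0.5, 2.0)
    # Keep only pairs that satisfy the hypothesis b/2 <= a <= 2b.
    if b / 2 <= a <= 2 * b:
        sterbenz_ok = sterbenz_ok and subtraction_is_exact(a, b)

print(sterbenz_ok)

# Outside the hypothesis the subtraction may have to round:
print(subtraction_is_exact(1.0, 1e-20))
```

An exact subtraction does not contradict the cancellation analysis: the large relative error in $\hat{d}$ comes from the errors $\Theta_a$ and $\Theta_b$ already present in the operands, not from the subtraction itself.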

  • The analysis should probably be slightly modified to account for the floating point version of the subtraction $\hat a−\hat b$, that is $\hat d=\mathrm{fl}(\hat a−\hat b)=(\hat a−\hat b)(1+\Theta_d)$ where $|\Theta_d|≤u$, which, I believe, does not affect the conclusion. – pluton Jan 24 '25 at 16:46
  • 1
    @pluton I have updated the answer with a statement of Sterbenz lemma. – Carl Christian Jan 25 '25 at 22:38
0

I finally got it myself: because we lose significant figures when computing $a-b$, the succeeding calculations are limited to the smallest number of significant figures among all the numbers involved, which degrades the precision of every other number and leads to error.

Correct me if I am wrong.

  • 1
    Yes, you make three mistakes: 1) You do not distinguish between real numbers and floating point numbers. This is crucial when seeking to understand catastrophic cancellation. 2) You do not entertain the possibility that the remaining calculation can be exceedingly well conditioned or can correct the error. The standard example of such a procedure is a fixed point iteration. 3) You do not distinguish between the accuracy of a general algorithm and precision, which is the accuracy of basic arithmetic operations. – Carl Christian Apr 08 '19 at 19:42
  • @CarlChristian Thank you for that, can you perhaps clarify what you mean in an answer? I see that what I say is not general, so in general, how do we justify the claim that "the error goes up" when we compute $a - b$? – ArminAshrafi Apr 09 '19 at 17:49
  • 1
    I have added a standard error analysis of subtraction. – Carl Christian Apr 09 '19 at 20:53