
$\newcommand{\Cond}{\operatorname{Cond}}$Computing $\ln(x)-\ln(y)$ for $x \approx y$ suffers from severe cancellation error.

One method for addressing this issue is to rewrite $\ln(x)-\ln(y)$ as $\ln(\frac{x}{y})$, but the condition number of $f(u) = \ln(u)$ is infinite at $u = 1$, so this is not a well-conditioned approach.

$$\Delta x = h$$ $$\Cond(f,x) = \frac{\frac{\Delta f}{f}}{\frac{\Delta x}{x}} = \frac{\frac{f(x+h)-f(x)}{h} \cdot \frac{h}{f(x)}}{\frac{h}{x}} \xrightarrow{\;h \to 0\;} \frac{x f'(x)}{f(x)}$$ $$ \Cond(\ln(x), x) = \frac{1}{\ln(x)}$$ $$\Cond(\ln(x), 1) = \infty$$
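To make the last line concrete (the particular value $1+10^{-8}$ is only an illustration): since $\ln(x) \approx x-1$ near $x = 1$, $$\Cond(\ln(x), x) = \frac{1}{\ln(x)} \approx \frac{1}{x-1}, \qquad \Cond\big(\ln(x),\, 1+10^{-8}\big) \approx 10^{8},$$ so a relative rounding error of order $\varepsilon$ in the argument $\frac{x}{y} \approx 1$ gets amplified by roughly $10^{8}$.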

How can we reduce the error of computing $\ln(x) - \ln(y)$? Does rewriting it as $\ln(1+\frac{x-y}{y})$ help?
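Here is a minimal Python sketch of the problem (the values of $x$ and $y$ are arbitrary nearby numbers, and the `decimal` module is used only to produce a high-precision reference):

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50                 # 50-digit reference computation
x, y = 1000001.0, 1000000.0            # x ~ y, both exactly representable

naive = math.log(x) - math.log(y)                  # cancellation happens here
reference = Decimal(x).ln() - Decimal(y).ln()      # high-precision reference

rel_err = abs((Decimal(naive) - reference) / reference)
print(rel_err)   # typically ~1e-10..1e-9, far above machine epsilon ~2.2e-16
```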

2 Answers


Suppose that $x$ and $y$ are exact, and that $x\approx y$. Let $z$ denote $(x-y)/y$ and let $z^{*} = \text{fl}((x-y)/y)$ be the computed value of $z$. Since $x-y$ is computed without any error (by the Sterbenz lemma, as noted in a comment), $z^{*} = z\times(1+\alpha_1\delta_1)$, where $|\delta_1|\leq \varepsilon$, $\varepsilon$ is the machine precision, and $\alpha_1$ is the accuracy of floating-point division, usually $\alpha_1 = 1/2$.
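A quick check of the exact-subtraction claim (a sketch; the values are just an example satisfying the Sterbenz condition $y/2 \le x \le 2y$, and exact rational arithmetic via `fractions` is used only for verification):

```python
from fractions import Fraction

x, y = 1000001.0, 1000000.0
exact = Fraction(x) - Fraction(y)        # exact rational result
print(Fraction(x - y) == exact)          # True: the floating-point subtraction is exact
```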

Let $\text{log1p}(x) = \text{ln}(1+x)$ and let $\overset{\sim}{\text{log1p}}(x)$ be the computed value of $\text{log1p}(x)$.

We have $\overset{\sim}{\text{log1p}}(z^{*}) = \text{log1p}(z^*)\times(1+\alpha_2\delta_2)$, where $|\delta_2|\leq \varepsilon$, and $\alpha_2$ is the accuracy of $\text{log1p}$ function.

Thus: $$\overset{\sim}{\text{log1p}}(z^{*}) = \text{log1p}((1+\alpha_1\delta_1)z)\times(1+\alpha_2\delta_2) = \text{log1p}(z)\times(1+\text{cond}(\text{log1p}, z)\,\alpha_1\delta_1)\times(1+\alpha_2\delta_2) + O(\varepsilon^2) = \text{log1p}(z)\times(1+\text{cond}(\text{log1p}, z)\,\alpha_1\delta_1+\alpha_2\delta_2) + O(\varepsilon^2),$$ where the condition number of the $\text{log1p}$ function is $$\text{cond}(\text{log1p}, z) = \frac{z}{\ln(z + 1)\,(z + 1)},$$ and for $z\approx 0$, $\text{cond}(\text{log1p}, z)= 1 - \frac{1}{2}z + O(z^2) \approx 1$. This shows that the relative forward error for $x\approx y$, $$\left|\frac{\overset{\sim}{\text{log1p}}(z^{*}) - \text{log1p}(z)}{\text{log1p}(z)}\right| \lesssim (\alpha_1 + \alpha_2)\varepsilon,$$ is small and very close to the accuracy of the $\text{log1p}$ function.
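A numerical check of this bound (a sketch with arbitrary nearby inputs; `decimal` only provides the high-precision reference):

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50
x, y = 1000001.0, 1000000.0

z_star = (x - y) / y                     # exact subtraction, one rounding in the division
computed = math.log1p(z_star)
reference = Decimal(x).ln() - Decimal(y).ln()

rel_err = abs((Decimal(computed) - reference) / reference)
print(rel_err)   # on the order of machine epsilon (~2.2e-16), as the analysis predicts
```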

Notice that you cannot use $\text{ln}(1+z)$ instead of $\text{log1p}(z)$, due to the huge condition number of $\text{ln}(1+z)$ for $z\approx 0$.
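For illustration (the value of $z$ below is arbitrary, just small): forming $1+z$ in floating point already discards most of the information in $z$, and $\ln$ near $1$ turns that loss into a large relative error, while $\text{log1p}$ avoids it.

```python
import math

z = 1e-13
print(math.log(1.0 + z))   # ~9.992e-14: only ~3 correct digits, ruined by rounding 1 + z
print(math.log1p(z))       # ~9.99999999999995e-14: accurate to roughly machine precision
```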

If $x$ or $y$ is not exact, then we must also take into account the condition number of the function $\text{ln}(x)- \text{ln}(y)$. As you observed, this condition number is very high for $x\approx y$. In this case it is impossible to obtain an accurate algorithm for computing this function, and the only remedy is to use higher-precision arithmetic.

Pawel Kowal
+1 for considering exact and inexact input. Should the second to last paragraph not read "... due to the huge condition number of $\ln(x)$ near $x=1$"? Kind regards – Carl Christian Oct 24 '17 at 08:15

You are correct that the condition number of function evaluation at any zero of the function is infinite. This is equivalent to the fact that "no number is close to zero". However, this isn't really a problem in practice. For this reason, Gautschi has defined a "mollified condition number"

$$ \mathrm{cond}_{\mathrm{mol}}(f, x) := \frac{m(x)|f'(x)|}{m(f(x))} $$ where the mollifier $m$ is given by

$$ m(x) := \begin{cases} 1, & |x| < 1 \\ |x|, & \mathrm{otherwise} \end{cases} $$ It is easy to see that $\mathrm{cond}_{\mathrm{mol}}(\ln, 1) = 1$, so just use $\ln(x/y)$, or perhaps $\mathrm{log1p}(x/y-1)$.
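A small sketch of this definition in code (the helper names `mollify` and `cond_mol` are mine, and the evaluation points are purely illustrative):

```python
import math

def mollify(t):
    """Gautschi's mollifier: m(t) = 1 for |t| < 1, |t| otherwise."""
    return 1.0 if abs(t) < 1.0 else abs(t)

def cond_mol(f, fprime, x):
    """Mollified condition number m(x) * |f'(x)| / m(f(x))."""
    return mollify(x) * abs(fprime(x)) / mollify(f(x))

# For f = ln: the mollified condition number stays ~1 near x = 1,
# whereas the ordinary relative condition number 1/|ln(x)| blows up.
print(cond_mol(math.log, lambda t: 1.0 / t, 1.0))      # 1.0
print(cond_mol(math.log, lambda t: 1.0 / t, 1.0001))   # ~1.0 (vs 1/ln(1.0001) ~ 1e4)
```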

Here is Gautschi's full reasoning for introducing this definition ($y=f(x)$):

If $x = 0$ and $y \neq 0$, one should consider an absolute perturbation of $x$ and a relative perturbation of $y$, and the other way around if $x \neq 0$ and $y = f(x) = 0$. The respective condition numbers then are $(\mathrm{cond} \, f)(x) = |f'(x)/f(x)|$ and $(\mathrm{cond} \, f)(x) = |xf'(x)|$. If $x = y = 0$, the appropriate condition number is the one in Definition 2.3. Since these modifications create discontinuities in the behavior of the condition number, it is preferable, and in fact quite natural, to use relative error only if the quantity in question is greater than 1 in modulus, and absolute error otherwise.

user14717