
I am trying to get the derivative of $|x|$, and I want that derivative function, say $g(x)$, to be a function of $x$.

So I really need $|x|$ to be smooth (like, e.g., $x^2$); I am wondering what the best way is to approximate $|x|$ with something smooth.

I propose $\sqrt{x^2 + \epsilon}$, where $\epsilon = 10^{-10}$, but is there something better? Perhaps a Taylor expansion?

Sorry for any confusion. I should add some additional information here:

I want to use $|x|$ as part of an objective function $J(x)$ which I want to minimize. So it would be nice to approximate $|x|$ with some smooth function so that I can get the analytic form of the first-order derivative of $J(x)$.
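For concreteness, here is a minimal sketch of that idea in Python (the toy objective $J(x) = |x| + (x-1)^2$, the step size, and the function names are illustrative assumptions, not part of the question):

```python
import numpy as np

EPS = 1e-10  # smoothing parameter proposed above

def abs_smooth(x):
    """Smooth stand-in for |x|: sqrt(x^2 + eps)."""
    return np.sqrt(x**2 + EPS)

def d_abs_smooth(x):
    """Analytic first derivative of the smoothed |x|: x / sqrt(x^2 + eps)."""
    return x / np.sqrt(x**2 + EPS)

# Toy objective J(x) = |x| + (x - 1)^2 and its now-analytic gradient.
def J(x):
    return abs_smooth(x) + (x - 1.0)**2

def dJ(x):
    return d_abs_smooth(x) + 2.0 * (x - 1.0)

# Plain gradient descent, made possible by the smooth derivative.
x = 3.0
for _ in range(5000):
    x -= 0.01 * dJ(x)
print(x)  # approaches 0.5, the true minimizer of |x| + (x - 1)^2
```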

Thanks a lot.

Ono
  • $|x|$ is a particularly easy function. It is smooth with the exception of $x=0$. For $x>0$ its derivative is $+1$, for $x<0$ it is $-1$, and at $x=0$ it is not defined. – Thomas Mar 26 '14 at 19:47
  • 4
    @Thomas But "in $x = 0$ it is not defined" is exactly what Ono wants to fix by approximating $|x|$. That is what this question is about. It's not "how to differentiate $|x|$?", it's "How to 'fix' the non-differentiability of $|x|$ at $x = 0$ by approximation?" – Arthur Mar 26 '14 at 19:53
  • @Arthur There is no fix. A function which is not differentiable cannot be made differentiable by approximation. – Thomas Mar 26 '14 at 19:54
  • 2
    @Thomas Yes, it can. You "smooth out" the corner. The function $\sqrt{x^2 + \epsilon}$ is one way to do just that. Another way would be to swap a small portion of the graph around $0$ with a quarter circle, but that's not nearly as smooth if you do it naively. You can do a patching like that smoothly, though, if you're smart about it, so that everywhere outside a small neighbourhood of $0$ the functions are exactly the same, and the patched function is everywhere infinitely differentiable. – Arthur Mar 26 '14 at 19:58
  • 1
    @Arthur The question is about (literal citation) "a derivative of $|x|$". This does not exist. It is out of question that you can approximate continuous functions, say, uniformly, by smooth functions. If that is what is asked for, the question should specify that. It does not. – Thomas Mar 26 '14 at 20:02
  • 3
    In fairness, both the title and the statement 'I am wondering what is the best way to approximate |x| with something smooth?' suggest a little more than an attempt to compute the derivative. – copper.hat Mar 26 '14 at 20:07
  • $\sqrt{x^2+\epsilon}$ converges uniformly to $|x|$ as $\epsilon \rightarrow 0+$. But I agree, taking the derivative of $|x|$ at $0$ is a futile effort (haha). –  Mar 26 '14 at 20:07
  • You can find a sequence of smooth functions approximating $|x|$ arbitrarily well, uniformly, with any prescribed value of the derivative at $x=0$. Your question does not make clear what criterion you have in mind when you talk about a 'good' approximation. – Thomas Mar 26 '14 at 20:08
  • 2
    @Thomas Have you read the title to the question? Titles usually give some indication as to what is asked about. I cite litterally: "Approximate $|x|$ with a smooth function". That can be done. Let $f(x)$ be a smooth function that is zero everywhere outside some small interval $(\epsilon, \epsilon)$, tops out at $f(x) = 1$ in some even smaller interval around $1$, and monotone in between. Then $(1-f(x))|x| + f(x)h(x)$ is smooth as long as $h(x)$ is smooth inside $(-\epsilon, \epsilon)$, and it's a good approximation if $h(x)$ is say, a quarter circle or parabola "glued" to the bottom of $|x|$. – Arthur Mar 26 '14 at 20:09
  • @Arthur Yes, I read the title. Did you read my last comment? – Thomas Mar 26 '14 at 20:10
  • @Thomas Yes, I have. I see you've changed your mind from thinking this question was about differentiating $|x|$ to actually understanding what is being asked, in the time it took me to write my last comment. Welcome to the club. Now, it is a legitimate inquiry to ask what criteria a "good" approximation should satisfy. I've given an idea for an approximation (or at least a way of making approximations), and I don't have any other profound ideas on how to make one. As I have nothing more to contribute here, I'm out. – Arthur Mar 26 '14 at 20:15
  • Arthur is right. Do you have any ideas? Someone mentioned the dual formulation, as in Nesterov's algorithm, but I hope I can find something easier? – Ono Mar 26 '14 at 20:18
  • Can you say more about what you are going to do with your approximation? – Jason Zimba Mar 26 '14 at 20:32
  • @JasonZimba: I want to use $|x|$ as part of an objective function $J(x)$ which I want to minimize. So it would be nice to approximate $|x|$ with some smooth function so that I can get the analytic form of the first-order derivative of $J(x)$. – Ono Mar 26 '14 at 20:36
  • http://en.wikipedia.org/wiki/Subderivative should help – Lemon Mar 26 '14 at 20:40
  • @Ono That was pretty similar to the way I used $\sqrt{x^2+\epsilon}$ myself. So you might go that way. You might also consider just using an expression like ${\rm sgn}(x)$ or $x\over |x|$ that is exact away from $x=0$, and handle $x=0$ carefully. – Jason Zimba Mar 26 '14 at 20:41
  • @JasonZimba: How did it work? Have you tried other approximations? – Ono Mar 26 '14 at 20:45
  • I want to thank you everyone that is trying to help here. You guys are awesome! – Ono Mar 26 '14 at 20:46
  • Do you know in which interval you need to approximate $|x|$? – Américo Tavares Mar 26 '14 at 20:51
  • @Ono It worked fine, the derivatives weren't too bad. It was actually the first thing that came to mind, and I honestly didn't cast around for very long looking at candidates.... And of course this was before the days of Stack Exchange! – Jason Zimba Mar 26 '14 at 20:52
  • 1

7 Answers


A bit late to the party, but you can smoothly approximate $f(x) = |x|$ by observing that $f'(x) = \operatorname{sgn}(x)$ for $x \neq 0$. Approximating the $\operatorname{sgn}(x)$ function by

$$ f'(x) \approx 2\operatorname{sigmoid}(kx) - 1 $$

(the $k$ being a parameter that allows you to control the smoothness), we get

$$ f'(x) = 2\left(\frac{e^{kx}}{1+e^{kx}}\right) - 1 \quad\Rightarrow\quad f(x) = \frac{2}{k}\log\left(1+e^{kx}\right) - x - \frac{2}{k}\log 2, $$

where the constant of integration was chosen to ensure $f(0) = 0$.

Included are plots of the function for $x \in [-5,5]$, where $k=1$ is red, $k=10$ is blue and $k=100$ is black.

Note that if you want to use the smoother $k=1$ curve on the interval $[-5,5]$, it may be worth applying a further linear transformation, i.e. roughly $5f(x)/3.6$, to ensure the values at the edge of the interval are correct.

[Plot of $f(x)$ on $[-5,5]$ for $k=1$ (red), $k=10$ (blue), and $k=100$ (black).]
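A quick numerical check of this construction (a sketch; using `np.logaddexp` for a numerically stable $\log(1+e^{kx})$ is my implementation choice, not part of the derivation):

```python
import numpy as np

def smooth_abs(x, k):
    """f(x) = (2/k) log(1 + e^{kx}) - x - (2/k) log 2, via logaddexp for stability."""
    return (2.0 / k) * np.logaddexp(0.0, k * x) - x - (2.0 / k) * np.log(2.0)

x = np.linspace(-5.0, 5.0, 1001)
for k in (1, 10, 100):
    err = np.max(np.abs(smooth_abs(x, k) - np.abs(x)))
    print(f"k = {k:3d}: max error on [-5, 5] = {err:.4f}")
# f(0) = 0 by construction; the error at the edges for k = 1 is what
# motivates the ~5 f(x)/3.6 rescaling mentioned above.
```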

rwolst

I have used your $\sqrt{x^2+\epsilon}$ function once before, when an application I was working on called for a curve with a tight radius of curvature near $x=0$. Whether it is best for your purpose might depend on your purpose.

(Side note: The derivative of $|x|$ does not exist at $x=0$. In physics, we sometimes cheat and write ${d\over dx}|x| = \rm{sgn}(x)$, the "sign" function that gives $-1$ when $x<0$, $0$ when $x=0$, and $+1$ when $x > 0$. But only when we know that the value at $x=0$ will not get us into trouble!)
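A small sketch of that convention in code (the tolerance `tol` and the choice of $0$ at the kink are illustrative; any value in $[-1,1]$ is a valid subgradient of $|x|$ there):

```python
import numpy as np

def d_abs(x, tol=1e-12):
    """'Derivative' of |x|: sgn(x) away from zero, with the x = 0 case made explicit."""
    x = np.asarray(x, dtype=float)
    # -1 for x < 0, +1 for x > 0, and an explicit choice of 0 near the kink.
    return np.where(np.abs(x) < tol, 0.0, np.sign(x))

print(d_abs([-2.0, 0.0, 1e-15, 3.0]))  # [-1.  0.  0.  1.]
```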

Jason Zimba

Another solution is the function $x \mapsto x\,\frac{e^{kx}}{e^{kx} + e^{-kx}} - x\,\frac{e^{-kx}}{e^{kx} + e^{-kx}} = x\tanh(kx)$; the higher the $k$, the closer you get to the absolute value function. Furthermore, it is equal to $0$ at $0$, which is sometimes a desirable feature (compared to the square-root solution stated above). However, you lose convexity.
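Written as $x\tanh(kx)$, this is easy to evaluate and differentiate analytically (a brief sketch; the function names and the sample $k$ are mine):

```python
import numpy as np

def smooth_abs(x, k=10.0):
    """x * tanh(k x): zero at the origin, approaches |x| as k grows."""
    return x * np.tanh(k * x)

def d_smooth_abs(x, k=10.0):
    """Analytic derivative: tanh(k x) + k x / cosh(k x)^2."""
    return np.tanh(k * x) + k * x / np.cosh(k * x) ** 2

x = np.linspace(-2.0, 2.0, 401)
print(np.max(np.abs(smooth_abs(x, k=50.0) - np.abs(x))))  # shrinks as k grows
```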


From this question I asked and the answer given by @LPZ, you get the following approximations for the absolute value:

  1. $$A(x) := \begin{cases}\dfrac{2}{\varepsilon}, & x=0\\ x\coth\left(\dfrac{\varepsilon x}{2}\right), & x\neq 0\end{cases}$$
  2. $$B(x) := \dfrac{\ln\left(2\cosh\left(\varepsilon x\right)\right)}{\varepsilon}$$

Both converge to $|x|$ as $\varepsilon \to \infty$.

You can see how they approximate $|x|$ as $\varepsilon$ increases in Desmos.

As an example, $B'(x)=\tanh(\varepsilon x)$ gives you a smooth approximation of $\operatorname{abs}'(x)$, if you allow the latter to be defined as $\operatorname{sgn}(x)$.

Comment added later: notice that neither function is equivalent to the hyperbola solution $\sqrt{\varepsilon+x^2}$ given by @JasonZimba, for any $\varepsilon$.
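Evaluated naively, $\cosh(\varepsilon x)$ overflows for large $\varepsilon x$; here is a sketch of a stable evaluation of $B$ and $B'$ (using $\ln(2\cosh t) = \ln(e^{t} + e^{-t})$ via `np.logaddexp` is my implementation detail, and the sample $\varepsilon$ is illustrative):

```python
import numpy as np

def B(x, eps=50.0):
    """B(x) = log(2 cosh(eps x)) / eps, computed as logaddexp(t, -t) / eps."""
    t = eps * np.asarray(x, dtype=float)
    return np.logaddexp(t, -t) / eps

def dB(x, eps=50.0):
    """B'(x) = tanh(eps x), a smooth stand-in for sgn(x)."""
    return np.tanh(eps * np.asarray(x, dtype=float))

x = np.linspace(-3.0, 3.0, 601)
print(np.max(B(x) - np.abs(x)))  # bounded by log(2)/eps, attained at x = 0
```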

Joako

As you and others have mentioned, functions of the form $\sqrt{x^2 + 4\mu^2}$ can be a good approximation to $\left|x\right|$ and are standard in many optimization applications where the smoothing function meets certain bounding requirements. See, for example, the paper Optimality Conditions and a Smoothing Trust Region Newton Method for Non-Lipschitz Optimization where that function is used to develop a smoothing optimization algorithm.

Another smooth approximation that has been proposed in Fast Optimization Methods for L1 Regularization: A Comparative Study and Two New Approaches is

$$ |x| \approx \frac{1}{\alpha} \left[ \log\left(1 + \exp(-\alpha x)\right) + \log\left(1 + \exp(\alpha x)\right) \right], $$ where $\alpha$ is taken to be a large positive number.
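A sketch of this approximation in code (the stable softplus via `np.logaddexp` and the sample $\alpha$ are my choices, not from the paper):

```python
import numpy as np

def smooth_abs(x, alpha=100.0):
    """(1/alpha) [log(1 + e^{-alpha x}) + log(1 + e^{alpha x})] -- two softpluses."""
    t = alpha * np.asarray(x, dtype=float)
    return (np.logaddexp(0.0, -t) + np.logaddexp(0.0, t)) / alpha

x = np.linspace(-1.0, 1.0, 5)
print(smooth_abs(x))   # close to |x|; the value at 0 is 2 log(2) / alpha
print(np.abs(x))
```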

Ryan Burn

Notice that $|x| = \max\{x, -x\}$. There are a few notions of Smooth Maximum, and the Boltzmann operator version of the smooth maximum is illustrated on that page as a means of approximating $|x|$. One of the simplest, not mentioned on the preceding Wikipedia page, is the $\log$-semiring, where in essence one can choose a large base $b$ and use:

$$\log_{b}(b^{x} + b^{-x})$$

since

$$\lim_{b \to \infty} \log_{b}(b^{x} + b^{-x}) = \max\{x, -x\} = |x|$$
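In floating point, $b^{x}$ overflows quickly for a large base, so it is safer to work through natural logarithms (a sketch; the choice $\ln b = 50$ is illustrative):

```python
import numpy as np

def smooth_abs(x, log_b=50.0):
    """log_b(b^x + b^{-x}), evaluated as logaddexp(x ln b, -x ln b) / ln b."""
    t = log_b * np.asarray(x, dtype=float)
    return np.logaddexp(t, -t) / log_b

x = np.linspace(-2.0, 2.0, 401)
print(np.max(smooth_abs(x) - np.abs(x)))  # at most log(2)/log(b), at x = 0
```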

Dair

I had a similar problem, and here is the solution I am using: the following Fourier-based approximation for $x$ in the range $(-\pi, \pi)$, which is also infinitely differentiable everywhere: $$0.04 + \left(-0.1 + \pi - \frac{8\cos(x)}{\pi} - \frac{8\cos(3x)}{9\pi} - \frac{8\cos(5x)}{25\pi}\right)\cdot 0.52 - \frac{8\cos(7x)}{45\pi}$$

Here is the plot, together with $y = |x|$ for comparison: [plot image]

It also has a nice property for the few cases where $|a - b|$ needs to be expanded, which can be done via the cosine difference identity $\cos(a-b) = \cos a\cos b + \sin a\sin b$. This becomes beneficial especially when numerically solving coupled ODEs where $\dot Y$ depends on $\sum_{i, j} |y_{j} - y_{i}|$.

Expanding via the cosine identity here allows you to distribute the summations and precompute the sums.
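For instance, for the leading $\cos(y_j - y_i)$ term the whole double sum collapses to two single sums (a minimal sketch; variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.uniform(-np.pi, np.pi, size=1000)

# Naive O(n^2): sum over all pairs of cos(y_j - y_i).
naive = np.sum(np.cos(y[:, None] - y[None, :]))

# cos(a - b) = cos a cos b + sin a sin b lets the double sum factor, O(n):
# sum_ij cos(y_j - y_i) = (sum_i cos y_i)^2 + (sum_i sin y_i)^2
fast = np.cos(y).sum() ** 2 + np.sin(y).sum() ** 2

print(naive, fast)  # agree up to rounding error
```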