
Is there a standard, frequently used, or convenient equivalent of the sigmoid function with two thresholds?

Background
When writing the likelihood of data for a binary classification problem, one often parametrizes the probability of belonging to a class by a sigmoid function: $$ P(y=1|x,a,b)=\frac{1}{1+e^{-ax-b}},\quad P(y=0|x,a,b)=\frac{1}{1+e^{ax+b}}, $$ which roughly corresponds to the classification rule $$ y=\begin{cases} 1\text{ if } ax+b>0\Leftrightarrow x > -\frac{b}{a},\\ 0\text{ if } ax+b<0\Leftrightarrow x < -\frac{b}{a}, \end{cases} $$ that is, we compare the predictor $x$ to a single threshold $\mu=-\frac{b}{a}$.
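As a minimal numeric sketch of the single-threshold rule above (the values of $a$ and $b$ are illustrative assumptions, not from the question): the class probability is the logistic function of $ax+b$, and $P(y=1|x)>0.5$ exactly when $x$ exceeds the threshold $\mu=-b/a$.

```python
import math

def sigmoid(z):
    """Standard logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative parameters (assumed for the example)
a, b = 2.0, -1.0
mu = -b / a              # single decision threshold, here 0.5

x = 1.5                  # a point above the threshold
p1 = sigmoid(a * x + b)  # P(y=1 | x)
p0 = 1.0 - p1            # P(y=0 | x) = 1 / (1 + exp(a x + b))
# x > mu is equivalent to p1 > 0.5
```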

Objective
Now, I would like to have a smooth representation, corresponding to a classification with two thresholds: $$ y=\begin{cases} 1\text{ if } \mu < x<\nu,\\ 0\text{ if } x<\mu \text{ or } x>\nu \end{cases} $$

Options
Some of the possibilities that come to mind are:
a product of sigmoids $$ P(y=1|x)=\frac{1}{1+e^{a(x-\nu)}}\cdot\frac{1}{1+e^{-b(x-\mu)}}, $$ a difference of sigmoids
$$ P(y=1|x)=\frac{1}{1+e^{-b(x-\mu)}}-\frac{1}{1+e^{-a(x-\nu)}}, $$ or a sigmoid with a quadratic argument
$$ P(y=1|x)=\frac{1}{1+e^{a(x-\nu)(x-\mu)}}. $$

I wonder whether any of these, or some other representation, is commonly used for such classification tasks, and what the potential advantages and disadvantages are.

Roger V.