{{Short description|Mathematical function having a characteristic S-shaped curve or sigmoid curve}}
{{Use dmy dates|date=July 2022|cs1-dates=y}}
{{Use list-defined references|date=July 2022}}
[[File:Logistic-curve.svg|thumb|The [[logistic curve]]]]
[[File:Error Function.svg|thumb|Plot of the [[error function]]]]

A '''sigmoid function''' is any [[mathematical function]] whose [[graph of a function|graph]] has a characteristic S-shaped or '''sigmoid curve'''.

A common example of a sigmoid function is the [[logistic function]], which is defined by the formula<ref name="Han-Morag_1995" />
:<math>\sigma(x) = \frac{1}{1 + e^{-x}} = \frac{e^x}{1 + e^x} = 1 - \sigma(-x).</math>

Other sigmoid functions are given in the [[#Examples|Examples section]]. In some fields, most notably in the context of [[artificial neural network]]s, the term "sigmoid function" is used as a synonym for "logistic function".

Special cases of the sigmoid function include the [[Gompertz curve]] (used in modeling systems that saturate at large values of ''x'') and the [[ogee curve]] (used in the [[spillway]] of some [[dam]]s). Sigmoid functions are defined for all [[real number]]s, and their return (response) value is commonly [[monotonically increasing]], though it can also be decreasing. The return value (''y'' axis) most often lies in the range from 0 to 1; another commonly used range is from −1 to 1.

A wide variety of sigmoid functions, including the logistic and [[hyperbolic tangent]] functions, have been used as the [[activation function]] of [[artificial neuron]]s. Sigmoid curves are also common in statistics as [[cumulative distribution function]]s (which go from 0 to 1), such as the integrals of the [[logistic density]], the [[normal density]], and [[Student's t-distribution|Student's ''t'' probability density functions]]. The logistic sigmoid function is invertible, and its inverse is the [[logit]] function.

== Definition ==
A sigmoid function is a [[bounded function|bounded]], [[differentiable function|differentiable]], real function that is defined for all real input values and has a positive derivative at each point.<ref name="Han-Morag_1995" /><ref name="yibei" />

== Properties ==
In general, a sigmoid function is [[monotonic function|monotonic]] and has a first [[derivative]] which is [[bell shaped function|bell shaped]]. Conversely, the [[integral]] of any continuous, non-negative, bell-shaped function (with one local maximum and no local minimum, unless [[Degenerate distribution|degenerate]]) is sigmoidal. Thus the [[cumulative distribution function]]s of many common [[probability distribution]]s are sigmoidal. One such example is the [[error function]], which is related to the cumulative distribution function of a [[normal distribution]]; another is the [[arctan]] function, which is related to the cumulative distribution function of a [[Cauchy distribution]].

A sigmoid function is constrained by a pair of [[horizontal asymptote]]s as <math>x \rightarrow \pm \infty</math>. It is [[convex function|convex]] for values less than a particular point and [[concave function|concave]] for values greater than that point; in many of the examples here, that point is 0.
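For example, the first derivative of the logistic function can be written in terms of the function itself, which makes the bell shape explicit:
:<math>\sigma'(x) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^2} = \sigma(x)\,\bigl(1 - \sigma(x)\bigr),</math>
which is positive everywhere, attains its maximum value of 1/4 at <math>x = 0</math>, and decays to 0 as <math>x \to \pm\infty</math>.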
== Examples ==
[[File:Gjl-t(x).svg|thumb|320px|right|Some sigmoid functions compared. In the drawing all functions are normalized in such a way that their slope at the origin is 1.]]
* [[Logistic function]] <math display="block"> f(x) = \frac{1}{1 + e^{-x}} </math>
* [[Hyperbolic tangent]] (shifted and scaled version of the logistic function, above) <math display="block"> f(x) = \tanh x = \frac{e^x-e^{-x}}{e^x+e^{-x}} </math>
* [[Arctangent function]] <math display="block"> f(x) = \arctan x </math>
* [[Gudermannian function]] <math display="block"> f(x) = \operatorname{gd}(x) = \int_0^x \frac{dt}{\cosh t} = 2\arctan\left(\tanh\left(\frac{x}{2}\right)\right) </math>
* [[Error function]] <math display="block"> f(x) = \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt </math>
* [[Generalised logistic function]] <math display="block"> f(x) = \left(1 + e^{-x} \right)^{-\alpha}, \quad \alpha > 0 </math>
* [[Smoothstep]] function <math display="block"> f(x) = \begin{cases} {\displaystyle \left( \int_0^1 \left(1 - u^2\right)^N du \right)^{-1} \int_0^x \left( 1 - u^2 \right)^N \, du}, & |x| \le 1 \\ \\ \sgn(x) & |x| \ge 1 \\ \end{cases} \quad N \in \mathbb{Z} \ge 1 </math>
* Some [[algebraic function]]s, for example <math display="block"> f(x) = \frac{x}{\sqrt{1+x^2}} </math> and in a more general form<ref name="Dunning-Kensler-Coudeville-Bailleux_2015" /> <math display="block"> f(x) = \frac{x}{\left(1 + |x|^{k}\right)^{1/k}} </math>
* Up to shifts and scaling, many sigmoids are special cases of <math display="block"> f(x) = \varphi(\varphi(x, \beta), \alpha) , </math> where <math display="block"> \varphi(x, \lambda) = \begin{cases} (1 - \lambda x)^{1/\lambda} & \lambda \ne 0 \\ e^{-x} & \lambda = 0 \\ \end{cases} </math> is the inverse of the negative [[Box–Cox transformation]], and <math>\alpha < 1</math> and <math>\beta < 1</math> are shape parameters.<ref name="grex" />
* [[Non-analytic_smooth_function#Smooth_transition_functions|Smooth transition function]]<ref>{{Cite web |url=https://www.youtube.com/watch?v=vD5g8aVscUI |title=Smooth Transition Function in One Dimension {{!}} Smooth Transition Function Series Part 1 |via=www.youtube.com |date=16 August 2022 |author=EpsilonDelta |at=13:29/14:04}}</ref> normalized to (−1,1): <math display="block">\begin{align}f(x) &= \begin{cases} {\displaystyle \frac{2}{1+e^{-2m\frac{x}{1-x^2}}} - 1}, & |x| < 1 \\ \\ \sgn(x) & |x| \ge 1 \\ \end{cases} \\ &= \begin{cases} {\displaystyle \tanh\left(m\frac{x}{1-x^2}\right)}, & |x| < 1 \\ \\ \sgn(x) & |x| \ge 1 \\ \end{cases}\end{align}</math> using the hyperbolic tangent mentioned above. Here, <math>m</math> is a free parameter setting the slope at <math>x=0</math>; it must be at least <math>\sqrt{3}</math>, because any smaller value produces a function with multiple inflection points, which is therefore not a true sigmoid. This function is unusual because it actually attains the limiting values of −1 and 1 within a finite range, meaning that its value is constant at −1 for all <math>x \leq -1</math> and at 1 for all <math>x \geq 1</math>. Nonetheless, it is [[Smoothness|smooth]] (infinitely differentiable, <math>C^\infty</math>) ''everywhere'', including at <math>x = \pm 1</math>.
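The normalization used in the figure can be checked numerically. The following is a minimal Python sketch (illustrative, using only the standard library; the function names are not from any library) that rescales several of the examples above so that each has slope 1 at the origin and verifies the slopes by a central difference; the smooth transition function is included with its minimum admissible slope <math>m = \sqrt{3}</math>.

<syntaxhighlight lang="python">
import math

def smooth_transition(x, m=math.sqrt(3)):
    """Smooth transition function normalized to (-1, 1); slope at 0 is m."""
    if abs(x) >= 1.0:
        return math.copysign(1.0, x)  # attains the limits -1 and 1 exactly
    return math.tanh(m * x / (1.0 - x * x))

# Sigmoids rescaled so that the slope at the origin is 1
# (as in the comparison figure; the ranges vary between functions).
curves = {
    "tanh":              math.tanh,                                            # range (-1, 1)
    "arctan, rescaled":  lambda x: (2 / math.pi) * math.atan(math.pi / 2 * x), # range (-1, 1)
    "erf, rescaled":     lambda x: math.erf(math.sqrt(math.pi) / 2 * x),       # range (-1, 1)
    "x/sqrt(1+x^2)":     lambda x: x / math.sqrt(1.0 + x * x),                 # range (-1, 1)
    "Gudermannian":      lambda x: 2 * math.atan(math.tanh(x / 2)),            # range (-pi/2, pi/2)
    "smooth transition": smooth_transition,                                    # slope sqrt(3), see text
}

h = 1e-6
for name, f in curves.items():
    slope = (f(h) - f(-h)) / (2 * h)  # central difference at the origin
    print(f"{name:18s} slope at 0 ≈ {slope:.4f}")

# The smooth transition function reaches its limits at finite x:
print(smooth_transition(1.0), smooth_transition(2.5))  # 1.0 1.0
</syntaxhighlight>

Because the smooth transition function's slope at the origin cannot be below <math>\sqrt{3}</math>, it is the only curve in this comparison whose slope is not normalized to 1.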
== Applications ==
[[File:Gohana inverted S-curve.png|thumb|right|320px|Inverted logistic S-curve to model the relation between wheat yield and soil salinity]]
Many natural processes, such as those of complex system [[learning curve]]s, exhibit a progression from small beginnings that accelerates and approaches a climax over time.<ref>{{cite web |author1=Laurens Speelman, Yuki Numata |title=Harnessing the Power of S-Curves |url=https://rmi.org/insight/harnessing-the-power-of-s-curves/ |website=RMI |publisher=[[RMI (energy organization)|Rocky Mountain Institute]] |date=2022}}</ref> When a specific mathematical model is lacking, a sigmoid function is often used.<ref name="Gibbs_2000" />

The [[van Genuchten–Gupta model]] is based on an inverted S-curve and is applied to the response of crop yield to [[soil salinity]]. Examples of the application of the logistic S-curve to the response of crop yield (wheat) to both soil salinity and depth to [[water table]] in the soil are shown in [[logistic function#In agriculture: modeling crop response|modeling crop response in agriculture]].

In [[artificial neural network]]s, non-smooth functions are sometimes used instead for efficiency; these are known as [[hard sigmoid]]s.
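The exact slope and clipping points of a hard sigmoid vary between implementations; the following minimal Python sketch uses one common convention, clipping the line <math>0.2x + 0.5</math> to the interval [0, 1].

<syntaxhighlight lang="python">
def hard_sigmoid(x: float) -> float:
    """Piecewise-linear approximation of the logistic function.

    Uses the convention clip(0.2*x + 0.5, 0, 1); other slopes and
    clipping points are also in use.
    """
    return max(0.0, min(1.0, 0.2 * x + 0.5))

print(hard_sigmoid(-3.0), hard_sigmoid(0.0), hard_sigmoid(3.0))  # 0.0 0.5 1.0
</syntaxhighlight>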
In [[audio signal processing]], sigmoid functions are used as [[waveshaper]] [[transfer function]]s to emulate the sound of [[analog circuitry]] [[clipping (audio)|clipping]].<ref name="Smith_2010" />

In [[biochemistry]] and [[pharmacology]], the [[Hill equation (biochemistry)|Hill]] and [[Hill–Langmuir equation]]s are sigmoid functions.
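For instance, written in terms of the log-concentration <math>x = \ln[\mathrm{L}]</math> (with <math>K_A</math> the ligand concentration producing half occupation), the Hill–Langmuir equation takes the logistic form
:<math>\theta = \frac{[\mathrm{L}]^n}{K_A^n + [\mathrm{L}]^n} = \frac{1}{1 + e^{-n\left(x - \ln K_A\right)}},</math>
so its graph against log-concentration is an S-curve.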
In computer graphics and real-time rendering, sigmoid functions are used to blend colors or geometry between two values smoothly, without visible seams or discontinuities.

[[Titration curve]]s between strong acids and strong bases have a sigmoid shape due to the logarithmic nature of the [[pH scale]].

The logistic function can be calculated efficiently by utilizing [[Unum type 3|type III Unums]].<ref name="Gustafson-Yonemoto_2017" />

A hierarchy of sigmoid growth models with increasing complexity (number of parameters) was built<ref name="app-kleshtanova2023">{{cite journal | author = Kleshtanova, Viktoria and Ivanov, Vassil V and Hodzhaoglu, Feyzim and Prieto, Jose Emilio and Tonchev, Vesselin | title = Heterogeneous Substrates Modify Non-Classical Nucleation Pathways: Reanalysis of Kinetic Data from the Electrodeposition of Mercury on Platinum Using Hierarchy of Sigmoid Growth Models | journal = Crystals | volume = 13 | number = 12 | pages = 1690 | year = 2023 | publisher = MDPI | doi = 10.3390/cryst13121690 | doi-access = free | bibcode = 2023Cryst..13.1690K }}</ref> with the primary goal of re-analyzing kinetic data (the so-called N–t curves) from heterogeneous [[nucleation]] experiments<ref name="app-Markov_1976">{{cite journal | author = Markov, I. and Stoycheva, E. | title = Saturation Nucleus Density in the Electrodeposition of Metals onto Inert Electrodes II. Experimental | journal = Thin Solid Films | volume = 35 | number = 1 | pages = 21–35 | year = 1976 | publisher = Elsevier | doi = 10.1016/0040-6090(76)90109-7 }}</ref> in [[electrochemistry]]. The hierarchy at present includes three models, with 1, 2, and 3 parameters respectively (not counting the maximal number of nuclei N<sub>max</sub>): a tanh<sup>2</sup>-based model called α<sub>21</sub>,<ref name="app-Ivanov_2023">{{cite journal | author = Ivanov, V.V. and Tielemann, C. and Avramova, K. and Reinsch, S. and Tonchev, V. | title = Modelling Crystallization: When the Normal Growth Velocity Depends on the Supersaturation | journal = Journal of Physics and Chemistry of Solids | volume = 181 | pages = 111542 | year = 2023 | publisher = Elsevier | doi = 10.1016/j.jpcs.2022.111542 | doi-broken-date = 28 January 2025 }}</ref> originally devised to describe diffusion-limited crystal growth (not aggregation) in 2D; the Johnson–Mehl–Avrami–Kolmogorov (JMAKn) model;<ref name="app-Fanfoni_1998">{{cite journal | author = Fanfoni, M. and Tomellini, M. | title = The Johnson-Mehl-Avrami-Kohnogorov Model: A Brief Review | journal = Il Nuovo Cimento D | volume = 20 | pages = 1171–1182 | year = 1998 | publisher = Springer | doi = 10.1007/s002690050098 }}</ref> and the Richards model.<ref name="app-Tjorve_2010">{{cite journal | author = Tjørve, E. and Tjørve, K.M.C. | title = A Unified Approach to the Richards-Model Family for Use in Growth Analyses: Why We Need Only Two Model Forms | journal = Journal of Theoretical Biology | volume = 267 | number = 3 | pages = 417–425 | year = 2010 | publisher = Elsevier | doi = 10.1016/j.jtbi.2010.02.027 | pmid = 20176032 }}</ref> It was shown that even the simplest model suffices for this purpose, implying that the experiments revisited are an example of two-step nucleation, with the first step being the growth of the metastable phase in which the nuclei of the stable phase form.<ref name="app-kleshtanova2023"/>

== See also ==
{{Commons category|Sigmoid functions}}
{{div col|colwidth=30em}}
* {{annotated link|Step function}}
* {{annotated link|Sign function}}
* {{annotated link|Heaviside step function}}
* {{annotated link|Logistic regression}}
* {{annotated link|Logit}}
* {{annotated link|Softplus function}}
* {{annotated link|Soboleva modified hyperbolic tangent}}
* {{annotated link|Softmax function}}
* {{annotated link|Swish function}}
* {{annotated link|Weibull distribution}}
* {{annotated link|Fermi–Dirac statistics}}
{{div col end}}

== References ==
{{reflist|refs=
<ref name="yibei">{{cite journal |title=Entropic analysis of biological growth models |author-last1=Ling |author-first1=Yibei |author-first2=Bin |author-last2=He |date=December 1993 |journal=[[IEEE Transactions on Biomedical Engineering]] |volume=40 |issue=12 |pages=1193–1200 |doi=10.1109/10.250574 |pmid=8125495 |url=https://ieeexplore.ieee.org/document/250574}}</ref>
<ref name="Han-Morag_1995">{{Cite book |title=From Natural to Artificial Neural Computation |author-last1=Han |author-first1=Jun |author-last2=Moraga |author-first2=Claudio |volume=930 |chapter=The influence of the sigmoid function parameters on the speed of backpropagation learning |editor-last1=Mira |editor-first1=José |editor-last2=Sandoval |editor-first2=Francisco |pages=[https://archive.org/details/fromnaturaltoart1995inte/page/195 195–201] |date=1995 |doi=10.1007/3-540-59497-3_175 |series=Lecture Notes in Computer Science |isbn=978-3-540-59497-0 |chapter-url=https://archive.org/details/fromnaturaltoart1995inte/page/195}}</ref>
<ref name="Dunning-Kensler-Coudeville-Bailleux_2015">{{cite journal |title=Some extensions in continuous methods for immunological correlates of protection |author-last1=Dunning |author-first1=Andrew J. |author-first2=Jennifer |author-last2=Kensler |author-first3=Laurent |author-last3=Coudeville |author-first4=Fabrice |author-last4=Bailleux |journal=[[BMC Medical Research Methodology]] |date=2015-12-28 |volume=15 |issue=107 |page=107 |doi=10.1186/s12874-015-0096-9 |pmid=26707389 |pmc=4692073 |doi-access=free}}</ref>
<ref name="Gibbs_2000">{{cite journal |title=Variational Gaussian process classifiers |author-last1=Gibbs |author-first1=Mark N. |author-first2=D. |author-last2=MacKay |date=November 2000 |journal=[[IEEE Transactions on Neural Networks]] |volume=11 |issue=6 |pages=1458–1464 |doi=10.1109/72.883477 |pmid=18249869 |s2cid=14456885}}</ref>
<ref name="Smith_2010">{{cite book |title=Physical Audio Signal Processing |author-last=Smith |author-first=Julius O. |date=2010 |publisher=W3K Publishing |isbn=978-0-9745607-2-4 |edition=2010 |url=https://ccrma.stanford.edu/~jos/pasp/Soft_Clipping.html |access-date=2020-03-28 |url-status=live |archive-url=https://web.archive.org/web/20220714165138/https://ccrma.stanford.edu/~jos/pasp/Soft_Clipping.html |archive-date=2022-07-14}}</ref>
<ref name="Gustafson-Yonemoto_2017">{{cite web |title=Beating Floating Point at its Own Game: Posit Arithmetic |author-first1=John L. |author-last1=Gustafson |author-link1=John L. Gustafson |author-first2=Isaac |author-last2=Yonemoto |date=2017-06-12 |url=http://www.johngustafson.net/pdfs/BeatingFloatingPoint.pdf |access-date=2019-12-28 |url-status=live |archive-url=https://web.archive.org/web/20220714164957/http://www.johngustafson.net/pdfs/BeatingFloatingPoint.pdf |archive-date=2022-07-14}}</ref>
<ref name="grex">{{cite web |title=grex --- Growth-curve Explorer |website=[[GitHub]] |date=9 July 2022 |url=https://github.com/ogarciav/grex |access-date=2022-08-25 |url-status=live |archive-url=https://web.archive.org/web/20220825202325/https://github.com/ogarciav/grex |archive-date=2022-08-25}}</ref>
}}

== Further reading ==
* {{cite book |title=Machine Learning |author-first=Tom M. |author-last=Mitchell |publisher=WCB [[McGraw–Hill]] |date=1997 |isbn=978-0-07-042807-2}} (NB. In particular see "Chapter 4: Artificial Neural Networks" (especially pp. 96–97), where Mitchell uses the words "logistic function" and "sigmoid function" synonymously – he also calls this function the "squashing function" – and the sigmoid (aka logistic) function is used to compress the outputs of the "neurons" in multi-layer neural nets.)
* {{cite web |title=Continuous output, the sigmoid function |author-first=Mark |author-last=Humphrys |url=http://www.computing.dcu.ie/~humphrys/Notes/Neural/sigmoid.html |access-date=2022-07-14 |url-status=live |archive-url=https://web.archive.org/web/20220714165249/https://humphryscomputing.com/Notes/Neural/sigmoid.html |archive-date=2022-07-14}} (NB. Properties of the sigmoid, including how it can shift along axes and how its domain may be transformed.)

== External links ==
* {{cite web |url=https://www.waterlog.info/sigmoid.htm |title=Fitting of logistic S-curves (sigmoids) to data using SegRegA |url-status=live |archive-url=https://web.archive.org/web/20220714181630/https://www.waterlog.info/sigmoid.htm |archive-date=14 July 2022}}

{{Artificial intelligence navbox}}

[[Category:Elementary special functions]]
[[Category:Artificial neural networks]]