{{Short description|Mathematical function having a characteristic S-shaped curve or sigmoid curve}}
{{Use dmy dates|date=July 2022|cs1-dates=y}}
{{Use list-defined references|date=July 2022}}
[[File:Logistic-curve.svg|thumb|The [[logistic curve]]]]
[[File:Error Function.svg|thumb|Plot of the [[error function]]]]

A '''sigmoid function''' is any [[mathematical function]] whose [[graph of a function|graph]] has a characteristic S-shaped or '''sigmoid curve'''.

A common example of a sigmoid function is the [[logistic function]], which is defined by the formula<ref name="Han-Morag_1995" />
:<math>\sigma(x) = \frac{1}{1 + e^{-x}} = \frac{e^x}{1 + e^x} = 1 - \sigma(-x).</math>

Other sigmoid functions are given in the [[#Examples|Examples section]]. In some fields, most notably in the context of [[artificial neural network]]s, the term "sigmoid function" is used as a synonym for "logistic function".

Special cases of the sigmoid function include the [[Gompertz curve]] (used in modeling systems that saturate at large values of ''x'') and the [[ogee curve]] (used in the [[spillway]] of some [[dam]]s). Sigmoid functions are defined for all [[real number]]s, and their return (response) value is commonly [[monotonically increasing]], though it can also be decreasing. The return value (''y'' axis) most often lies in the range from 0 to 1; another commonly used range is from −1 to 1.

A wide variety of sigmoid functions, including the logistic and [[hyperbolic tangent]] functions, have been used as the [[activation function]] of [[artificial neuron]]s. Sigmoid curves are also common in statistics as [[cumulative distribution function]]s (which go from 0 to 1), such as the integrals of the [[logistic density]], the [[normal density]], and [[Student's t-distribution|Student's ''t'' probability density functions]]. The logistic sigmoid function is invertible, and its inverse is the [[logit]] function.

== Definition ==
A sigmoid function is a [[bounded function|bounded]], [[differentiable function|differentiable]], real function that is defined for all real input values and has a positive derivative at each point.<ref name="Han-Morag_1995" /><ref name="yibei" />

== Properties ==
In general, a sigmoid function is [[monotonic function|monotonic]] and has a first [[derivative]] which is [[bell shaped function|bell shaped]]. Conversely, the [[integral]] of any continuous, non-negative, bell-shaped function (with one local maximum and no local minimum, unless [[Degenerate distribution|degenerate]]) is sigmoidal. Thus the [[cumulative distribution function]]s of many common [[probability distribution]]s are sigmoidal. One such example is the [[error function]], which is related to the cumulative distribution function of a [[normal distribution]]; another is the [[arctan]] function, which is related to the cumulative distribution function of a [[Cauchy distribution]].

A sigmoid function is constrained by a pair of [[horizontal asymptote]]s as <math>x \rightarrow \pm \infty</math>. It is [[convex function|convex]] for values less than a particular point and [[concave function|concave]] for values greater than that point; in many of the examples here, that point is 0.
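For example, the first derivative of the logistic function can be written in terms of the function itself, which makes the bell shape explicit:
:<math>\sigma'(x) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^2} = \sigma(x)\,\bigl(1 - \sigma(x)\bigr),</math>
which is positive everywhere, attains its maximum value of 1/4 at <math>x = 0</math>, and decays to 0 as <math>x \to \pm\infty</math>.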
== Examples ==
[[File:Gjl-t(x).svg|thumb|320px|right|Some sigmoid functions compared. In the drawing all functions are normalized in such a way that their slope at the origin is 1.]]
* [[Logistic function]] <math display="block"> f(x) = \frac{1}{1 + e^{-x}} </math>
* [[Hyperbolic tangent]] (shifted and scaled version of the logistic function, above) <math display="block"> f(x) = \tanh x = \frac{e^x-e^{-x}}{e^x+e^{-x}} </math>
* [[Arctangent function]] <math display="block"> f(x) = \arctan x </math>
* [[Gudermannian function]] <math display="block"> f(x) = \operatorname{gd}(x) = \int_0^x \frac{dt}{\cosh t} = 2\arctan\left(\tanh\left(\frac{x}{2}\right)\right) </math>
* [[Error function]] <math display="block"> f(x) = \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt </math>
* [[Generalised logistic function]] <math display="block"> f(x) = \left(1 + e^{-x} \right)^{-\alpha}, \quad \alpha > 0 </math>
* [[Smoothstep]] function <math display="block"> f(x) = \begin{cases} {\displaystyle \left( \int_0^1 \left(1 - u^2\right)^N du \right)^{-1} \int_0^x \left( 1 - u^2 \right)^N \, du}, & |x| \le 1 \\ \\ \sgn(x) & |x| \ge 1 \\ \end{cases} \quad N \in \mathbb{Z} \ge 1 </math>
* Some [[algebraic function]]s, for example <math display="block"> f(x) = \frac{x}{\sqrt{1+x^2}} </math> and in a more general form<ref name="Dunning-Kensler-Coudeville-Bailleux_2015" /> <math display="block"> f(x) = \frac{x}{\left(1 + |x|^{k}\right)^{1/k}} </math>
* Up to shifts and scaling, many sigmoids are special cases of <math display="block"> f(x) = \varphi(\varphi(x, \beta), \alpha) , </math> where <math display="block"> \varphi(x, \lambda) = \begin{cases} (1 - \lambda x)^{1/\lambda} & \lambda \ne 0 \\ e^{-x} & \lambda = 0 \\ \end{cases} </math> is the inverse of the negative [[Box–Cox transformation]], and <math>\alpha < 1</math> and <math>\beta < 1</math> are shape parameters.<ref name="grex" />
* [[Non-analytic_smooth_function#Smooth_transition_functions|Smooth transition function]]<ref>{{Cite web |url=https://www.youtube.com/watch?v=vD5g8aVscUI |title=Smooth Transition Function in One Dimension {{!}} Smooth Transition Function Series Part 1 |via=www.youtube.com |date=16 August 2022 |author=EpsilonDelta |at=13:29/14:04}}</ref> normalized to (−1,1): <math display="block">\begin{align}f(x) &= \begin{cases} {\displaystyle \frac{2}{1+e^{-2m\frac{x}{1-x^2}}} - 1}, & |x| < 1 \\ \\ \sgn(x) & |x| \ge 1 \\ \end{cases} \\ &= \begin{cases} {\displaystyle \tanh\left(m\frac{x}{1-x^2}\right)}, & |x| < 1 \\ \\ \sgn(x) & |x| \ge 1 \\ \end{cases}\end{align}</math> using the hyperbolic tangent mentioned above. Here, <math>m</math> is a free parameter setting the slope at <math>x=0</math>; it must be at least <math>\sqrt{3}</math>, because any smaller value produces a function with multiple inflection points, which is therefore not a true sigmoid. This function is unusual because it actually attains the limiting values of −1 and 1 within a finite range, meaning that its value is constant at −1 for all <math>x \leq -1</math> and at 1 for all <math>x \geq 1</math>. Nonetheless, it is [[Smoothness|smooth]] (infinitely differentiable, <math>C^\infty</math>) ''everywhere'', including at <math>x = \pm 1</math>.
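The normalization used in the figure can be checked numerically. The following is a minimal Python sketch (illustrative, using only the standard library; the function names are not from any library) that rescales several of the examples above so that each has slope 1 at the origin and verifies the slopes by a central difference; the smooth transition function is included with its minimum admissible slope <math>m = \sqrt{3}</math>.

<syntaxhighlight lang="python">
import math

def smooth_transition(x, m=math.sqrt(3)):
    """Smooth transition function normalized to (-1, 1); slope at 0 is m."""
    if abs(x) >= 1.0:
        return math.copysign(1.0, x)  # attains the limits -1 and 1 exactly
    return math.tanh(m * x / (1.0 - x * x))

# Sigmoids rescaled so that the slope at the origin is 1
# (as in the comparison figure; the ranges vary between functions).
curves = {
    "tanh":              math.tanh,                                            # range (-1, 1)
    "arctan, rescaled":  lambda x: (2 / math.pi) * math.atan(math.pi / 2 * x), # range (-1, 1)
    "erf, rescaled":     lambda x: math.erf(math.sqrt(math.pi) / 2 * x),       # range (-1, 1)
    "x/sqrt(1+x^2)":     lambda x: x / math.sqrt(1.0 + x * x),                 # range (-1, 1)
    "Gudermannian":      lambda x: 2 * math.atan(math.tanh(x / 2)),            # range (-pi/2, pi/2)
    "smooth transition": smooth_transition,                                    # slope sqrt(3), see text
}

h = 1e-6
for name, f in curves.items():
    slope = (f(h) - f(-h)) / (2 * h)  # central difference at the origin
    print(f"{name:18s} slope at 0 ≈ {slope:.4f}")

# The smooth transition function reaches its limits at finite x:
print(smooth_transition(1.0), smooth_transition(2.5))  # 1.0 1.0
</syntaxhighlight>

Because the smooth transition function's slope at the origin cannot be below <math>\sqrt{3}</math>, it is the only curve in this comparison whose slope is not normalized to 1.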
== Applications ==
[[File:Gohana inverted S-curve.png|thumb|right|320px|Inverted logistic S-curve to model the relation between wheat yield and soil salinity]]
Many natural processes, such as those of complex system [[learning curve]]s, exhibit a progression from small beginnings that accelerates and approaches a climax over time.<ref>{{cite web |author1=Laurens Speelman, Yuki Numata |title=Harnessing the Power of S-Curves |url=https://rmi.org/insight/harnessing-the-power-of-s-curves/ |website=RMI |publisher=[[RMI (energy organization)|Rocky Mountain Institute]] |date=2022}}</ref> When a specific mathematical model is lacking, a sigmoid function is often used.<ref name="Gibbs_2000" />

The [[van Genuchten–Gupta model]] is based on an inverted S-curve and is applied to the response of crop yield to [[soil salinity]]. Examples of the application of the logistic S-curve to the response of crop yield (wheat) to both soil salinity and depth to [[water table]] in the soil are shown in [[logistic function#In agriculture: modeling crop response|modeling crop response in agriculture]].

In [[artificial neural network]]s, non-smooth functions are sometimes used instead for efficiency; these are known as [[hard sigmoid]]s.
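The exact slope and clipping points of a hard sigmoid vary between implementations; the following minimal Python sketch uses one common convention, clipping the line <math>0.2x + 0.5</math> to the interval [0, 1].

<syntaxhighlight lang="python">
def hard_sigmoid(x: float) -> float:
    """Piecewise-linear approximation of the logistic function.

    Uses the convention clip(0.2*x + 0.5, 0, 1); other slopes and
    clipping points are also in use.
    """
    return max(0.0, min(1.0, 0.2 * x + 0.5))

print(hard_sigmoid(-3.0), hard_sigmoid(0.0), hard_sigmoid(3.0))  # 0.0 0.5 1.0
</syntaxhighlight>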
In [[audio signal processing]], sigmoid functions are used as [[waveshaper]] [[transfer function]]s to emulate the sound of [[analog circuitry]] [[clipping (audio)|clipping]].<ref name="Smith_2010" />

In [[biochemistry]] and [[pharmacology]], the [[Hill equation (biochemistry)|Hill]] and [[Hill–Langmuir equation]]s are sigmoid functions.
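For instance, written in terms of the log-concentration <math>x = \ln[\mathrm{L}]</math> (with <math>K_A</math> the ligand concentration producing half occupation), the Hill–Langmuir equation takes the logistic form
:<math>\theta = \frac{[\mathrm{L}]^n}{K_A^n + [\mathrm{L}]^n} = \frac{1}{1 + e^{-n\left(x - \ln K_A\right)}},</math>
so its graph against log-concentration is an S-curve.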
In computer graphics and real-time rendering, sigmoid functions are used to blend colors or geometry between two values smoothly, without visible seams or discontinuities.

[[Titration curve]]s between strong acids and strong bases have a sigmoid shape due to the logarithmic nature of the [[pH scale]].

The logistic function can be calculated efficiently by utilizing [[Unum type 3|type III Unums]].<ref name="Gustafson-Yonemoto_2017" />

A hierarchy of sigmoid growth models with increasing complexity (number of parameters) was built<ref name="app-kleshtanova2023">{{cite journal | author = Kleshtanova, Viktoria and Ivanov, Vassil V and Hodzhaoglu, Feyzim and Prieto, Jose Emilio and Tonchev, Vesselin | title = Heterogeneous Substrates Modify Non-Classical Nucleation Pathways: Reanalysis of Kinetic Data from the Electrodeposition of Mercury on Platinum Using Hierarchy of Sigmoid Growth Models | journal = Crystals | volume = 13 | number = 12 | pages = 1690 | year = 2023 | publisher = MDPI | doi = 10.3390/cryst13121690 | doi-access = free | bibcode = 2023Cryst..13.1690K }}</ref> with the primary goal of re-analyzing kinetic data (the so-called N–t curves) from heterogeneous [[nucleation]] experiments<ref name="app-Markov_1976">{{cite journal | author = Markov, I. and Stoycheva, E. | title = Saturation Nucleus Density in the Electrodeposition of Metals onto Inert Electrodes II. Experimental | journal = Thin Solid Films | volume = 35 | number = 1 | pages = 21–35 | year = 1976 | publisher = Elsevier | doi = 10.1016/0040-6090(76)90109-7 }}</ref> in [[electrochemistry]]. The hierarchy at present includes three models, with 1, 2, and 3 parameters respectively (not counting the maximal number of nuclei N<sub>max</sub>): a tanh<sup>2</sup>-based model called α<sub>21</sub>,<ref name="app-Ivanov_2023">{{cite journal | author = Ivanov, V.V. and Tielemann, C. and Avramova, K. and Reinsch, S. and Tonchev, V. | title = Modelling Crystallization: When the Normal Growth Velocity Depends on the Supersaturation | journal = Journal of Physics and Chemistry of Solids | volume = 181 | pages = 111542 | year = 2023 | publisher = Elsevier | doi = 10.1016/j.jpcs.2022.111542 | doi-broken-date = 28 January 2025 }}</ref> originally devised to describe diffusion-limited crystal growth (not aggregation) in 2D; the Johnson–Mehl–Avrami–Kolmogorov (JMAKn) model;<ref name="app-Fanfoni_1998">{{cite journal | author = Fanfoni, M. and Tomellini, M. | title = The Johnson-Mehl-Avrami-Kohnogorov Model: A Brief Review | journal = Il Nuovo Cimento D | volume = 20 | pages = 1171–1182 | year = 1998 | publisher = Springer | doi = 10.1007/s002690050098 }}</ref> and the Richards model.<ref name="app-Tjorve_2010">{{cite journal | author = Tjørve, E. and Tjørve, K.M.C. | title = A Unified Approach to the Richards-Model Family for Use in Growth Analyses: Why We Need Only Two Model Forms | journal = Journal of Theoretical Biology | volume = 267 | number = 3 | pages = 417–425 | year = 2010 | publisher = Elsevier | doi = 10.1016/j.jtbi.2010.02.027 | pmid = 20176032 }}</ref> It was shown that even the simplest model suffices for this purpose, implying that the experiments revisited are an example of two-step nucleation, with the first step being the growth of the metastable phase in which the nuclei of the stable phase form.<ref name="app-kleshtanova2023"/>

== See also ==
{{Commons category|Sigmoid functions}}
{{div col|colwidth=30em}}
* {{annotated link|Step function}}
* {{annotated link|Sign function}}
* {{annotated link|Heaviside step function}}
* {{annotated link|Logistic regression}}
* {{annotated link|Logit}}
* {{annotated link|Softplus function}}
* {{annotated link|Soboleva modified hyperbolic tangent}}
* {{annotated link|Softmax function}}
* {{annotated link|Swish function}}
* {{annotated link|Weibull distribution}}
* {{annotated link|Fermi–Dirac statistics}}
{{div col end}}

== References ==
{{reflist|refs=
<ref name="yibei">{{cite journal |title=Entropic analysis of biological growth models |author-last1=Ling |author-first1=Yibei |author-first2=Bin |author-last2=He |date=December 1993 |journal=[[IEEE Transactions on Biomedical Engineering]] |volume=40 |issue=12 |pages=1193–1200 |doi=10.1109/10.250574 |pmid=8125495 |url=https://ieeexplore.ieee.org/document/250574}}</ref>
<ref name="Han-Morag_1995">{{Cite book |title=From Natural to Artificial Neural Computation |author-last1=Han |author-first1=Jun |author-last2=Moraga |author-first2=Claudio |volume=930 |chapter=The influence of the sigmoid function parameters on the speed of backpropagation learning |editor-last1=Mira |editor-first1=José |editor-last2=Sandoval |editor-first2=Francisco |pages=[https://archive.org/details/fromnaturaltoart1995inte/page/195 195–201] |date=1995 |doi=10.1007/3-540-59497-3_175 |series=Lecture Notes in Computer Science |isbn=978-3-540-59497-0 |chapter-url=https://archive.org/details/fromnaturaltoart1995inte/page/195}}</ref>
<ref name="Dunning-Kensler-Coudeville-Bailleux_2015">{{cite journal |title=Some extensions in continuous methods for immunological correlates of protection |author-last1=Dunning |author-first1=Andrew J. |author-first2=Jennifer |author-last2=Kensler |author-first3=Laurent |author-last3=Coudeville |author-first4=Fabrice |author-last4=Bailleux |journal=[[BMC Medical Research Methodology]] |date=2015-12-28 |volume=15 |issue=107 |page=107 |doi=10.1186/s12874-015-0096-9 |pmid=26707389 |pmc=4692073 |doi-access=free}}</ref>
<ref name="Gibbs_2000">{{cite journal |title=Variational Gaussian process classifiers |author-last1=Gibbs |author-first1=Mark N. |author-first2=D. |author-last2=MacKay |date=November 2000 |journal=[[IEEE Transactions on Neural Networks]] |volume=11 |issue=6 |pages=1458–1464 |doi=10.1109/72.883477 |pmid=18249869 |s2cid=14456885}}</ref>
<ref name="Smith_2010">{{cite book |title=Physical Audio Signal Processing |author-last=Smith |author-first=Julius O. |date=2010 |publisher=W3K Publishing |isbn=978-0-9745607-2-4 |edition=2010 |url=https://ccrma.stanford.edu/~jos/pasp/Soft_Clipping.html |access-date=2020-03-28 |url-status=live |archive-url=https://web.archive.org/web/20220714165138/https://ccrma.stanford.edu/~jos/pasp/Soft_Clipping.html |archive-date=2022-07-14}}</ref>
<ref name="Gustafson-Yonemoto_2017">{{cite web |title=Beating Floating Point at its Own Game: Posit Arithmetic |author-first1=John L. |author-last1=Gustafson |author-link1=John L. Gustafson |author-first2=Isaac |author-last2=Yonemoto |date=2017-06-12 |url=http://www.johngustafson.net/pdfs/BeatingFloatingPoint.pdf |access-date=2019-12-28 |url-status=live |archive-url=https://web.archive.org/web/20220714164957/http://www.johngustafson.net/pdfs/BeatingFloatingPoint.pdf |archive-date=2022-07-14}}</ref>
<ref name="grex">{{cite web |title=grex --- Growth-curve Explorer |website=[[GitHub]] |date=9 July 2022 |url=https://github.com/ogarciav/grex |access-date=2022-08-25 |url-status=live |archive-url=https://web.archive.org/web/20220825202325/https://github.com/ogarciav/grex |archive-date=2022-08-25}}</ref>
}}

== Further reading ==
* {{cite book |title=Machine Learning |author-first=Tom M. |author-last=Mitchell |publisher=WCB [[McGraw–Hill]] |date=1997 |isbn=978-0-07-042807-2}} (NB. In particular see "Chapter 4: Artificial Neural Networks" (especially pp. 96–97), where Mitchell uses the words "logistic function" and "sigmoid function" synonymously – he also calls this function the "squashing function" – and the sigmoid (aka logistic) function is used to compress the outputs of the "neurons" in multi-layer neural nets.)
* {{cite web |title=Continuous output, the sigmoid function |author-first=Mark |author-last=Humphrys |url=http://www.computing.dcu.ie/~humphrys/Notes/Neural/sigmoid.html |access-date=2022-07-14 |url-status=live |archive-url=https://web.archive.org/web/20220714165249/https://humphryscomputing.com/Notes/Neural/sigmoid.html |archive-date=2022-07-14}} (NB. Properties of the sigmoid, including how it can shift along axes and how its domain may be transformed.)

== External links ==
* {{cite web |url=https://www.waterlog.info/sigmoid.htm |title=Fitting of logistic S-curves (sigmoids) to data using SegRegA |url-status=live |archive-url=https://web.archive.org/web/20220714181630/https://www.waterlog.info/sigmoid.htm |archive-date=14 July 2022}}

{{Artificial intelligence navbox}}

[[Category:Elementary special functions]]
[[Category:Artificial neural networks]]