{{Short description|Function in statistics}}
{{About|the binary logit function|other types of logit|discrete choice|the basic regression technique that uses the logit function|logistic regression|standard magnitudes combined by multiplication|logit (unit)}}
{{Distinguish|log probability}}
[[Image:Logit.svg|thumbnail|upright=1.3|Plot of logit(''x'') in the domain of 0 to 1, where the base of the logarithm is ''e''.]]

In [[statistics]], the '''logit''' ({{IPAc-en|ˈ|l|oʊ|dʒ|ɪ|t}} {{respell|LOH|jit}}) function is the [[quantile function]] associated with the standard [[logistic distribution]]. It has many uses in [[data analysis]] and [[machine learning]], especially in [[Data transformation (statistics)|data transformations]].

Mathematically, the logit is the [[inverse function|inverse]] of the [[logistic function|standard logistic function]] <math>\sigma(x) = 1/(1+e^{-x})</math>, so the logit is defined as

: <math>\operatorname{logit} p = \sigma^{-1}(p) = \ln \frac{p}{1-p} \quad \text{for} \quad p \in (0,1).</math>

Because of this, the logit is also called the '''log-odds''', since it is equal to the [[logarithm]] of the [[odds]] <math>\frac{p}{1-p}</math>, where {{mvar|p}} is a probability. Thus, the logit maps probability values from <math>(0, 1)</math> to real numbers in <math>(-\infty, +\infty)</math>,<ref>{{Cite web|url=http://www.columbia.edu/~so33/SusDev/Lecture_9.pdf|title=Logit/Probit}}</ref> akin to the [[probit|probit function]].

== Definition ==

If {{mvar|p}} is a [[probability]], then {{math|{{nowrap|''p''/(1 − ''p'')}}}} is the corresponding [[odds]]; the {{math|logit}} of the probability is the logarithm of the odds, i.e.:

: <math>\operatorname{logit}(p)=\ln\left( \frac{p}{1-p} \right) =\ln(p)-\ln(1-p)=-\ln\left( \frac{1}{p}-1\right)=2\operatorname{atanh}(2p-1).</math>

The base of the [[logarithm]] used is of little importance in the present article, as long as it is greater than 1, but the [[natural logarithm]] with base {{mvar|[[e (mathematical constant)|e]]}} is the one most often used. The choice of base corresponds to the choice of [[logarithmic unit]] for the value: base 2 corresponds to a [[shannon (unit)|shannon]], base {{mvar|e}} to a [[nat (unit)|nat]], and base 10 to a [[hartley (unit)|hartley]]; these units are particularly used in information-theoretic interpretations. For each choice of base, the logit function takes values between negative and positive infinity.
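The definition translates directly into code. The following is a minimal Python sketch (the function name <code>logit</code> and the printed checks are illustrative, not taken from any particular library):

<syntaxhighlight lang="python">
import math

def logit(p):
    """Log-odds of a probability p in (0, 1), using the natural logarithm."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

print(logit(0.5))  # 0.0: even odds map to zero
print(logit(0.9))  # ~2.197: ln(0.9/0.1) = ln 9
</syntaxhighlight>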
The [[logistic function|"logistic" function]] of any number <math>\alpha</math> is given by the inverse-{{math|logit}}:

: <math>\operatorname{logit}^{-1}(\alpha) = \operatorname{logistic}(\alpha) = \frac{1}{1 + \exp(-\alpha)} = \frac{\exp(\alpha)}{ \exp(\alpha) + 1} = \frac{\tanh(\frac{\alpha}{2})+1}{2}</math>

The difference between the {{math|logit}}s of two probabilities is the logarithm of the [[odds ratio]] ({{mvar|R}}), thus providing a shorthand for writing the correct combination of odds ratios [[additive function|only by adding and subtracting]]:

: <math>\ln(R)=\ln\left( \frac{p_1/(1-p_1)}{p_2/(1-p_2)} \right) =\ln\left( \frac{p_1}{1-p_1} \right) - \ln\left(\frac{p_2}{1-p_2}\right) = \operatorname{logit}(p_1)-\operatorname{logit}(p_2)\,.</math>

The [[Taylor series]] for the logit function, expanded about <math>x = \tfrac{1}{2}</math>, is given by:

: <math>\operatorname{logit}(x)=2\sum_{n=0}^\infty \frac{(2x-1)^{2n+1}}{2n+1}.</math>
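Continuing the sketch above, the inverse relation and the odds-ratio identity can be verified numerically in plain Python (for real work, libraries such as SciPy expose these functions as <code>scipy.special.logit</code> and <code>scipy.special.expit</code>):

<syntaxhighlight lang="python">
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(alpha):
    """Inverse of logit: the standard logistic function."""
    return 1.0 / (1.0 + math.exp(-alpha))

# expit undoes logit (up to floating-point rounding).
p = 0.3
assert math.isclose(expit(logit(p)), p)

# The difference of two logits equals the log of the odds ratio.
p1, p2 = 0.8, 0.4
log_odds_ratio = math.log((p1 / (1 - p1)) / (p2 / (1 - p2)))
assert math.isclose(logit(p1) - logit(p2), log_odds_ratio)
</syntaxhighlight>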
== History ==

Several approaches have been explored to adapt linear regression methods to a domain where the output is a probability value <math>(0, 1)</math> instead of any real number <math>(-\infty, +\infty)</math>. In many cases, such efforts have focused on mapping the range <math>(0, 1)</math> to <math>(-\infty, +\infty)</math> and then running the linear regression on these transformed values.<ref name="Cramer2003"/>

In 1934, [[Chester Ittner Bliss]] used the cumulative normal distribution function to perform this mapping and called his model [[probit]], an abbreviation for "'''prob'''ability un'''it'''". This is, however, computationally more expensive.<ref name="Cramer2003">{{Cite web |url=http://www.cambridge.org/resources/0521815886/1208_default.pdf |title=The origins and development of the logit model |first=J. S. |last=Cramer |year=2003 |publisher=Cambridge UP |archive-url=https://web.archive.org/web/20240919043104/https://www.cambridge.org/resources/0521815886/1208_default.pdf |archive-date=19 September 2024 |url-status=dead }}</ref> In 1944, [[Joseph Berkson]] used the log of odds and called this function ''logit'', an abbreviation for "'''log'''istic un'''it'''", following the analogy for probit:

{{quote|"I use this term [logit] for <math>\ln p/q</math> following Bliss, who called the analogous function which is linear on {{tmath|x}} for the normal curve 'probit'."|Joseph Berkson (1944){{sfn|Berkson|1944|loc=p. 361, footnote 2}}}}

Log odds had already been used extensively by [[Charles Sanders Peirce]] in the late 19th century.<ref>{{cite book |title=The history of statistics: the measurement of uncertainty before 1900 |last=Stigler |first=Stephen M. |author-link=Stephen M. Stigler |year=1986 |publisher=Belknap Press of Harvard University Press |location=Cambridge, Massachusetts |isbn=978-0-674-40340-6 |url-access=registration |url=https://archive.org/details/historyofstatist00stig }}</ref> In 1949, [[G. A. Barnard]] coined the commonly used term ''log-odds'';<ref>{{citation|title=Logistic Regression Models|first=Joseph M.|last=Hilbe|authorlink=Joseph Hilbe|publisher=CRC Press|year=2009|isbn=9781420075779|page=3|url=https://books.google.com/books?id=tmHMBQAAQBAJ&pg=PA3}}.</ref>{{sfn|Barnard|1949|p=120}} the log-odds of an event is the logit of the probability of the event.<ref>{{citation|title=Logit Models from Economics and Other Fields|first=J. S.|last=Cramer|publisher=Cambridge University Press|year=2003|isbn=9781139438193|page=13|url=https://books.google.com/books?id=1Od2d72pPXUC&pg=PA13}}.</ref> Barnard also coined the term ''lods'' as an abstract form of "log-odds",{{sfn|Barnard|1949|p=120,128}} but suggested that "in practice the term 'odds' should normally be used, since this is more familiar in everyday life".{{sfn|Barnard|1949|p=136}}

== Uses and properties ==

* The logit in [[logistic regression]] is a special case of a link function in a [[generalized linear model]]: it is the canonical [[link function]] for the [[Bernoulli distribution]].
* More abstractly, the logit is the [[natural parameter]] for the [[binomial distribution]]; see {{slink|Exponential family|Binomial distribution}}.
* The logit function is the negative of the [[derivative]] of the [[binary entropy function]].
* The logit is also central to the probabilistic [[Rasch model]] for [[measurement]], which has applications in psychological and educational assessment, among other areas.
* The inverse-logit function (i.e., the [[logistic function]]) is also sometimes referred to as the ''expit'' function.<ref>{{cite web |url=http://www.stat.ucl.ac.be/ISdidactique/Rhelp/library/msm/html/expit.html |title=R: Inverse logit function |access-date=2011-02-18 |url-status=dead |archive-url=https://web.archive.org/web/20110706132209/http://www.stat.ucl.ac.be/ISdidactique/Rhelp/library/msm/html/expit.html |archive-date=2011-07-06 }}</ref>
* In plant disease epidemiology, the logistic, Gompertz, and monomolecular models are collectively known as the Richards family models.
* The log-odds function of probabilities is often used in state estimation algorithms<ref>{{cite journal |last=Thrun |first=Sebastian |title=Learning Occupancy Grid Maps with Forward Sensor Models |journal=Autonomous Robots |language=en |volume=15 |issue=2 |pages=111–127 |doi=10.1023/A:1025584807625 |issn=0929-5593 |year=2003 |s2cid=2279013 |url=https://mediawiki.isr.tecnico.ulisboa.pt/images/5/5b/Thrun03.pdf }}</ref> because of its numerical advantages in the case of small probabilities: instead of multiplying very small floating-point numbers, log-odds values can simply be summed to obtain the (log-odds) joint probability (see the sketch after this list).<ref>{{cite web |url=https://www.cs.cmu.edu/~16831-f12/notes/F12/16831_lecture05_vh.pdf |title=Statistical Techniques in Robotics |last=Styler |first=Alex |date=2012 |page=2 |access-date=2017-01-26 }}</ref><ref>{{cite journal |last1=Dickmann |first1=J. |last2=Appenrodt |first2=N. |last3=Klappstein |first3=J. |last4=Bloecher |first4=H. L. |last5=Muntzinger |first5=M. |last6=Sailer |first6=A. |last7=Hahn |first7=M. |last8=Brenk |first8=C. |date=2015-01-01 |title=Making Bertha See Even More: Radar Contribution |journal=IEEE Access |volume=3 |pages=1233–1247 |doi=10.1109/ACCESS.2015.2454533 |doi-access=free |issn=2169-3536 |bibcode=2015IEEEA...3.1233D }}</ref>
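As a minimal sketch of the last point, assuming a uniform prior and independent measurements (the binary-Bayes-filter setting of the cited Thrun paper; the sensor values below are illustrative, not taken from the source):

<syntaxhighlight lang="python">
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(alpha):
    return 1.0 / (1.0 + math.exp(-alpha))

# Binary Bayes filter in log-odds form: with a uniform prior (logit(0.5) = 0),
# each independent measurement simply adds its log-odds to the running belief,
# avoiding products of very small floating-point numbers.
measurements = [0.7, 0.7, 0.6, 0.9]  # hypothetical p(occupied | z_i) values

belief = 0.0                         # log-odds of the uniform prior
for p in measurements:
    belief += logit(p)               # additive update in log-odds space

print(expit(belief))                 # posterior occupancy probability, ~0.99
</syntaxhighlight>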
== Comparison with probit ==

[[File:Logit-probit.svg|right|300px|thumb|Comparison of the logit function with a scaled [[probit]] (i.e. the inverse [[cumulative distribution function|CDF]] of the [[normal distribution]]), comparing <math>\operatorname{logit}(x)</math> vs. <math>\tfrac{\Phi^{-1}(x)}{\,\sqrt{\pi/8\,}\,}</math>, which makes the slopes the same at the {{mvar|y}}-origin.]]

Closely related to the {{math|logit}} function (and [[logit model]]) are the [[probit function]] and [[probit model]]. The {{math|logit}} and {{math|probit}} are both [[quantile function]]s defined on the interval (0, 1) – that is, inverses of the [[cumulative distribution function]] (CDF) of a [[probability distribution]] – and the corresponding CDFs are [[sigmoid function]]s. In fact, the {{math|logit}} is the quantile function of the [[logistic distribution]], while the {{math|probit}} is the quantile function of the [[normal distribution]]. The {{math|probit}} function is denoted <math>\Phi^{-1}(x)</math>, where <math>\Phi(x)</math> is the CDF of the standard normal distribution, as just mentioned:

: <math>\Phi(x) = \frac 1 {\sqrt{2\pi}}\int_{-\infty}^x e^{-y^2/2} \, dy.</math>

As shown in the graph on the right, the {{math|logit}} and {{math|probit}} functions are extremely similar when the {{math|probit}} function is scaled so that its slope at {{math|''y'' {{=}} 0}} matches the slope of the {{math|logit}}: dividing the probit by <math>\sqrt{\pi/8}</math> raises its slope at the midpoint from <math>\sqrt{2\pi}</math> to 4, which is exactly the slope of the logit at <math>p = 1/2</math>. As a result, [[probit model]]s are sometimes used in place of [[logit model]]s because for certain applications (e.g., in [[item response theory]]) the implementation is easier.<ref>{{cite book |first=James H. |last=Albert |chapter=Logit, Probit, and other Response Functions |title=Handbook of Item Response Theory |volume=Two |publisher=Chapman and Hall |year=2016 |isbn=978-1-315-37364-5 |pages=3–22 |doi=10.1201/b19166-1 |url=https://books.google.de/books?id=NWymCwAAQBAJ&pg=PA3 }}</ref>

== See also ==

* [[Sigmoid function]], inverse of the logit function
* [[Discrete choice]] on binary logit, multinomial logit, conditional logit, nested logit, mixed logit, exploded logit, and ordered logit
* [[Limited dependent variable]]
* [[Logit analysis in marketing]]
* [[Multinomial logit]]
* [[Ogee]], curve with similar shape
* [[Perceptron]]
* [[Probit]], another function with the same domain and range as the logit
* [[Ridit scoring]]
* [[Data transformation (statistics)]]
* [[Arcsin]] (transformation)
* [[Rasch model]]

== References ==

{{More footnotes needed|date=November 2010}}
{{reflist}}
{{refbegin}}
* {{cite journal |title=Application of the Logistic Function to Bio-Assay |first=Joseph |last=Berkson |authorlink=Joseph Berkson |journal=[[Journal of the American Statistical Association]] |volume=39 |issue=227 (September) |year=1944 |pages=357–365 |doi=10.2307/2280041 |jstor=2280041}}
* {{cite journal |last=Barnard |first=George Alfred |authorlink=George Alfred Barnard |year=1949 |title=Statistical Inference |journal=Journal of the Royal Statistical Society |series=B |volume=11 |number=2 |pages=115–139 |doi=10.1111/j.2517-6161.1949.tb00028.x |jstor=2984075 |doi-access=free }}
{{refend}}

== External links ==

* [https://bayesium.com/which-link-function-logit-probit-or-cloglog/ Which Link Function – Logit, Probit, or Cloglog? 12.04.2023]

== Further reading ==

* {{cite book |last=Ashton |first=Winifred D. |title=The Logit Transformation: with special reference to its uses in Bioassay |year=1972 |publisher=Charles Griffin |isbn=978-0-85264-212-2 |series=Griffin's Statistical Monographs & Courses |volume=32 |doi=10.2307/2345009 }}

[[Category:Logarithms]]
[[Category:Special functions]]