== Properties == [[File:Probabilities of log normal.png|thumb|right|a. <math>y</math> is a log-normal variable with {{nowrap|<math>\mu = 1</math>,}} {{nowrap|<math>\sigma = 0.5</math>.}} <math>p(\sin y>0)</math> is computed by transforming to the normal variable <math>x = \ln y</math>, then integrating its density over the domain defined by <math>\sin e^x>0</math> (blue regions), using the numerical method of ray-tracing.<ref name="Das" /> b & c. The pdf and cdf of the function <math> \sin y</math> of the log-normal variable can also be computed in this way.]] ===Probability in different domains=== The probability content of a log-normal distribution in any arbitrary domain can be computed to desired precision by first transforming the variable to normal, then numerically integrating using the ray-trace method.<ref name="Das">{{cite journal | last = Das | first = Abhranil | arxiv = 2012.14331 | title = A method to integrate and classify normal distributions | journal = Journal of Vision | date = 2021 | volume = 21 | issue = 10 | page = 1 | doi = 10.1167/jov.21.10.1 | pmid = 34468706 | pmc = 8419883 }}</ref> ([https://www.mathworks.com/matlabcentral/fileexchange/84973-integrate-and-classify-normal-distributions Matlab code]) ===Probabilities of functions of a log-normal variable=== Since the probability content of a log-normal distribution can be computed in any domain, the cdf (and consequently the pdf and inverse cdf) of any function of a log-normal variable can also be computed.<ref name="Das"/> ([https://www.mathworks.com/matlabcentral/fileexchange/84973-integrate-and-classify-normal-distributions Matlab code]) ===Geometric or multiplicative moments=== The [[geometric mean|geometric or multiplicative mean]] of the log-normal distribution is <math>\operatorname{GM}[X] = e^\mu = \mu^*</math>. It equals the median.
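As a quick numerical check of the identity <math>\operatorname{GM}[X] = e^\mu = \operatorname{Med}[X]</math>, here is a Python/NumPy sketch using the parameters from the figure caption (the sampling setup itself is illustrative, not from the text):

```python
import numpy as np

mu, sigma = 1.0, 0.5  # parameters as in the figure caption
rng = np.random.default_rng(1)
x = rng.lognormal(mu, sigma, 1_000_000)

gm = np.exp(np.mean(np.log(x)))      # geometric mean of the sample
print(gm, np.median(x), np.exp(mu))  # all three agree up to sampling error
```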
The [[geometric standard deviation|geometric or multiplicative standard deviation]] is <math>\operatorname{GSD}[X] = e^{\sigma} = \sigma^*</math>.<ref name="ReferenceA">{{cite journal | last1 = Kirkwood | first1 = Thomas BL | title = Geometric means and measures of dispersion | journal = Biometrics | date = Dec 1979 | volume = 35 | issue = 4 | pages = 908–909 | jstor = 2530139 }}</ref><ref>{{cite journal | last1 = Limpert | first1 = E | last2 = Stahel | first2 = W | last3 = Abbt | first3 = M | title = Lognormal distributions across the sciences: keys and clues | journal = BioScience | year = 2001 | volume = 51 | issue = 5 | pages = 341–352 | doi = 10.1641/0006-3568(2001)051[0341:LNDATS]2.0.CO;2 | doi-access = free }}</ref> By analogy with the arithmetic statistics, a geometric variance, <math>\operatorname{GVar}[X] = e^{\sigma^2}</math>, and a [[Coefficient of variation#Log-normal data|geometric coefficient of variation]],<ref name="ReferenceA" /> <math>\operatorname{GCV}[X] = e^{\sigma} - 1</math>, have been proposed. The latter was intended to be ''analogous'' to the coefficient of variation, for describing multiplicative variation in log-normal data, but this definition of GCV has no theoretical basis as an estimate of <math>\operatorname{CV}</math> itself (see also [[Coefficient of variation]]). Note that the geometric mean is smaller than the arithmetic mean. This is due to the [[AM–GM inequality]] and is a consequence of the logarithm being a [[concave function]]. In fact,<ref name="Acoustic Stimuli Revisited 2016">{{cite journal | last1 = Heil P | first1 = Friedrich B | title = Onset-Duration Matching of Acoustic Stimuli Revisited: Conventional Arithmetic vs.
Proposed Geometric Measures of Accuracy and Precision | journal = Frontiers in Psychology | volume = 7 | page = 2013 | doi = 10.3389/fpsyg.2016.02013 | pmid = 28111557 | pmc = 5216879 | year = 2017 | doi-access = free}}</ref> <math display="block">\operatorname{E}[X] = e^{\mu + \frac12 \sigma^2} = e^{\mu} \cdot \sqrt{e^{\sigma^2}} = \operatorname{GM}[X] \cdot \sqrt{\operatorname{GVar}[X]}.</math> In finance, the term <math>e^{-\sigma^2/2}</math> is sometimes interpreted as a [[convexity correction]]. From the point of view of [[stochastic calculus]], this is the same correction term as in [[Itō's lemma#Geometric Brownian motion|Itō's lemma for geometric Brownian motion]]. ===Arithmetic moments=== For any real or complex number {{mvar|n}}, the {{mvar|n}}-th [[moment (mathematics)|moment]] of a log-normally distributed variable {{mvar|X}} is given by<ref name="JKB"/> <math display="block">\operatorname{E}[X^n] = e^{n\mu + \frac{1}{2}n^2\sigma^2}.</math> Specifically, the arithmetic mean, expected square, arithmetic variance, and arithmetic standard deviation of a log-normally distributed variable {{mvar|X}} are respectively given by:<ref name=":1" /> <math display="block">\begin{align} \operatorname{E}[X] & = e^{\mu + \tfrac{1}{2}\sigma^2}, \\[4pt] \operatorname{E}[X^2] & = e^{2\mu + 2\sigma^2}, \\[4pt] \operatorname{Var}[X] & = \operatorname{E}[X^2] - \operatorname{E}[X]^2 = {\left(\operatorname{E}[X]\right)}^2 \left(e^{\sigma^2} - 1\right) \\[2pt] &= e^{2\mu + \sigma^2} \left(e^{\sigma^2} - 1\right), \\[4pt] \operatorname{SD}[X] & = \sqrt{\operatorname{Var}[X]} = \operatorname{E}[X] \sqrt{e^{\sigma^2} - 1} \\[2pt] &= e^{\mu + \tfrac{1}{2}\sigma^2} \sqrt{e^{\sigma^2} - 1}, \end{align}</math> The arithmetic [[coefficient of variation]] <math>\operatorname{CV}[X]</math> is the ratio <math>\tfrac{\operatorname{SD}[X]}{\operatorname{E}[X]}</math>. 
For a log-normal distribution it is equal to<ref name=":2" /> <math display="block">\operatorname{CV}[X] = \sqrt{e^{\sigma^2} - 1}.</math> This estimate is sometimes referred to as the "geometric CV" (GCV),<ref>Sawant, S.; Mohan, N. (2011) [http://pharmasug.org/proceedings/2011/PO/PharmaSUG-2011-PO08.pdf "FAQ: Issues with Efficacy Analysis of Clinical Trial Data Using SAS"] {{webarchive | url = https://web.archive.org/web/20110824094357/http://pharmasug.org/proceedings/2011/PO/PharmaSUG-2011-PO08.pdf | date = 24 August 2011 }}, ''PharmaSUG2011'', Paper PO08</ref><ref>{{cite journal | last1 = Schiff | first1 = MH | display-authors = etal | year = 2014 | title = Head-to-head, randomised, crossover study of oral versus subcutaneous methotrexate in patients with rheumatoid arthritis: drug-exposure limitations of oral methotrexate at doses >=15 mg may be overcome with subcutaneous administration | journal = Ann Rheum Dis | volume = 73 | issue = 8 | pages = 1–3 | doi = 10.1136/annrheumdis-2014-205228 | pmid = 24728329 | pmc = 4112421}}</ref> due to its use of the geometric variance. Unlike the arithmetic standard deviation, the arithmetic coefficient of variation is independent of the arithmetic mean. The parameters {{math|''μ''}} and {{math|''σ''}} can be obtained if the arithmetic mean and the arithmetic variance are known: <math display="block">\begin{align} \mu &= \ln \frac{\operatorname{E}[X]^2}{\sqrt{\operatorname{E}[X^2]}} = \ln \frac{\operatorname{E}[X]^2}{\sqrt{\operatorname{Var}[X] + \operatorname{E}[X]^2}}, \\[1ex] \sigma^2 &= \ln \frac{\operatorname{E}[X^2]}{\operatorname{E}[X]^2} = \ln \left(1 + \frac{\operatorname{Var}[X]}{\operatorname{E}[X]^2}\right). \end{align}</math> A probability distribution is not uniquely determined by the moments {{math|1=E[''X''<sup>''n''</sup>] = e<sup>''nμ'' + {{sfrac|1|2}}''n''<sup>2</sup>''σ''<sup>2</sup></sup>}} for {{math|''n'' ≥ 1}}.
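The moment formulas and their inversion can be verified with a short round-trip computation (a Python/NumPy sketch; the parameter values are illustrative):

```python
import numpy as np

mu, sigma2 = 1.2, 0.49  # illustrative log-scale parameters

# Forward: arithmetic mean and variance from the moment formulas.
mean = np.exp(mu + sigma2 / 2)
var = np.exp(2 * mu + sigma2) * np.expm1(sigma2)  # expm1(v) = e^v - 1

# Inverse: recover (mu, sigma^2) from the arithmetic mean and variance.
sigma2_rec = np.log1p(var / mean**2)   # log1p(v) = ln(1 + v)
mu_rec = np.log(mean) - sigma2_rec / 2
print(mu_rec, sigma2_rec)  # ~ (1.2, 0.49)
```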
That is, there exist other distributions with the same set of moments.<ref name="JKB"/> In fact, there is a whole family of distributions with the same moments as the log-normal distribution.{{Citation needed|date=March 2012}} === Mode, median, quantiles === [[File:Comparison mean median mode.svg|thumb|upright=1.25|Comparison of [[mean]], [[median]] and [[mode (statistics)|mode]] of two log-normal distributions with different [[skewness]].]] The [[mode (statistics)|mode]] is the point of global maximum of the probability density function. In particular, by solving the equation <math>(\ln f)'=0</math>, we get that: <math display="block">\operatorname{Mode}[X] = e^{\mu - \sigma^2}.</math> Since the [[logarithm transformation|log-transformed]] variable <math>Y = \ln X</math> has a normal distribution, and quantiles are preserved under monotonic transformations, the quantiles of <math>X</math> are <math display="block">q_X(\alpha) = \exp\left[\mu + \sigma q_\Phi(\alpha)\right] = \mu^* (\sigma^*)^{q_\Phi(\alpha)},</math> where <math>q_\Phi(\alpha)</math> is the quantile of the standard normal distribution. Specifically, the median of a log-normal distribution is equal to its multiplicative mean,<ref>{{cite book | first1 = Leslie E. | last1 = Daly | first2 = Geoffrey Joseph | last2 = Bourke | year = 2000 | title = Interpretation and Uses of Medical Statistics | journal = Journal of Epidemiology and Community Health | volume = 46 | issue = 3 | edition = 5th | place = Oxford, UK | publisher = Wiley-Blackwell | isbn = 978-0-632-04763-5 | page = 89 | doi = 10.1002/9780470696750 | pmc = 1059583 <!-- | journal = Journal of Epidemiology and Community Health | volume = 46 | issue = 3 --- not found in WorldCat.org --> | postscript = ; }} print edition. 
Online eBook {{ISBN|9780470696750}}</ref> <math display="block">\operatorname{Med}[X] = e^\mu = \mu^* ~.</math> === Partial expectation === The partial expectation of a random variable <math>X</math> with respect to a threshold <math>k</math> is defined as <math display="block"> g(k) = \int_k^\infty x \, f_X(x)\, dx . </math> Alternatively, by using the definition of [[conditional expectation]], it can be written as <math>g(k) = \operatorname{E}[X\mid X>k] \Pr(X>k)</math>. For a log-normal random variable, the partial expectation is given by: <math display="block">\begin{align} g(k) &= \int_k^\infty x f_X(x)\, dx \\[1ex] &= e^{\mu+\tfrac{1}{2} \sigma^2}\, \Phi{\left(\frac{\mu-\ln k}{\sigma} + \sigma\right)} \end{align} </math> where <math>\Phi</math> is the [[normal cumulative distribution function]]. The derivation of the formula is provided in the [[Talk:Log-normal distribution|Talk page]]. The partial expectation formula has applications in [[insurance]] and [[economics]]; for example, it is used in solving the partial differential equation leading to the [[Black–Scholes formula]].
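The closed form for the partial expectation, <math>g(k) = e^{\mu+\sigma^2/2}\,\Phi\!\left(\tfrac{\mu-\ln k}{\sigma} + \sigma\right)</math>, can be checked against direct numerical integration of <math>x f_X(x)</math> (a Python sketch using NumPy and SciPy; the values of μ, σ and k are illustrative):

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

mu, sigma, k = 0.0, 1.0, 2.0  # illustrative values

# Closed form: g(k) = exp(mu + sigma^2/2) * Phi((mu - ln k)/sigma + sigma)
g_closed = np.exp(mu + sigma**2 / 2) * norm.cdf((mu - np.log(k)) / sigma + sigma)

# Direct check: integrate x * f_X(x) over (k, infinity).
def pdf(x):
    return np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi))

g_numeric, _ = quad(lambda x: x * pdf(x), k, np.inf)
print(g_closed, g_numeric)  # the two values coincide
```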
=== Conditional expectation === The conditional expectation of a log-normal random variable <math>X</math>, given a threshold <math>k</math>, is its partial expectation divided by the probability of being in the corresponding range: <math display="block">\begin{align} \operatorname{E}[X\mid X<k] & = e^{\mu +\frac{\sigma^2}{2}} \cdot \frac{\Phi {\left[\frac{\ln k - \mu}{\sigma} - \sigma \right]}}{\Phi {\left[\frac{\ln k-\mu}{\sigma} \right]}} \\[8pt] \operatorname{E}[X \mid X \geq k] &= e^{\mu +\frac{\sigma^2}{2}} \cdot \frac{\Phi {\left[\frac{\mu - \ln k}{\sigma} + \sigma \right]}}{1 - \Phi {\left[\frac{\ln k -\mu}{\sigma}\right]}} \\[8pt] \operatorname{E}[X\mid X\in [k_1,k_2]] &= e^{\mu +\frac{\sigma^2}{2}} \cdot \frac{ \Phi{\left[\frac{\ln k_2 - \mu}{\sigma} - \sigma \right]} - \Phi{\left[\frac{\ln k_1 - \mu}{\sigma} - \sigma\right]} }{ \Phi \left[\frac{\ln k_2 - \mu}{\sigma}\right]-\Phi \left[\frac{\ln k_1 - \mu}{\sigma}\right] } \end{align}</math> === Alternative parameterizations === In addition to the characterization by <math>\mu, \sigma</math> or <math>\mu^*, \sigma^*</math>, there are multiple ways in which the log-normal distribution can be parameterized.
[[ProbOnto]], the knowledge base and ontology of [[probability distribution]]s<ref>{{cite web | url = http://www.probonto.org | title = ProbOnto |access-date = 1 July 2017}}</ref><ref>{{cite journal | pmid = 27153608 | doi = 10.1093/bioinformatics/btw170 | pmc = 5013898 | volume = 32 | issue = 17 | pages = 2719–2721 | title = ProbOnto: ontology and knowledge base of probability distributions | year = 2016 | journal = Bioinformatics | last1 = Swat | first1 = MJ | last2 = Grenon | first2 = P | last3 = Wimalaratne | first3 = S}}</ref> lists seven such forms: [[File:LogNormal17.jpg|thumb|400px|Overview of parameterizations of the log-normal distributions.]] * {{math|LogNormal1(''μ'',''σ'')}} with [[mean]], {{math|''μ''}}, and [[standard deviation]], {{math|''σ''}}, both on the log-scale <ref name="Forbes">Forbes et al. Probability Distributions (2011), John Wiley & Sons, Inc.</ref> <math display="block">P(x;\boldsymbol\mu,\boldsymbol\sigma) = \frac{1}{x \sigma \sqrt{2 \pi}} \exp\left[-\frac{(\ln x - \mu)^2}{2 \sigma^2}\right]</math> * {{math|LogNormal2(''μ'',''υ'')}} with mean, {{math|''μ''}}, and variance, {{math|''υ''}}, both on the log-scale <math display="block">P(x;\boldsymbol\mu,\boldsymbol {v}) = \frac{1}{x \sqrt{v} \sqrt{2 \pi}} \exp\left[-\frac{(\ln x - \mu)^2}{2 v}\right]</math> * {{math|LogNormal3(''m'',''σ'')}} with [[median]], {{math|''m''}}, on the natural scale and standard deviation, {{math|''σ''}}, on the log-scale<ref name="Forbes" /> <math display="block">P(x;\boldsymbol m,\boldsymbol \sigma) =\frac{1}{x \sigma \sqrt{2 \pi}} \exp\left[-\frac{\ln^2(x/m)}{2 \sigma^2}\right]</math> * {{math|LogNormal4(''m'',cv)}} with median, {{math|''m''}}, and [[coefficient of variation]], {{math|cv}}, both on the natural scale <math display="block">P(x;\boldsymbol m,\boldsymbol {cv}) = \frac{1}{x \sqrt{\ln(cv^2+1)} \sqrt{2 \pi}} \exp\left[-\frac{\ln^2(x/m)}{2\ln(cv^2+1)}\right]</math> * {{math|LogNormal5(''μ'',''τ'')}} with mean, {{math|''μ''}}, and [[Precision (statistics)|precision]], {{math|''τ''}}, both on the log-scale<ref>Lunn, D. (2012). The BUGS book: a practical introduction to Bayesian analysis. Texts in statistical science. CRC Press.</ref> <math display="block">P(x;\boldsymbol\mu,\boldsymbol \tau) = \sqrt{\frac{\tau}{2 \pi}} \frac{1}{x} \exp\left[-\frac{\tau}{2}(\ln x-\mu)^2\right]</math> * {{math|LogNormal6(''m'',''σ<sub>g</sub>'')}} with median, {{math|''m''}}, and [[geometric standard deviation]], {{math|''σ<sub>g</sub>''}}, both on the natural scale<ref>{{cite journal | last1 = Limpert | first1 = E. | last2 = Stahel | first2 = W. A. | last3 = Abbt | first3 = M. | year = 2001 | title = Log-normal distributions across the sciences: Keys and clues | journal = BioScience | volume = 51 | issue = 5 | pages = 341–352 | doi = 10.1641/0006-3568(2001)051[0341:LNDATS]2.0.CO;2 | doi-access = free }}</ref> <math display="block"> P(x;\boldsymbol m,\boldsymbol {\sigma_g}) = \frac{1}{x \sqrt{2 \pi} \, \ln\sigma_g} \exp\left[-\frac{\ln^2(x/m)}{2 \ln^2(\sigma_g)}\right]</math> * {{math|LogNormal7(''μ<sub>N</sub>'',''σ<sub>N</sub>'')}} with mean, {{math|''μ<sub>N</sub>''}}, and standard deviation, {{math|''σ<sub>N</sub>''}}, both on the natural scale<ref>{{cite journal | last1 = Nyberg | first1 = J.
| display-authors = etal | year = 2012 | title = PopED – An extended, parallelized, population optimal design tool | journal = Comput Methods Programs Biomed | volume = 108 | issue = 2 | pages = 789–805 | doi = 10.1016/j.cmpb.2012.05.005 | pmid = 22640817 }}</ref> <math display="block">P(x;\boldsymbol {\mu_N},\boldsymbol {\sigma_N}) = \frac{1}{x \sqrt{2 \pi \ln\left(1+\sigma_N^2/\mu_N^2\right)}} \exp\left[-\frac{\left( \ln x - \ln\frac{\mu_N}{\sqrt{1 + \sigma_N^2/\mu_N^2}}\right)^2}{2 \ln\left(1 + \frac{\sigma_N^2}{\mu_N^2}\right)}\right]</math> ==== Examples for re-parameterization ==== Consider the situation when one would like to run a model using two different optimal design tools, for example PFIM<ref>{{cite journal | last1 = Retout | first1 = S | last2 = Duffull | first2 = S | last3 = Mentré | first3 = F | year = 2001 | title = Development and implementation of the population Fisher information matrix for the evaluation of population pharmacokinetic designs | journal = Comp Meth Pro Biomed | volume = 65 | issue = 2 | pages = 141–151 | doi = 10.1016/S0169-2607(00)00117-6 | pmid = 11275334 }}</ref> and PopED.<ref>The PopED Development Team (2016). PopED Manual, Release version 2.13. Technical report, Uppsala University.</ref> The former supports the LN2 parameterization and the latter the LN7 parameterization, so re-parameterization is required; otherwise the two tools would produce different results. For the transition <math>\operatorname{LN2}(\mu, v) \to \operatorname{LN7}(\mu_N, \sigma_N)</math> the following formulas hold: <math display="inline">\mu_N = \exp(\mu+v/2) </math> and <math display="inline">\sigma_N = \exp(\mu+v/2)\sqrt{\exp(v)-1}</math>. For the transition <math>\operatorname{LN7}(\mu_N, \sigma_N) \to \operatorname{LN2}(\mu, v)</math> the following formulas hold: <math display="inline">\mu = \ln \mu_N - \frac{1}{2} v </math> and <math display="inline"> v = \ln(1+\sigma_N^2/\mu_N^2)</math>.
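The LN2 ↔ LN7 transition formulas can be wrapped in two small helper functions and verified by a round trip (a Python/NumPy sketch; the function names and parameter values are illustrative, not from either tool):

```python
import numpy as np

def ln2_to_ln7(mu, v):
    """LN2(mu, v) -> LN7(mu_N, sigma_N): log-scale mean/variance
    to natural-scale mean/standard deviation."""
    mu_N = np.exp(mu + v / 2)
    sigma_N = mu_N * np.sqrt(np.expm1(v))  # expm1(v) = e^v - 1
    return mu_N, sigma_N

def ln7_to_ln2(mu_N, sigma_N):
    """LN7 -> LN2, the inverse transformation."""
    v = np.log1p((sigma_N / mu_N) ** 2)
    mu = np.log(mu_N) - v / 2
    return mu, v

# A round trip should return the original parameters.
mu, v = 0.3, 0.25
print(ln7_to_ln2(*ln2_to_ln7(mu, v)))  # ~ (0.3, 0.25)
```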
All remaining re-parameterisation formulas can be found in the specification document on the project website.<ref name="probontoWebsite">ProbOnto website, URL: http://probonto.org</ref> === Multiplication, reciprocal, power === * Multiplication by a constant: If <math>X \sim \operatorname{Lognormal}(\mu, \sigma^2)</math> then <math>a X \sim \operatorname{Lognormal}( \mu + \ln a, \sigma^2)</math> for <math> a > 0. </math> * Reciprocal: If <math>X \sim \operatorname{Lognormal}(\mu, \sigma^2)</math> then <math>\tfrac{1}{X} \sim \operatorname{Lognormal}(-\mu, \sigma^2).</math> * Power: If <math>X \sim \operatorname{Lognormal}(\mu, \sigma^2)</math> then <math>X^a \sim \operatorname{Lognormal}(a\mu, a^2 \sigma^2)</math> for <math>a \neq 0.</math> === Multiplication and division of independent, log-normal random variables === If two [[statistical independence|independent]], log-normal variables <math>X_1</math> and <math>X_2</math> are multiplied [divided], the product [ratio] is again log-normal, with parameters <math>\mu = \mu_1 + \mu_2</math> {{nowrap|[<math>\mu = \mu_1-\mu_2</math>]}} and {{nowrap|<math>\sigma</math>,}} where {{nowrap|<math>\sigma^2 = \sigma_1^2 + \sigma_2^2</math>.}} More generally, if <math>X_j \sim \operatorname{Lognormal} (\mu_j, \sigma_j^2)</math> are <math>n</math> independent, log-normally distributed variables, then <math display="inline">Y = \prod_{j=1}^n X_j \sim \operatorname{Lognormal} \Big( \sum_{j=1}^n\mu_j, \sum_{j=1}^n \sigma_j^2 \Big).</math> === <span class="anchor" id="Multiplicative Central Limit Theorem"></span>Multiplicative central limit theorem === {{See also|Gibrat's law}} The geometric or multiplicative mean of <math>n</math> independent, identically distributed, positive random variables <math>X_i</math> is, for <math>n \to \infty</math>, approximately log-normally distributed with parameters <math>\mu = \operatorname{E}[\ln X_i]</math> and <math>\sigma^2 = \operatorname{var}[\ln X_i ]/n</math>, assuming <math>\sigma^2</math>
is finite. In fact, the random variables do not have to be identically distributed. It is enough for the distributions of <math>\ln X_i</math> to all have finite variance and satisfy the other conditions of any of the many variants of the [[central limit theorem]]. This is commonly known as [[Gibrat's law]]. === Heavy-tailedness of the log-normal === Whether a log-normal distribution can be considered a genuinely heavy-tailed distribution is still debated. The main reason is that its variance is always finite, unlike that of certain Pareto distributions, for instance. However, a recent study has shown that it is possible to construct a log-normal distribution with infinite variance using Robinson's non-standard analysis.<ref>{{Cite journal | last1 = Cococcioni | first1 = Marco | last2 = Fiorini | first2 = Francesco | last3 = Pagano | first3 = Michele | date = 2023-04-06 | title = Modelling Heavy Tailed Phenomena Using a LogNormal Distribution Having a Numerically Verifiable Infinite Variance | journal = Mathematics | language = en | volume = 11 | issue = 7 | page = 1758 | doi = 10.3390/math11071758 | doi-access = free | issn = 2227-7390 | hdl = 11568/1216554 | hdl-access = free }}</ref> === Other === A set of data that arises from the log-normal distribution has a symmetric [[Lorenz curve]] (see also [[Lorenz asymmetry coefficient]]).<ref name="EcolgyArticle">{{cite journal | doi = 10.1890/0012-9658(2000)081[1139:DIIPSO]2.0.CO;2 | last1 = Damgaard | first1 = Christian | first2 = Jacob | last2 = Weiner | title = Describing inequality in plant size or fecundity | journal = Ecology | year = 2000 | volume = 81 | issue = 4 | pages = 1139–1142 }}</ref> The harmonic <math>H</math>, geometric <math>G</math> and arithmetic <math>A</math> means of this distribution are related;<ref name="Rossman1990">{{cite journal | last = Rossman | first = Lewis A | date = July 1990 | title = Design stream flows based on harmonic means | journal = Journal of Hydraulic Engineering | volume =
116 | issue = 7 | pages = 946–950 | doi = 10.1061/(ASCE)0733-9429(1990)116:7(946)}}</ref> the relation is given by <math display="block">H = \frac{G^2} A.</math> Log-normal distributions are [[infinite divisibility (probability)|infinitely divisible]],<ref name="OlofThorin1978LNInfDivi"/> but they are not [[stable distribution]]s, which can easily be sampled from.<ref name="Gao"/>
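The relation <math>H = G^2/A</math> follows directly from the closed-form means of the distribution; a short numerical confirmation (a Python/NumPy sketch with illustrative parameter values):

```python
import numpy as np

mu, sigma = 0.7, 0.4  # illustrative values

# Theoretical means of a Lognormal(mu, sigma^2) variable:
A = np.exp(mu + sigma**2 / 2)   # arithmetic mean
G = np.exp(mu)                  # geometric mean (= median)
H = np.exp(mu - sigma**2 / 2)   # harmonic mean, since 1/X ~ Lognormal(-mu, sigma^2)

print(H, G**2 / A)  # the two sides of H = G^2 / A coincide
```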