== Properties == [[File:Probabilities of log normal.png|thumb|right|a. <math>y</math> is a log-normal variable with {{nowrap|<math>\mu = 1</math>,}} {{nowrap|<math>\sigma = 0.5</math>.}} <math>p(\sin y>0)</math> is computed by transforming to the normal variable <math>x = \ln y</math>, then integrating its density over the domain defined by <math>\sin e^x>0</math> (blue regions), using the numerical method of ray-tracing.<ref name="Das" /> b & c. The pdf and cdf of the function <math> \sin y</math> of the log-normal variable can also be computed in this way.]] ===Probability in different domains=== The probability content of a log-normal distribution in any arbitrary domain can be computed to desired precision by first transforming the variable to normal, then numerically integrating using the ray-trace method.<ref name="Das">{{cite journal | last = Das | first = Abhranil | arxiv = 2012.14331 | title = A method to integrate and classify normal distributions | journal = Journal of Vision | date = 2021 | volume = 21 | issue = 10 | page = 1 | doi = 10.1167/jov.21.10.1 | pmid = 34468706 | pmc = 8419883 }}</ref> ([https://www.mathworks.com/matlabcentral/fileexchange/84973-integrate-and-classify-normal-distributions Matlab code]) ===Probabilities of functions of a log-normal variable=== Since the probability content of a log-normal distribution can be computed in any domain, the cdf (and consequently the pdf and inverse cdf) of any function of a log-normal variable can also be computed.<ref name="Das"/> ([https://www.mathworks.com/matlabcentral/fileexchange/84973-integrate-and-classify-normal-distributions Matlab code]) ===Geometric or multiplicative moments=== The [[geometric mean|geometric or multiplicative mean]] of the log-normal distribution is <math>\operatorname{GM}[X] = e^\mu = \mu^*</math>. It equals the median.
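As a quick numerical check of the identity <math>\operatorname{GM}[X] = e^\mu = \operatorname{Med}[X]</math>, here is a Python/NumPy sketch using the parameters from the figure caption (the sampling setup itself is illustrative, not from the text):

```python
import numpy as np

mu, sigma = 1.0, 0.5  # parameters as in the figure caption
rng = np.random.default_rng(1)
x = rng.lognormal(mu, sigma, 1_000_000)

gm = np.exp(np.mean(np.log(x)))      # geometric mean of the sample
print(gm, np.median(x), np.exp(mu))  # all three agree up to sampling error
```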
The [[geometric standard deviation|geometric or multiplicative standard deviation]] is <math>\operatorname{GSD}[X] = e^{\sigma} = \sigma^*</math>.<ref name="ReferenceA">{{cite journal | last1 = Kirkwood | first1 = Thomas BL | title = Geometric means and measures of dispersion | journal = Biometrics | date = Dec 1979 | volume = 35 | issue = 4 | pages = 908–909 | jstor = 2530139 }}</ref><ref>{{cite journal | last1 = Limpert | first1 = E | last2 = Stahel | first2 = W | last3 = Abbt | first3 = M | title = Lognormal distributions across the sciences: keys and clues | journal = BioScience | year = 2001 | volume = 51 | issue = 5 | pages = 341–352 | doi = 10.1641/0006-3568(2001)051[0341:LNDATS]2.0.CO;2 | doi-access = free }}</ref> By analogy with the arithmetic statistics, a geometric variance, <math>\operatorname{GVar}[X] = e^{\sigma^2}</math>, and a [[Coefficient of variation#Log-normal data|geometric coefficient of variation]],<ref name="ReferenceA" /> <math>\operatorname{GCV}[X] = e^{\sigma} - 1</math>, have been proposed. The latter was intended to be ''analogous'' to the coefficient of variation, for describing multiplicative variation in log-normal data, but this definition of GCV has no theoretical basis as an estimate of <math>\operatorname{CV}</math> itself (see also [[Coefficient of variation]]). Note that the geometric mean is smaller than the arithmetic mean. This is due to the [[AM–GM inequality]] and is a consequence of the logarithm being a [[concave function]]. In fact,<ref name="Acoustic Stimuli Revisited 2016">{{cite journal | last1 = Heil P | first1 = Friedrich B | title = Onset-Duration Matching of Acoustic Stimuli Revisited: Conventional Arithmetic vs.
Proposed Geometric Measures of Accuracy and Precision | journal = Frontiers in Psychology | volume = 7 | page = 2013 | doi = 10.3389/fpsyg.2016.02013 | pmid = 28111557 | pmc = 5216879 | year = 2017 | doi-access = free}}</ref> <math display="block">\operatorname{E}[X] = e^{\mu + \frac12 \sigma^2} = e^{\mu} \cdot \sqrt{e^{\sigma^2}} = \operatorname{GM}[X] \cdot \sqrt{\operatorname{GVar}[X]}.</math> In finance, the term <math>e^{-\sigma^2/2}</math> is sometimes interpreted as a [[convexity correction]]. From the point of view of [[stochastic calculus]], this is the same correction term as in [[Itō's lemma#Geometric Brownian motion|Itō's lemma for geometric Brownian motion]]. ===Arithmetic moments=== For any real or complex number {{mvar|n}}, the {{mvar|n}}-th [[moment (mathematics)|moment]] of a log-normally distributed variable {{mvar|X}} is given by<ref name="JKB"/> <math display="block">\operatorname{E}[X^n] = e^{n\mu + \frac{1}{2}n^2\sigma^2}.</math> Specifically, the arithmetic mean, expected square, arithmetic variance, and arithmetic standard deviation of a log-normally distributed variable {{mvar|X}} are respectively given by:<ref name=":1" /> <math display="block">\begin{align} \operatorname{E}[X] & = e^{\mu + \tfrac{1}{2}\sigma^2}, \\[4pt] \operatorname{E}[X^2] & = e^{2\mu + 2\sigma^2}, \\[4pt] \operatorname{Var}[X] & = \operatorname{E}[X^2] - \operatorname{E}[X]^2 = {\left(\operatorname{E}[X]\right)}^2 \left(e^{\sigma^2} - 1\right) \\[2pt] &= e^{2\mu + \sigma^2} \left(e^{\sigma^2} - 1\right), \\[4pt] \operatorname{SD}[X] & = \sqrt{\operatorname{Var}[X]} = \operatorname{E}[X] \sqrt{e^{\sigma^2} - 1} \\[2pt] &= e^{\mu + \tfrac{1}{2}\sigma^2} \sqrt{e^{\sigma^2} - 1}, \end{align}</math> The arithmetic [[coefficient of variation]] <math>\operatorname{CV}[X]</math> is the ratio <math>\tfrac{\operatorname{SD}[X]}{\operatorname{E}[X]}</math>. 
For a log-normal distribution it is equal to<ref name=":2" /> <math display="block">\operatorname{CV}[X] = \sqrt{e^{\sigma^2} - 1}.</math> This estimate is sometimes referred to as the "geometric CV" (GCV),<ref>Sawant, S.; Mohan, N. (2011) [http://pharmasug.org/proceedings/2011/PO/PharmaSUG-2011-PO08.pdf "FAQ: Issues with Efficacy Analysis of Clinical Trial Data Using SAS"] {{webarchive | url = https://web.archive.org/web/20110824094357/http://pharmasug.org/proceedings/2011/PO/PharmaSUG-2011-PO08.pdf | date = 24 August 2011 }}, ''PharmaSUG2011'', Paper PO08</ref><ref>{{cite journal | last1 = Schiff | first1 = MH | display-authors = etal | year = 2014 | title = Head-to-head, randomised, crossover study of oral versus subcutaneous methotrexate in patients with rheumatoid arthritis: drug-exposure limitations of oral methotrexate at doses >=15 mg may be overcome with subcutaneous administration | journal = Ann Rheum Dis | volume = 73 | issue = 8 | pages = 1–3 | doi = 10.1136/annrheumdis-2014-205228 | pmid = 24728329 | pmc = 4112421}}</ref> due to its use of the geometric variance. Unlike the arithmetic standard deviation, the arithmetic coefficient of variation is independent of the arithmetic mean. The parameters {{math|''μ''}} and {{math|''σ''}} can be obtained if the arithmetic mean and the arithmetic variance are known: <math display="block">\begin{align} \mu &= \ln \frac{\operatorname{E}[X]^2}{\sqrt{\operatorname{E}[X^2]}} = \ln \frac{\operatorname{E}[X]^2}{\sqrt{\operatorname{Var}[X] + \operatorname{E}[X]^2}}, \\[1ex] \sigma^2 &= \ln \frac{\operatorname{E}[X^2]}{\operatorname{E}[X]^2} = \ln \left(1 + \frac{\operatorname{Var}[X]}{\operatorname{E}[X]^2}\right). \end{align}</math> A probability distribution is not uniquely determined by the moments {{math|1=E[''X''<sup>''n''</sup>] = e<sup>''nμ'' + {{sfrac|1|2}}''n''<sup>2</sup>''σ''<sup>2</sup></sup>}} for {{math|''n'' ≥ 1}}.
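The moment formulas and their inversion can be verified with a short round-trip computation (a Python/NumPy sketch; the parameter values are illustrative):

```python
import numpy as np

mu, sigma2 = 1.2, 0.49  # illustrative log-scale parameters

# Forward: arithmetic mean and variance from the moment formulas.
mean = np.exp(mu + sigma2 / 2)
var = np.exp(2 * mu + sigma2) * np.expm1(sigma2)  # expm1(v) = e^v - 1

# Inverse: recover (mu, sigma^2) from the arithmetic mean and variance.
sigma2_rec = np.log1p(var / mean**2)   # log1p(v) = ln(1 + v)
mu_rec = np.log(mean) - sigma2_rec / 2
print(mu_rec, sigma2_rec)  # ~ (1.2, 0.49)
```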
That is, there exist other distributions with the same set of moments.<ref name="JKB"/> In fact, there is a whole family of distributions with the same moments as the log-normal distribution.{{Citation needed|date=March 2012}} === Mode, median, quantiles === [[File:Comparison mean median mode.svg|thumb|upright=1.25|Comparison of [[mean]], [[median]] and [[mode (statistics)|mode]] of two log-normal distributions with different [[skewness]].]] The [[mode (statistics)|mode]] is the point of global maximum of the probability density function. In particular, by solving the equation <math>(\ln f)'=0</math>, we get that: <math display="block">\operatorname{Mode}[X] = e^{\mu - \sigma^2}.</math> Since the [[logarithm transformation|log-transformed]] variable <math>Y = \ln X</math> has a normal distribution, and quantiles are preserved under monotonic transformations, the quantiles of <math>X</math> are <math display="block">q_X(\alpha) = \exp\left[\mu + \sigma q_\Phi(\alpha)\right] = \mu^* (\sigma^*)^{q_\Phi(\alpha)},</math> where <math>q_\Phi(\alpha)</math> is the quantile of the standard normal distribution. Specifically, the median of a log-normal distribution is equal to its multiplicative mean,<ref>{{cite book | first1 = Leslie E. | last1 = Daly | first2 = Geoffrey Joseph | last2 = Bourke | year = 2000 | title = Interpretation and Uses of Medical Statistics | journal = Journal of Epidemiology and Community Health | volume = 46 | issue = 3 | edition = 5th | place = Oxford, UK | publisher = Wiley-Blackwell | isbn = 978-0-632-04763-5 | page = 89 | doi = 10.1002/9780470696750 | pmc = 1059583 <!-- | journal = Journal of Epidemiology and Community Health | volume = 46 | issue = 3 --- not found in WorldCat.org --> | postscript = ; }} print edition. 
Online eBook {{ISBN|9780470696750}}</ref> <math display="block">\operatorname{Med}[X] = e^\mu = \mu^* ~.</math> === Partial expectation === The partial expectation of a random variable <math>X</math> with respect to a threshold <math>k</math> is defined as <math display="block"> g(k) = \int_k^\infty x \, f_X(x)\, dx . </math> Alternatively, by using the definition of [[conditional expectation]], it can be written as <math>g(k) = \operatorname{E}[X\mid X>k] \Pr(X>k)</math>. For a log-normal random variable, the partial expectation is given by: <math display="block">\begin{align} g(k) &= \int_k^\infty x f_X(x)\, dx \\[1ex] &= e^{\mu+\tfrac{1}{2} \sigma^2}\, \Phi{\left(\frac{\mu-\ln k}{\sigma} + \sigma\right)} \end{align} </math> where <math>\Phi</math> is the [[normal cumulative distribution function]]. The derivation of the formula is provided in the [[Talk:Log-normal distribution|Talk page]]. The partial expectation formula has applications in [[insurance]] and [[economics]]; for example, it is used in solving the partial differential equation leading to the [[Black–Scholes formula]].
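The closed form for the partial expectation, <math>g(k) = e^{\mu+\sigma^2/2}\,\Phi\!\left(\tfrac{\mu-\ln k}{\sigma} + \sigma\right)</math>, can be checked against direct numerical integration of <math>x f_X(x)</math> (a Python sketch using NumPy and SciPy; the values of μ, σ and k are illustrative):

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

mu, sigma, k = 0.0, 1.0, 2.0  # illustrative values

# Closed form: g(k) = exp(mu + sigma^2/2) * Phi((mu - ln k)/sigma + sigma)
g_closed = np.exp(mu + sigma**2 / 2) * norm.cdf((mu - np.log(k)) / sigma + sigma)

# Direct check: integrate x * f_X(x) over (k, infinity).
def pdf(x):
    return np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi))

g_numeric, _ = quad(lambda x: x * pdf(x), k, np.inf)
print(g_closed, g_numeric)  # the two values coincide
```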
=== Conditional expectation === The conditional expectation of a log-normal random variable <math>X</math>, given a threshold <math>k</math>, is its partial expectation divided by the probability of being in the corresponding range: <math display="block">\begin{align} \operatorname{E}[X\mid X<k] & = e^{\mu +\frac{\sigma^2}{2}} \cdot \frac{\Phi {\left[\frac{\ln k - \mu}{\sigma} - \sigma \right]}}{\Phi {\left[\frac{\ln k-\mu}{\sigma} \right]}} \\[8pt] \operatorname{E}[X \mid X \geq k] &= e^{\mu +\frac{\sigma^2}{2}} \cdot \frac{\Phi {\left[\frac{\mu - \ln k}{\sigma} + \sigma \right]}}{1 - \Phi {\left[\frac{\ln k -\mu}{\sigma}\right]}} \\[8pt] \operatorname{E}[X\mid X\in [k_1,k_2]] &= e^{\mu +\frac{\sigma^2}{2}} \cdot \frac{ \Phi{\left[\frac{\ln k_2 - \mu}{\sigma} - \sigma \right]} - \Phi{\left[\frac{\ln k_1 - \mu}{\sigma} - \sigma\right]} }{ \Phi \left[\frac{\ln k_2 - \mu}{\sigma}\right]-\Phi \left[\frac{\ln k_1 - \mu}{\sigma}\right] } \end{align}</math> === Alternative parameterizations === In addition to the characterization by <math>\mu, \sigma</math> or <math>\mu^*, \sigma^*</math>, there are multiple ways in which the log-normal distribution can be parameterized.
[[ProbOnto]], the knowledge base and ontology of [[probability distribution]]s<ref>{{cite web | url = http://www.probonto.org | title = ProbOnto |access-date = 1 July 2017}}</ref><ref>{{cite journal | pmid = 27153608 | doi = 10.1093/bioinformatics/btw170 | pmc = 5013898 | volume = 32 | issue = 17 | pages = 2719–2721 | title = ProbOnto: ontology and knowledge base of probability distributions | year = 2016 | journal = Bioinformatics | last1 = Swat | first1 = MJ | last2 = Grenon | first2 = P | last3 = Wimalaratne | first3 = S}}</ref> lists seven such forms: [[File:LogNormal17.jpg|thumb|400px|Overview of parameterizations of the log-normal distributions.]] * {{math|LogNormal1(''μ'',''σ'')}} with [[mean]], {{math|''μ''}}, and [[standard deviation]], {{math|''σ''}}, both on the log-scale <ref name="Forbes">Forbes et al. Probability Distributions (2011), John Wiley & Sons, Inc.</ref> <math display="block">P(x;\boldsymbol\mu,\boldsymbol\sigma) = \frac{1}{x \sigma \sqrt{2 \pi}} \exp\left[-\frac{(\ln x - \mu)^2}{2 \sigma^2}\right]</math> * {{math|LogNormal2(''μ'',''υ'')}} with mean, {{math|''μ''}}, and variance, {{math|''υ''}}, both on the log-scale <math display="block">P(x;\boldsymbol\mu,\boldsymbol {v}) = \frac{1}{x \sqrt{v} \sqrt{2 \pi}} \exp\left[-\frac{(\ln x - \mu)^2}{2 v}\right]</math> * {{math|LogNormal3(''m'',''σ'')}} with [[median]], {{math|''m''}}, on the natural scale and standard deviation, {{math|''σ''}}, on the log-scale<ref name="Forbes" /> <math display="block">P(x;\boldsymbol m,\boldsymbol \sigma) =\frac{1}{x \sigma \sqrt{2 \pi}} \exp\left[-\frac{\ln^2(x/m)}{2 \sigma^2}\right]</math> * {{math|LogNormal4(''m'',cv)}} with median, {{math|''m''}}, and [[coefficient of variation]], {{math|cv}}, both on the natural scale <math display="block">P(x;\boldsymbol m,\boldsymbol {cv}) = \frac{1}{x \sqrt{\ln(cv^2+1)} \sqrt{2 \pi}} \exp\left[-\frac{\ln^2(x/m)}{2\ln(cv^2+1)}\right]</math> * {{math|LogNormal5(''μ'',''τ'')}} with mean, {{math|''μ''}}, and [[Precision (statistics)|precision]], {{math|''τ''}}, both on the log-scale<ref>Lunn, D. (2012). The BUGS book: a practical introduction to Bayesian analysis. Texts in statistical science. CRC Press.</ref> <math display="block">P(x;\boldsymbol\mu,\boldsymbol \tau) = \sqrt{\frac{\tau}{2 \pi}} \frac{1}{x} \exp\left[-\frac{\tau}{2}(\ln x-\mu)^2\right]</math> * {{math|LogNormal6(''m'',''σ<sub>g</sub>'')}} with median, {{math|''m''}}, and [[geometric standard deviation]], {{math|''σ<sub>g</sub>''}}, both on the natural scale<ref>{{cite journal | last1 = Limpert | first1 = E. | last2 = Stahel | first2 = W. A. | last3 = Abbt | first3 = M. | year = 2001 | title = Log-normal distributions across the sciences: Keys and clues | journal = BioScience | volume = 51 | issue = 5 | pages = 341–352 | doi = 10.1641/0006-3568(2001)051[0341:LNDATS]2.0.CO;2 | doi-access = free }}</ref> <math display="block"> P(x;\boldsymbol m,\boldsymbol {\sigma_g}) = \frac{1}{x \sqrt{2 \pi} \, \ln\sigma_g} \exp\left[-\frac{\ln^2(x/m)}{2 \ln^2(\sigma_g)}\right]</math> * {{math|LogNormal7(''μ<sub>N</sub>'',''σ<sub>N</sub>'')}} with mean, {{math|''μ<sub>N</sub>''}}, and standard deviation, {{math|''σ<sub>N</sub>''}}, both on the natural scale<ref>{{cite journal | last1 = Nyberg | first1 = J.
| display-authors = etal | year = 2012 | title = PopED – An extended, parallelized, population optimal design tool | journal = Comput Methods Programs Biomed | volume = 108 | issue = 2 | pages = 789–805 | doi = 10.1016/j.cmpb.2012.05.005 | pmid = 22640817 }}</ref> <math display="block">P(x;\boldsymbol {\mu_N},\boldsymbol {\sigma_N}) = \frac{1}{x \sqrt{2 \pi \ln\left(1+\sigma_N^2/\mu_N^2\right)}} \exp\left[-\frac{\left( \ln x - \ln\frac{\mu_N}{\sqrt{1 + \sigma_N^2/\mu_N^2}}\right)^2}{2 \ln\left(1 + \frac{\sigma_N^2}{\mu_N^2}\right)}\right]</math> ==== Examples for re-parameterization ==== Consider the situation when one would like to run a model using two different optimal design tools, for example PFIM<ref>{{cite journal | last1 = Retout | first1 = S | last2 = Duffull | first2 = S | last3 = Mentré | first3 = F | year = 2001 | title = Development and implementation of the population Fisher information matrix for the evaluation of population pharmacokinetic designs | journal = Comp Meth Pro Biomed | volume = 65 | issue = 2 | pages = 141–151 | doi = 10.1016/S0169-2607(00)00117-6 | pmid = 11275334 }}</ref> and PopED.<ref>The PopED Development Team (2016). PopED Manual, Release version 2.13. Technical report, Uppsala University.</ref> The former supports the LN2 parameterization and the latter the LN7 parameterization, so re-parameterization is required; otherwise the two tools would produce different results. For the transition <math>\operatorname{LN2}(\mu, v) \to \operatorname{LN7}(\mu_N, \sigma_N)</math> the following formulas hold: <math display="inline">\mu_N = \exp(\mu+v/2) </math> and <math display="inline">\sigma_N = \exp(\mu+v/2)\sqrt{\exp(v)-1}</math>. For the transition <math>\operatorname{LN7}(\mu_N, \sigma_N) \to \operatorname{LN2}(\mu, v)</math> the following formulas hold: <math display="inline">\mu = \ln \mu_N - \frac{1}{2} v </math> and <math display="inline"> v = \ln(1+\sigma_N^2/\mu_N^2)</math>.
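The LN2 ↔ LN7 transition formulas can be wrapped in two small helper functions and verified by a round trip (a Python/NumPy sketch; the function names and parameter values are illustrative, not from either tool):

```python
import numpy as np

def ln2_to_ln7(mu, v):
    """LN2(mu, v) -> LN7(mu_N, sigma_N): log-scale mean/variance
    to natural-scale mean/standard deviation."""
    mu_N = np.exp(mu + v / 2)
    sigma_N = mu_N * np.sqrt(np.expm1(v))  # expm1(v) = e^v - 1
    return mu_N, sigma_N

def ln7_to_ln2(mu_N, sigma_N):
    """LN7 -> LN2, the inverse transformation."""
    v = np.log1p((sigma_N / mu_N) ** 2)
    mu = np.log(mu_N) - v / 2
    return mu, v

# A round trip should return the original parameters.
mu, v = 0.3, 0.25
print(ln7_to_ln2(*ln2_to_ln7(mu, v)))  # ~ (0.3, 0.25)
```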
All remaining re-parameterisation formulas can be found in the specification document on the project website.<ref name="probontoWebsite">ProbOnto website, URL: http://probonto.org</ref> === Multiplication, reciprocal, power === * Multiplication by a constant: If <math>X \sim \operatorname{Lognormal}(\mu, \sigma^2)</math> then <math>a X \sim \operatorname{Lognormal}( \mu + \ln a, \sigma^2)</math> for <math> a > 0. </math> * Reciprocal: If <math>X \sim \operatorname{Lognormal}(\mu, \sigma^2)</math> then <math>\tfrac{1}{X} \sim \operatorname{Lognormal}(-\mu, \sigma^2).</math> * Power: If <math>X \sim \operatorname{Lognormal}(\mu, \sigma^2)</math> then <math>X^a \sim \operatorname{Lognormal}(a\mu, a^2 \sigma^2)</math> for <math>a \neq 0.</math> === Multiplication and division of independent, log-normal random variables === If two [[statistical independence|independent]], log-normal variables <math>X_1</math> and <math>X_2</math> are multiplied [divided], the product [ratio] is again log-normal, with parameters <math>\mu = \mu_1 + \mu_2</math> {{nowrap|[<math>\mu = \mu_1-\mu_2</math>]}} and {{nowrap|<math>\sigma</math>,}} where {{nowrap|<math>\sigma^2 = \sigma_1^2 + \sigma_2^2</math>.}} More generally, if <math>X_j \sim \operatorname{Lognormal} (\mu_j, \sigma_j^2)</math> are <math>n</math> independent, log-normally distributed variables, then <math display="inline">Y = \prod_{j=1}^n X_j \sim \operatorname{Lognormal} \Big( \sum_{j=1}^n\mu_j, \sum_{j=1}^n \sigma_j^2 \Big).</math> === <span class="anchor" id="Multiplicative Central Limit Theorem"></span>Multiplicative central limit theorem === {{See also|Gibrat's law}} The geometric or multiplicative mean of <math>n</math> independent, identically distributed, positive random variables <math>X_i</math> is, for <math>n \to \infty</math>, approximately log-normally distributed with parameters <math>\mu = \operatorname{E}[\ln X_i]</math> and <math>\sigma^2 = \operatorname{var}[\ln X_i ]/n</math>, assuming <math>\sigma^2</math>
is finite. In fact, the random variables do not have to be identically distributed. It is enough for the distributions of <math>\ln X_i</math> to all have finite variance and satisfy the other conditions of any of the many variants of the [[central limit theorem]]. This is commonly known as [[Gibrat's law]]. === Heavy-tailedness of the log-normal === Whether a log-normal distribution can be considered a genuinely heavy-tailed distribution is still debated. The main reason is that its variance is always finite, unlike that of certain Pareto distributions, for instance. However, a recent study has shown that it is possible to construct a log-normal distribution with infinite variance using Robinson's non-standard analysis.<ref>{{Cite journal | last1 = Cococcioni | first1 = Marco | last2 = Fiorini | first2 = Francesco | last3 = Pagano | first3 = Michele | date = 2023-04-06 | title = Modelling Heavy Tailed Phenomena Using a LogNormal Distribution Having a Numerically Verifiable Infinite Variance | journal = Mathematics | language = en | volume = 11 | issue = 7 | page = 1758 | doi = 10.3390/math11071758 | doi-access = free | issn = 2227-7390 | hdl = 11568/1216554 | hdl-access = free }}</ref> === Other === A set of data that arises from the log-normal distribution has a symmetric [[Lorenz curve]] (see also [[Lorenz asymmetry coefficient]]).<ref name="EcolgyArticle">{{cite journal | doi = 10.1890/0012-9658(2000)081[1139:DIIPSO]2.0.CO;2 | last1 = Damgaard | first1 = Christian | first2 = Jacob | last2 = Weiner | title = Describing inequality in plant size or fecundity | journal = Ecology | year = 2000 | volume = 81 | issue = 4 | pages = 1139–1142 }}</ref> The harmonic <math>H</math>, geometric <math>G</math> and arithmetic <math>A</math> means of this distribution are related;<ref name="Rossman1990">{{cite journal | last = Rossman | first = Lewis A | date = July 1990 | title = Design stream flows based on harmonic means | journal = Journal of Hydraulic Engineering | volume =
116 | issue = 7 | pages = 946–950 | doi = 10.1061/(ASCE)0733-9429(1990)116:7(946)}}</ref> the relation is given by <math display="block">H = \frac{G^2} A.</math> Log-normal distributions are [[infinite divisibility (probability)|infinitely divisible]],<ref name="OlofThorin1978LNInfDivi"/> but they are not [[stable distribution]]s, which can easily be sampled from.<ref name="Gao"/>
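The relation <math>H = G^2/A</math> follows directly from the closed-form means of the distribution; a short numerical confirmation (a Python/NumPy sketch with illustrative parameter values):

```python
import numpy as np

mu, sigma = 0.7, 0.4  # illustrative values

# Theoretical means of a Lognormal(mu, sigma^2) variable:
A = np.exp(mu + sigma**2 / 2)   # arithmetic mean
G = np.exp(mu)                  # geometric mean (= median)
H = np.exp(mu - sigma**2 / 2)   # harmonic mean, since 1/X ~ Lognormal(-mu, sigma^2)

print(H, G**2 / A)  # the two sides of H = G^2 / A coincide
```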