=== Generating values from normal distribution ===
[[File:Planche de Galton.jpg|thumb|250px|right|The [[bean machine]], a device invented by [[Francis Galton]], can be called the first generator of normal random variables. This machine consists of a vertical board with interleaved rows of pins. Small balls are dropped from the top and then bounce randomly left or right as they hit the pins. The balls are collected into bins at the bottom and settle down into a pattern resembling the Gaussian curve.]]
In computer simulations, especially in applications of the [[Monte-Carlo method]], it is often desirable to generate values that are normally distributed. The algorithms listed below all generate standard normal deviates, since a {{math|''N''(''μ'', ''σ''<sup>2</sup>)}} can be generated as {{math|1=''X'' = ''μ'' + ''σZ''}}, where ''Z'' is standard normal. All these algorithms rely on the availability of a [[random number generator]] ''U'' capable of producing [[Uniform distribution (continuous)|uniform]] random variates.
* The most straightforward method is based on the [[probability integral transform]] property: if ''U'' is distributed uniformly on (0,1), then Φ<sup>−1</sup>(''U'') will have the standard normal distribution. The drawback of this method is that it relies on calculation of the [[probit function]] Φ<sup>−1</sup>, which cannot be done analytically. Some approximate methods are described in {{harvtxt|Hart|1968}} and in the [[error function|erf]] article. Wichura gives a fast algorithm for computing this function to 16 decimal places,<ref>{{cite journal|last=Wichura|first=Michael J.|year=1988|title=Algorithm AS241: The Percentage Points of the Normal Distribution|journal=Applied Statistics|volume=37|pages=477–84|doi=10.2307/2347330|jstor=2347330|issue=3}}</ref> which is used by [[R programming language|R]] to compute random variates of the normal distribution.
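As a minimal sketch, the inverse-transform method above can be written in Python using the standard library's <code>statistics.NormalDist.inv_cdf</code> as Φ<sup>−1</sup>; the loop guarding against ''U'' = 0 is an implementation detail, not part of the method as stated:

```python
import random
from statistics import NormalDist

_PHI_INV = NormalDist().inv_cdf  # the probit function, Phi^-1 for N(0, 1)

def inverse_transform_normal(rng=random.random):
    """Draw one standard normal deviate via the probability integral transform."""
    u = rng()
    while u == 0.0:   # inv_cdf requires u strictly inside (0, 1)
        u = rng()
    return _PHI_INV(u)
```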
* [[Irwin–Hall distribution#Approximating a Normal distribution|An easy-to-program approximate approach]] that relies on the [[central limit theorem]] is as follows: generate 12 uniform ''U''(0,1) deviates, add them all up, and subtract 6 – the resulting random variable will have approximately standard normal distribution. In truth, the distribution will be [[Irwin–Hall distribution|Irwin–Hall]], which is a 12-section eleventh-order polynomial approximation to the normal distribution. This random deviate will have a limited range of (−6, 6).<ref>{{harvtxt|Johnson|Kotz|Balakrishnan|1995|loc=Equation (26.48)}}</ref> Note that in a true normal distribution, only 0.00034% of all samples will fall outside ±6σ.
* The [[Box–Muller method]] uses two independent random numbers ''U'' and ''V'' distributed [[uniform distribution (continuous)|uniformly]] on (0,1). Then the two random variables ''X'' and ''Y'' <math display=block> X = \sqrt{- 2 \ln U} \, \cos(2 \pi V) , \qquad Y = \sqrt{- 2 \ln U} \, \sin(2 \pi V) . </math> will both have the standard normal distribution, and will be [[independence (probability theory)|independent]]. This formulation arises because for a [[bivariate normal]] random vector (''X'', ''Y'') the squared norm {{math|''X''<sup>2</sup> + ''Y''<sup>2</sup>}} will have the [[chi-squared distribution]] with two degrees of freedom, which is an easily generated [[exponential random variable]] corresponding to the quantity −2 ln(''U'') in these equations; and the angle is distributed uniformly around the circle, chosen by the random variable ''V''.
* The [[Marsaglia polar method]] is a modification of the Box–Muller method which does not require computation of the sine and cosine functions. In this method, ''U'' and ''V'' are drawn from the uniform (−1,1) distribution, and then {{math|1=''S'' = ''U''<sup>2</sup> + ''V''<sup>2</sup>}} is computed. If ''S'' is greater than or equal to 1, the method starts over; otherwise, the two quantities <math display=block>X = U\sqrt{\frac{-2\ln S}{S}}, \qquad Y = V\sqrt{\frac{-2\ln S}{S}}</math> are returned. Again, ''X'' and ''Y'' are independent, standard normal random variables.
* The Ratio method<ref>{{harvtxt|Kinderman|Monahan|1977}}</ref> is a rejection method. The algorithm proceeds as follows:
** Generate two independent uniform deviates ''U'' and ''V'';
** Compute ''X'' = {{sqrt|8/''e''}} (''V'' − 0.5)/''U'';
** Optional: if ''X''<sup>2</sup> ≤ 5 − 4''e''<sup>1/4</sup>''U'' then accept ''X'' and terminate the algorithm;
** Optional: if ''X''<sup>2</sup> ≥ 4''e''<sup>−1.35</sup>/''U'' + 1.4 then reject ''X'' and start over from step 1;
** If ''X''<sup>2</sup> ≤ −4 ln ''U'' then accept ''X''; otherwise start over.
*: The two optional steps allow the evaluation of the logarithm in the last step to be avoided in most cases. These steps can be greatly improved<ref>{{harvtxt|Leva|1992}}</ref> so that the logarithm is rarely evaluated.
* The [[ziggurat algorithm]]<ref>{{harvtxt|Marsaglia|Tsang|2000}}</ref> is faster than the Box–Muller transform and still exact. In about 97% of all cases it uses only two random numbers, one random integer and one random uniform, one multiplication and an if-test. Only in the remaining 3% of cases, where the combination of those two falls outside the "core of the ziggurat" (a kind of rejection sampling using logarithms), do exponentials and more uniform random numbers have to be employed.
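The Box–Muller transform and the Marsaglia polar variant described above can both be sketched in a few lines of Python; the guards against taking log(0) are implementation details rather than part of either method:

```python
import math
import random

def box_muller(rng=random.random):
    """Return two independent standard normal deviates from two uniforms."""
    u = rng()
    while u == 0.0:                      # avoid log(0)
        u = rng()
    v = rng()
    r = math.sqrt(-2.0 * math.log(u))    # radius: sqrt of a chi-squared(2) deviate
    return r * math.cos(2.0 * math.pi * v), r * math.sin(2.0 * math.pi * v)

def marsaglia_polar(rng=random.random):
    """Polar variant: rejection on the unit disk avoids the trig calls."""
    while True:
        u = 2.0 * rng() - 1.0
        v = 2.0 * rng() - 1.0
        s = u * u + v * v
        if 0.0 < s < 1.0:                # restart when S >= 1 (or S == 0)
            f = math.sqrt(-2.0 * math.log(s) / s)
            return u * f, v * f
```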
* Integer arithmetic can be used to sample from the standard normal distribution.<ref>{{harvtxt|Karney|2016}}</ref><ref>{{harvtxt|Du|Fan|Wei|2022}}</ref> This method is exact in the sense that it satisfies the conditions of ''ideal approximation'';<ref>{{harvtxt|Monahan|1985|loc=section 2}}</ref> i.e., it is equivalent to sampling a real number from the standard normal distribution and rounding this to the nearest representable floating point number.
* There is also some investigation<ref>{{harvtxt|Wallace|1996}}</ref> into the connection between the fast [[Hadamard transform]] and the normal distribution, since the transform employs just addition and subtraction and by the central limit theorem random numbers from almost any distribution will be transformed into the normal distribution. In this regard a series of Hadamard transforms can be combined with random permutations to turn arbitrary data sets into normally distributed data.
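The sum-of-12-uniforms approximation described above is essentially one line of Python. The sum of 12 ''U''(0,1) deviates has mean 6 and variance 12 × 1/12 = 1, so subtracting 6 centers it on a unit-variance approximation of ''N''(0,1):

```python
import random

def approx_normal_12(rng=random.random):
    """Irwin-Hall approximation to N(0,1); output is confined to (-6, 6)."""
    return sum(rng() for _ in range(12)) - 6.0
```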
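The Ratio method's steps, including the two optional squeeze tests, translate directly into a sketch like the following; the guard against ''U'' = 0 is an implementation detail:

```python
import math
import random

SQRT_8_OVER_E = math.sqrt(8.0 / math.e)

def ratio_normal(rng=random.random):
    """Kinderman-Monahan ratio method with the two optional squeeze steps."""
    while True:
        u = rng()
        if u == 0.0:
            continue                                  # avoid division by zero / log(0)
        v = rng()
        x = SQRT_8_OVER_E * (v - 0.5) / u
        x2 = x * x
        if x2 <= 5.0 - 4.0 * math.exp(0.25) * u:      # optional quick accept
            return x
        if x2 >= 4.0 * math.exp(-1.35) / u + 1.4:     # optional quick reject
            continue
        if x2 <= -4.0 * math.log(u):                  # exact acceptance test
            return x
```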