Multinomial distribution
==== Concentration at large ''n'' ====
Due to the exponential decay, at large <math>n</math>, almost all the probability mass is concentrated in a small neighborhood of <math>p</math>. In this small neighborhood, we can take the first nonzero term in the Taylor expansion of <math>D_{KL}</math>, to obtain<math display="block">\ln \binom{n}{x_1, \cdots, x_k} p_1^{x_1} \cdots p_k^{x_k} \approx -\frac n2 \sum_{i=1}^k \frac{(\hat p_i - p_i)^2}{p_i} = -\frac 12 \sum_{i=1}^k \frac{(x_i - n p_i)^2}{n p_i}</math>This resembles the logarithm of a Gaussian density, which suggests the following theorem:

'''Theorem.''' In the limit <math>n \to \infty</math>, <math>n \sum_{i=1}^k \frac{(\hat p_i - p_i)^2}{p_i} = \sum_{i=1}^k \frac{(x_i - n p_i)^2}{n p_i}</math> [[converges in distribution]] to the [[chi-squared distribution]] <math>\chi^2(k-1)</math>.

[[File:Convergence of multinomial distribution to the gaussian distribution.webm|thumb|339x339px|If we sample from the multinomial distribution <math>\mathrm{Multinomial}(n; 0.2, 0.3, 0.5)</math>, and plot the heatmap of the samples within the 2-dimensional simplex (here shown as a black triangle), we notice that as <math>n \to \infty</math>, the distribution converges to a Gaussian around the point <math>(0.2, 0.3, 0.5)</math>, with the contours converging in shape to ellipses whose radii shrink as <math>1/\sqrt n</math>.
Meanwhile, the separation between the discrete points shrinks as <math>1/n</math>, and so the discrete multinomial distribution converges to a continuous Gaussian distribution.]]

{{hidden begin|style=width:100%|ta1=center|border=1px #aaa solid|title=[Proof]}}
The space of all distributions over categories <math>\{1, 2, \ldots, k\}</math> is a [[simplex]]: <math>\Delta_{k} = \left\{(y_1, \ldots, y_k)\colon y_1, \ldots, y_k \geq 0, \sum_i y_i = 1\right\}</math>, and the set of all possible empirical distributions after <math>n</math> experiments is a subset of the simplex: <math>\Delta_{k, n} = \left\{(x_1/n, \ldots, x_k/n)\colon x_1, \ldots, x_k \in \N, \sum_i x_i = n\right\}</math>. That is, it is the intersection between <math>\Delta_k</math> and the lattice <math>(\Z^k)/n</math>.

As <math>n</math> increases, most of the probability mass is concentrated in a subset of <math>\Delta_{k, n}</math> near <math>p</math>, and the probability distribution near <math>p</math> becomes approximately proportional, up to a factor that does not depend on <math>x</math>, to <math display="block">\binom{n}{x_1, \cdots, x_k} p_1^{x_1} \cdots p_k^{x_k} \propto e^{-\frac n2 \sum_i \frac{(\hat p_i - p_i)^2}{p_i}}</math>From this, we see that the subset upon which the mass is concentrated has radius on the order of <math>1/\sqrt n</math>, but the points in the subset are separated by distance on the order of <math>1/n</math>, so at large <math>n</math>, the points merge into a continuum.

To convert this from a discrete probability distribution to a continuous probability density, we need to divide by the volume occupied by each point of <math>\Delta_{k, n}</math> in <math>\Delta_k</math>. However, by symmetry, every point occupies exactly the same volume (except a negligible set on the boundary), so we obtain a probability density <math>\rho(\hat p) = C e^{-\frac n2 \sum_i \frac{(\hat p_i - p_i)^2}{p_i}}</math>, where <math>C</math> is a constant.
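
Finally, the simplex <math>\Delta_k</math> is not all of <math>\R^k</math>, but lies within a <math>(k-1)</math>-dimensional plane. In the rescaled coordinates <math>z_i = \sqrt n (\hat p_i - p_i)/\sqrt{p_i}</math>, the density is proportional to <math>e^{-\frac 12 \sum_i z_i^2}</math> on the hyperplane <math>\sum_i \sqrt{p_i}\, z_i = 0</math>. This is a standard Gaussian on a <math>(k-1)</math>-dimensional subspace, so its squared norm <math>\sum_i z_i^2</math> follows <math>\chi^2(k-1)</math>, which is the desired result.
{{hidden end}}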
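
The theorem can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names <code>pearson_statistic</code> and <code>simulate</code> are illustrative and do not come from any library. It draws samples from <math>\mathrm{Multinomial}(n; 0.2, 0.3, 0.5)</math>, computes <math>\sum_i (x_i - n p_i)^2 / (n p_i)</math> for each sample, and prints the sample mean and variance. A <math>\chi^2(k-1)</math> variable has mean <math>k-1</math> and variance <math>2(k-1)</math>, so for <math>k = 3</math> the printed values should approach 2 and 4 as <math>n</math> grows.

```python
import numpy as np


def pearson_statistic(counts, p):
    """Return sum_i (x_i - n p_i)^2 / (n p_i) along the last axis of counts."""
    counts = np.asarray(counts, dtype=float)
    p = np.asarray(p, dtype=float)
    # n is recovered from each row, so rows may be single samples or a batch.
    n = counts.sum(axis=-1, keepdims=True)
    expected = n * p
    return ((counts - expected) ** 2 / expected).sum(axis=-1)


def simulate(n, p, num_samples=200_000, seed=0):
    """Draw multinomial samples and return the statistic for each of them."""
    rng = np.random.default_rng(seed)
    counts = rng.multinomial(n, p, size=num_samples)  # shape (num_samples, k)
    return pearson_statistic(counts, p)


if __name__ == "__main__":
    p = [0.2, 0.3, 0.5]  # same parameters as in the figure caption
    k = len(p)
    print(f"chi-squared({k - 1}): mean = {k - 1}, variance = {2 * (k - 1)}")
    for n in (10, 100, 1000):
        stats = simulate(n, p)
        print(f"n = {n}: mean = {stats.mean():.3f}, variance = {stats.var():.3f}")
```

Matching the first two moments is only a rough consistency check, not a test of convergence in distribution; comparing the empirical quantiles of the samples with those of <math>\chi^2(k-1)</math> would be a stricter one.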