Maxwell's theorem


In probability theory, Maxwell's theorem (also known as the Herschel-Maxwell theorem or the Herschel-Maxwell derivation) states that if the probability distribution of a random vector in <math>\R^n</math> is unchanged by rotations, and if the components are independent, then the components are identically distributed and normally distributed.

Equivalent statements

If the probability distribution of a vector-valued random variable <math>X = (X_1, \ldots, X_n)^T</math> is the same as the distribution of <math>GX</math> for every <math>n \times n</math> orthogonal matrix <math>G</math>, and the components are independent, then the components <math>X_1, \ldots, X_n</math> are normally distributed with expected value 0 and all have the same variance. This theorem is one of many characterizations of the normal distribution.

The only rotationally invariant probability distributions on <math>\R^n</math> that have independent components are the multivariate normal distributions with expected value 0 and covariance matrix <math>\sigma^2 I_n</math> (where <math>I_n</math> is the <math>n \times n</math> identity matrix), for some positive number <math>\sigma^2</math>.
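This characterization is easy to check numerically. The sketch below (using NumPy; the seed, sample size, and QR-based construction of a random orthogonal matrix are illustrative choices, not part of the theorem) draws i.i.d. standard normal components, applies a random orthogonal matrix, and confirms that the rotated components still behave like independent standard normal variables:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
n, samples = 3, 200_000

# i.i.d. standard normal components: rotationally invariant AND independent
X = rng.standard_normal((samples, n))

# A random orthogonal matrix G, built via QR decomposition of a Gaussian matrix
Q, R = np.linalg.qr(rng.standard_normal((n, n)))
G = Q * np.sign(np.diag(R))  # fix column signs so G is a proper random rotation/reflection

Y = X @ G.T  # each row of Y is G applied to the corresponding row of X

# The rotated components should still look i.i.d. N(0, 1):
print(np.allclose(Y.mean(axis=0), 0.0, atol=0.02))          # means near 0
print(np.allclose(Y.var(axis=0), 1.0, atol=0.02))           # variances near 1
print(np.allclose(np.corrcoef(Y.T), np.eye(n), atol=0.02))  # cross-correlations near 0
```

By contrast, replacing the standard normal draws with, say, uniform draws breaks the invariance: the rotated components remain uncorrelated but are no longer distributed like the originals.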

History

John Herschel proved the theorem in 1850.<ref>Herschel, J. F. W. (1850). "Quetelet on probabilities". Edinburgh Review, 92, 1–57.</ref><ref>Bryc quotes Herschel and "state[s] the Herschel-Maxwell theorem in modern notation but without proof". Bryc cites M. S. Bartlett (1934) "for one of the early proofs".</ref> Ten years later, James Clerk Maxwell proved the theorem in Proposition IV of his 1860 paper.<ref>Maxwell, J. C. (1860). "Illustrations of the dynamical theory of gases. Part I. On the motions and collisions of perfectly elastic spheres". Philosophical Magazine, 4th series, 19, 19–32.</ref>

Proof

We need only prove the theorem for the 2-dimensional case: the n-dimensional case then follows by applying the 2-dimensional result to each pair of coordinates in turn.

Since a rotation by 90 degrees preserves the joint distribution, <math>X_1</math> and <math>X_2</math> have the same probability measure; call it <math>\mu</math>. If <math>\mu</math> is a Dirac delta distribution at zero, then it is in particular a degenerate Gaussian distribution and the theorem holds. Assume from now on that <math>\mu</math> is not a Dirac delta distribution at zero.

By Lebesgue's decomposition theorem, decompose <math>\mu</math> into the sum of an absolutely continuous measure and a singular measure: <math>\mu = \mu_r + \mu_s</math>. We need to show that <math>\mu_s = 0</math>; we proceed by contradiction. Suppose <math>\mu_s</math> has an atom, so there exists some <math>x \in \R</math> such that <math>\mu(\{x\}) > 0</math>. By independence of <math>X_1, X_2</math>, the conditional variable <math>X_2 \mid \{X_1 = x\}</math> is distributed the same way as <math>X_2</math>. If <math>x = 0</math>, then since we assumed <math>\mu</math> is not concentrated at zero, <math>\Pr(X_2 \neq 0) > 0</math>, so the double ray <math>\{(x_1, x_2): x_1 = 0, x_2 \neq 0\}</math> has positive probability. By rotational symmetry of <math>\mu \times \mu</math>, every rotation of the double ray has the same positive probability; rotations by distinct angles in <math>[0, \pi)</math> give pairwise disjoint double rays, so their union would have infinite probability, a contradiction. If instead <math>x \neq 0</math>, the single point <math>(x, x)</math> has probability <math>\mu(\{x\})^2 > 0</math>, and its rotations about the origin form uncountably many distinct points, each with the same positive probability, again a contradiction. Hence <math>\mu</math> has no atoms.

Assume now that <math>\mu</math> has a probability density function <math>\rho</math>. The problem then reduces to solving the functional equation, for all real <math>x, y</math> and all angles <math>\theta</math>,

<math display="block">\rho(x)\rho(y) = \rho(x \cos \theta + y \sin\theta)\rho(x \sin \theta - y \cos\theta).</math>
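One standard route to the solution, sketched here under the simplifying assumption that <math>\rho</math> is everywhere positive and measurable, is to take logarithms. Write <math>g = \log \rho</math>. The functional equation says that <math>g(x) + g(y)</math> is invariant under orthogonal changes of <math>(x, y)</math>, so it depends only on <math>x^2 + y^2</math>: there is a function <math>h</math> with

<math display="block">g(x) + g(y) = h(x^2 + y^2).</math>

Setting <math>y = 0</math> gives <math>g(x) = h(x^2) - g(0)</math>, and substituting back shows that <math>\varphi(t) = h(t) - 2g(0)</math> satisfies Cauchy's functional equation

<math display="block">\varphi(x^2 + y^2) = \varphi(x^2) + \varphi(y^2),</math>

whose measurable solutions are linear: <math>\varphi(t) = bt</math>. Hence <math>g(x) = g(0) + bx^2</math>, so <math>\rho(x) = \rho(0)e^{bx^2}</math>. Integrability of <math>\rho</math> forces <math>b < 0</math>; writing <math>b = -1/(2\sigma^2)</math> and normalizing yields

<math display="block">\rho(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-x^2/(2\sigma^2)},</math>

a normal density with mean 0 and variance <math>\sigma^2</math>.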

References

<references />