Radon–Nikodym theorem
In mathematics, the Radon–Nikodym theorem is a result in measure theory that expresses the relationship between two measures defined on the same measurable space. A measure is a set function that assigns a consistent magnitude to the measurable subsets of a measurable space. Examples of a measure include area and volume, where the subsets are sets of points; or the probability of an event, which is a subset of possible outcomes within a wider probability space.
One way to derive a new measure from one already given is to assign a density to each point of the space, then integrate over the measurable subset of interest. This can be expressed as
- <math>\nu(A) = \int_A f \, d\mu,</math>
where <math>\nu</math> is the new measure being defined for any measurable subset <math>A</math> and the function <math>f</math> is the density at a given point. The integral is with respect to an existing measure <math>\mu</math>, which may often be the canonical Lebesgue measure on the real line <math>\R</math> or the n-dimensional Euclidean space <math>\R^n</math> (corresponding to our standard notions of length, area and volume). For example, if <math>f</math> represented mass density and <math>\mu</math> was the Lebesgue measure in three-dimensional space <math>\R^3</math>, then <math>\nu(A)</math> would equal the total mass in a spatial region <math>A</math>.
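As a rough numerical illustration of building a measure from a density, the sketch below approximates <math>\nu([a, b]) = \int_a^b f \, d\mu</math> for Lebesgue measure on the real line using a midpoint rule. The function name `measure_from_density` and the linear mass density are assumptions made for this example only.

```python
def measure_from_density(f, a, b, n=100_000):
    """Approximate nu([a, b]), the integral of f over [a, b] with respect
    to Lebesgue measure, using the midpoint rule with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Hypothetical mass density (kg per metre) of a rod occupying [0, 1].
def density(x):
    return 2.0 * x


total_mass = measure_from_density(density, 0.0, 1.0)  # mass of the whole rod
half_mass = measure_from_density(density, 0.0, 0.5)   # mass of the left half
```

Here the left half of the rod carries a quarter of the total mass even though it has half the length, which is exactly the information the density encodes.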
The Radon–Nikodym theorem essentially states that, under certain conditions, any measure <math>\nu</math> can be expressed in this way with respect to another measure <math>\mu</math> on the same space. The function <math>f</math> is then called the Radon–Nikodym derivative and is denoted by <math>\tfrac{d\nu}{d\mu}</math>. An important application is in probability theory, leading to the probability density function of a random variable.
The theorem is named after Johann Radon, who proved it in 1913 for the special case where the underlying space is <math>\R^n</math>, and Otto Nikodym, who proved the general case in 1930. In 1936 Hans Freudenthal generalized the Radon–Nikodym theorem by proving the Freudenthal spectral theorem, a result in Riesz space theory; this contains the Radon–Nikodym theorem as a special case.
A Banach space <math>Y</math> is said to have the Radon–Nikodym property if the generalization of the Radon–Nikodym theorem also holds, mutatis mutandis, for functions with values in <math>Y</math>. All Hilbert spaces have the Radon–Nikodym property.
Formal description
Radon–Nikodym theorem
The Radon–Nikodym theorem involves a measurable space <math>(X, \Sigma)</math> on which two σ-finite measures are defined, <math>\mu</math> and <math>\nu.</math> It states that, if <math>\nu \ll \mu</math> (that is, if <math>\nu</math> is absolutely continuous with respect to <math>\mu</math>), then there exists a <math>\Sigma</math>-measurable function <math>f : X \to [0, \infty),</math> such that for any measurable set <math>A \in \Sigma,</math> <math display=block>\nu(A) = \int_A f \, d\mu.</math>
Radon–Nikodym derivative
The function <math>f</math> satisfying the above equality is uniquely defined up to a <math>\mu</math>-null set, that is, if <math>g</math> is another function which satisfies the same property, then <math>f = g</math> <math>\mu</math>-almost everywhere. The function <math>f</math> is commonly written <math>\frac{d\nu}{d\mu}</math> and is called the Radon–Nikodym derivative. The choice of notation and the name of the function reflect the fact that the function is analogous to a derivative in calculus in the sense that it describes the rate of change of density of one measure with respect to another (the way the Jacobian determinant is used in multivariable integration).
Extension to signed or complex measures
A similar theorem can be proven for signed and complex measures: namely, that if <math>\mu</math> is a nonnegative σ-finite measure, and <math>\nu</math> is a finite-valued signed or complex measure such that <math>\nu \ll \mu,</math> that is, <math>\nu</math> is absolutely continuous with respect to <math>\mu,</math> then there is a <math>\mu</math>-integrable real- or complex-valued function <math>g</math> on <math>X</math> such that for every measurable set <math>A,</math> <math display=block>\nu(A) = \int_A g \, d\mu.</math>
Examples
In the following examples, the set <math>X</math> is the real interval [0,1], and <math>\Sigma</math> is the Borel sigma-algebra on <math>X</math>.
- <math>\mu</math> is the length measure on <math>X</math>. <math>\nu</math> assigns to each subset <math>Y</math> of <math>X</math> twice the length of <math>Y</math>. Then, <math display="inline">\frac{d\nu}{d\mu} = 2</math>.
- <math>\mu</math> is the length measure on <math>X</math>. <math>\nu</math> assigns to each subset <math>Y</math> of <math>X</math> the number of points from the set {0.1, …, 0.9} that are contained in <math>Y</math>. Then, <math>\nu</math> is not absolutely continuous with respect to <math>\mu</math>, since it assigns non-zero measure to zero-length points. Indeed, there is no derivative <math display="inline">\frac{d\nu}{d\mu}</math>: there is no finite function that, when integrated e.g. from <math>(0.1 - \varepsilon)</math> to <math>(0.1 + \varepsilon)</math>, gives <math>1</math> for all <math>\varepsilon > 0</math>.
- <math>\mu = \nu + \delta_0</math>, where <math>\nu</math> is the length measure on <math>X</math> and <math>\delta_0</math> is the Dirac measure on 0 (it assigns a measure of 1 to any set containing 0 and a measure of 0 to any other set). Then, <math>\nu</math> is absolutely continuous with respect to <math>\mu</math>, and <math display="inline">\frac{d\nu}{d\mu} = 1_{X\setminus \{0\}}</math>: the derivative is 0 at <math>x = 0</math> and 1 at <math>x > 0</math>.
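The three examples above have simple analogues on a finite set, where a measure is just a table of point masses and the Radon–Nikodym derivative reduces to a pointwise ratio. The following is a minimal sketch under that finite-space assumption; the helper names `radon_nikodym` and `integrate` and the sample point labels are hypothetical.

```python
from fractions import Fraction


def radon_nikodym(nu, mu):
    """Return d(nu)/d(mu) on a finite set; measures are dicts point -> mass.

    Raises ValueError when nu is not absolutely continuous w.r.t. mu.
    On mu-null points the derivative is not determined; 0 is chosen here.
    """
    density = {}
    for x in set(nu) | set(mu):
        m, n = Fraction(mu.get(x, 0)), Fraction(nu.get(x, 0))
        if m == 0:
            if n != 0:
                raise ValueError(f"nu charges the mu-null point {x!r}")
            density[x] = Fraction(0)
        else:
            density[x] = n / m
    return density


def integrate(f, mu, subset):
    """Integral of f over `subset` with respect to mu (a finite sum)."""
    return sum(f[x] * mu.get(x, 0) for x in subset)


# Analogue of the first example: nu = 2 * mu gives the constant derivative 2.
mu = {"a": 1, "b": 3}
nu = {"a": 2, "b": 6}
f = radon_nikodym(nu, mu)

# Analogue of the third example: mu2 = nu2 + (unit point mass at "o").
nu2 = {"o": 0, "p": 1, "q": 1}
mu2 = {"o": 1, "p": 1, "q": 1}
f2 = radon_nikodym(nu2, mu2)

# Analogue of the second example: nu charges a point that mu ignores.
try:
    radon_nikodym({"a": 1}, {"a": 0})
    absolutely_continuous = True
except ValueError:
    absolutely_continuous = False
```

Integrating the returned density against <math>\mu</math> over any subset recovers <math>\nu</math> of that subset, which is the defining property of the derivative.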
Properties
- Let ν, μ, and λ be σ-finite measures on the same measurable space. If ν ≪ λ and μ ≪ λ (ν and μ are both absolutely continuous with respect to λ), then <math display="block"> \frac{d(\nu+\mu)}{d\lambda} = \frac{d\nu}{d\lambda}+\frac{d\mu}{d\lambda} \quad \lambda\text{-almost everywhere}.</math>
- If ν ≪ μ ≪ λ, then <math display="block">\frac{d\nu}{d\lambda}=\frac{d\nu}{d\mu}\frac{d\mu}{d\lambda}\quad\lambda\text{-almost everywhere}.</math>
- In particular, if μ ≪ ν and ν ≪ μ, then <math display="block"> \frac{d\mu}{d\nu}=\left(\frac{d\nu}{d\mu}\right)^{-1}\quad\nu\text{-almost everywhere}.</math>
- If μ ≪ λ and <math>g</math> is a μ-integrable function, then <math display="block"> \int_X g\,d\mu = \int_X g\frac{d\mu}{d\lambda}\,d\lambda.</math>
- If ν is a finite signed or complex measure, then <math display="block"> {d|\nu|\over d\mu} = \left|{d\nu\over d\mu}\right|. </math>
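On a finite set where every point has positive mass, these identities can be checked exactly with rational arithmetic. The sketch below does so for linearity, the chain rule, the reciprocal rule, and the change-of-measure formula for integrals; the measures `lam`, `mu`, `nu` and the function `g` are arbitrary illustrative choices, and `derivative` is a hypothetical helper.

```python
from fractions import Fraction


def derivative(nu, mu):
    """Pointwise d(nu)/d(mu); assumes mu charges every point, so nu << mu."""
    return {x: Fraction(nu[x], mu[x]) for x in mu}


# Three illustrative measures on the points a, b, c (all masses positive).
lam = {"a": 1, "b": 2, "c": 3}
mu = {"a": 2, "b": 1, "c": 6}
nu = {"a": 4, "b": 3, "c": 3}

# Linearity: d(nu + mu)/d(lam) = d(nu)/d(lam) + d(mu)/d(lam).
total = {x: nu[x] + mu[x] for x in lam}
linear_ok = all(
    derivative(total, lam)[x] == derivative(nu, lam)[x] + derivative(mu, lam)[x]
    for x in lam
)

# Chain rule: d(nu)/d(lam) = d(nu)/d(mu) * d(mu)/d(lam).
chain_ok = all(
    derivative(nu, lam)[x] == derivative(nu, mu)[x] * derivative(mu, lam)[x]
    for x in lam
)

# Reciprocal rule: d(mu)/d(nu) = (d(nu)/d(mu))^(-1).
recip_ok = all(derivative(mu, nu)[x] == 1 / derivative(nu, mu)[x] for x in lam)

# Change of measure in an integral: int g d(mu) = int g * d(mu)/d(lam) d(lam).
g = {"a": 5, "b": -1, "c": 2}
lhs = sum(g[x] * mu[x] for x in lam)
rhs = sum(g[x] * derivative(mu, lam)[x] * lam[x] for x in lam)
```

Because all masses are positive, the "almost everywhere" qualifiers in the statements above are vacuous in this toy setting; in general the identities may fail on null sets.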
Applications
Probability theory
The theorem is very important in extending the ideas of probability theory from probability masses and probability densities defined over real numbers to probability measures defined over arbitrary sets. It tells us whether and how it is possible to change from one probability measure to another. Specifically, the probability density function of a random variable is the Radon–Nikodym derivative of the induced measure with respect to some base measure (usually the Lebesgue measure for continuous random variables).
For example, it can be used to prove the existence of conditional expectation for probability measures. The latter itself is a key concept in probability theory, as conditional probability is just a special case of it.
Financial mathematics
Amongst other fields, financial mathematics uses the theorem extensively, in particular via the Girsanov theorem. Such changes of probability measure are the cornerstone of the rational pricing of derivatives and are used to convert actual probabilities into risk-neutral probabilities.
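A one-period binomial market gives a small, finite analogue of such a change of measure (it is not the Girsanov theorem itself). In the sketch below, the real-world measure `P`, the risk-neutral measure `Q`, and every market parameter are hypothetical numbers chosen for illustration; `Z` plays the role of the Radon–Nikodym derivative dQ/dP.

```python
from fractions import Fraction

# Hypothetical one-period market: all numbers are illustrative assumptions.
s0 = Fraction(100)                           # initial stock price
up, down = Fraction(6, 5), Fraction(4, 5)    # up and down factors
r = Fraction(1, 20)                          # one-period interest rate
p_up = Fraction(3, 5)                        # real-world probability of "up"

P = {"up": p_up, "down": 1 - p_up}

# Risk-neutral probability: makes the discounted stock price a martingale.
q_up = (1 + r - down) / (up - down)
Q = {"up": q_up, "down": 1 - q_up}

# Radon-Nikodym derivative dQ/dP on the two-point sample space.
Z = {w: Q[w] / P[w] for w in P}

# Payoff of a call option with (hypothetical) strike 100.
strike = Fraction(100)
stock = {"up": s0 * up, "down": s0 * down}
payoff = {w: max(stock[w] - strike, Fraction(0)) for w in P}

# The same price computed under Q directly, and under P weighted by Z.
price_under_Q = sum(Q[w] * payoff[w] for w in P) / (1 + r)
price_under_P = sum(P[w] * Z[w] * payoff[w] for w in P) / (1 + r)
```

The two prices agree because an expectation under `Q` equals the expectation under `P` of the same quantity multiplied by `Z`, which is the integral identity listed under Properties.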
Information divergences
If μ and ν are measures over <math>X</math>, and μ ≪ ν:
- The Kullback–Leibler divergence from ν to μ is defined to be <math display="block"> D_\text{KL}(\mu \parallel \nu) = \int_X \log \left( \frac{d \mu}{d \nu} \right) \; d\mu.</math>
- For α > 0, α ≠ 1 the Rényi divergence of order α from ν to μ is defined to be <math display="block">D_\alpha(\mu \parallel \nu) = \frac{1}{\alpha - 1} \log\left(\int_X\left(\frac{d\mu}{d\nu}\right)^{\alpha-1}\; d\mu\right).</math>
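For distributions on a finite set, the Radon–Nikodym derivative dμ/dν is the ratio of point masses and both integrals become finite sums. The following is a minimal sketch under that assumption; the function names and the two sample distributions `p` and `q` are hypothetical.

```python
import math


def kl_divergence(mu, nu):
    """D_KL(mu || nu) as the sum of log(dmu/dnu) * mu(x) over a finite set."""
    total = 0.0
    for x, m in mu.items():
        if m == 0:
            continue  # mu-null points do not contribute to an integral d(mu)
        if nu.get(x, 0) == 0:
            raise ValueError("mu is not absolutely continuous w.r.t. nu")
        total += math.log(m / nu[x]) * m
    return total


def renyi_divergence(mu, nu, alpha):
    """Renyi divergence of order alpha (alpha > 0, alpha != 1) on a finite set."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = 0.0
    for x, m in mu.items():
        if m == 0:
            continue
        if nu.get(x, 0) == 0:
            raise ValueError("mu is not absolutely continuous w.r.t. nu")
        total += (m / nu[x]) ** (alpha - 1) * m
    return math.log(total) / (alpha - 1)


# Two illustrative distributions on the points 0 and 1.
p = {0: 0.5, 1: 0.5}
q = {0: 0.25, 1: 0.75}
kl = kl_divergence(p, q)
renyi2 = renyi_divergence(p, q, 2)
```

Both functions fail when μ charges a point that ν does not, which mirrors the requirement μ ≪ ν needed for dμ/dν to exist.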
The assumption of σ-finiteness
The Radon–Nikodym theorem above makes the assumption that the measure μ with respect to which one computes the rate of change of ν is σ-finite.
Negative example
Here is an example where μ is not σ-finite and the Radon–Nikodym theorem fails to hold.
Consider the Borel σ-algebra on the real line. Let the counting measure, <math>\mu</math>, of a Borel set <math>A</math> be defined as the number of elements of <math>A</math> if <math>A</math> is finite, and <math>\infty</math> otherwise. One can check that <math>\mu</math> is indeed a measure. It is not σ-finite, as not every Borel set is at most a countable union of finite sets. Let <math>\nu</math> be the usual Lebesgue measure on this Borel algebra. Then, <math>\nu</math> is absolutely continuous with respect to <math>\mu</math>, since for a set <math>A</math> one has <math>\mu(A) = 0</math> only if <math>A</math> is the empty set, and then <math>\nu(A)</math> is also zero.
Assume that the Radon–Nikodym theorem holds, that is, for some measurable function <math>f</math> one has
- <math>\nu(A) = \int_A f \,d\mu</math>
for all Borel sets. Taking <math>A</math> to be a singleton set, <math>A = \{a\}</math>, and using the above equality, one finds
- <math> 0 = f(a)</math>
for all real numbers <math>a</math>. This implies that the function <math>f</math>, and therefore the Lebesgue measure <math>\nu</math>, is zero, which is a contradiction.
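The singleton step can be spelled out as follows, using only that the counting measure of <math>\{a\}</math> is 1 and its Lebesgue measure is 0:
<math display="block">0 = \nu(\{a\}) = \int_{\{a\}} f \, d\mu = f(a)\,\mu(\{a\}) = f(a).</math>
With <math>f</math> identically zero, the assumed representation would give, for instance,
<math display="block">\nu([0,1]) = \int_{[0,1]} 0 \, d\mu = 0,</math>
whereas the Lebesgue measure of <math>[0,1]</math> is 1.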
Positive result
Assuming <math>\nu\ll\mu,</math> the Radon–Nikodym theorem also holds if <math>\mu</math> is localizable and <math>\nu</math> is accessible with respect to <math>\mu</math>, i.e., <math>\nu(A)=\sup\{\nu(B):B\in{\cal P}(A)\cap\mu^\operatorname{pre}(\R_{\ge0})\}</math> for all <math>A\in\Sigma.</math>
Proof
This section gives a measure-theoretic proof of the theorem. There is also a functional-analytic proof, using Hilbert space methods, that was first given by von Neumann.
For finite measures <math>\mu</math> and <math>\nu</math>, the idea is to consider functions <math>f</math> with <math>f\,d\mu \leq d\nu</math>. The supremum of all such functions, along with the monotone convergence theorem, then furnishes the Radon–Nikodym derivative. The fact that the remaining part of <math>\nu</math> is singular with respect to <math>\mu</math> follows from a technical fact about finite measures. Once the result is established for finite measures, extending to σ-finite, signed, and complex measures can be done naturally. The details are given below.
For finite measures
Constructing an extended-valued candidate
First, suppose <math>\mu</math> and <math>\nu</math> are both finite-valued nonnegative measures. Let <math>F</math> be the set of those extended-valued measurable functions <math>f : X \to [0, \infty]</math> such that:
- <math>\forall A \in \Sigma:\qquad \int_A f\,d\mu \leq \nu(A)</math>
<math>F \neq \varnothing</math>, since it contains at least the zero function. Now let <math>f_1, f_2 \in F</math>, and suppose <math>A</math> is an arbitrary measurable set, and define:
- <math>\begin{align}
A_1 &= \left\{ x \in A : f_1(x) > f_2(x) \right\}, \\ A_2 &= \left\{ x \in A : f_2(x) \geq f_1(x) \right\}.
\end{align}</math>
Then one has
- <math>\int_A\max\left\{f_1, f_2\right\}\,d\mu = \int_{A_1} f_1\,d\mu + \int_{A_2} f_2\,d\mu \leq \nu\left(A_1\right) + \nu\left(A_2\right) = \nu(A),</math>
and therefore, <math>\max\{f_1, f_2\} \in F</math>.
Now, let <math>\{f_n\}</math> be a sequence of functions in <math>F</math> such that
- <math>\lim_{n\to\infty}\int_X f_n\,d\mu = \sup_{f\in F} \int_X f\,d\mu.</math>
By replacing <math>f_n</math> with the maximum of the first <math>n</math> functions, one can assume that the sequence <math>\{f_n\}</math> is increasing. Let <math>g</math> be an extended-valued function defined as
- <math>g(x) := \lim_{n\to\infty}f_n(x).</math>
By Lebesgue's monotone convergence theorem, one has
- <math>\lim_{n\to\infty} \int_A f_n\,d\mu = \int_A \lim_{n\to\infty} f_n(x)\,d\mu(x) = \int_A g\,d\mu \leq \nu(A)</math>
for each <math>A \in \Sigma</math>, and hence, <math>g \in F</math>. Also, by the construction of <math>g</math>,
- <math>\int_X g\,d\mu = \sup_{f\in F}\int_X f\,d\mu.</math>
Proving equality
Now, since <math>g \in F</math>,
- <math>\nu_0(A) := \nu(A) - \int_A g\,d\mu</math>
defines a nonnegative measure on <math>\Sigma</math>. To prove equality, we show that <math>\nu_0 = 0</math>.
Suppose <math>\nu_0 \neq 0</math>; then, since <math>\mu</math> is finite, there is an <math>\varepsilon > 0</math> such that <math>\nu_0(X) > \varepsilon\mu(X)</math>. To derive a contradiction from <math>\nu_0 \neq 0</math>, we look for a positive set <math>P \in \Sigma</math> for the signed measure <math>\nu_0 - \varepsilon\mu</math> (i.e. a measurable set <math>P</math>, all of whose measurable subsets have non-negative <math>\nu_0 - \varepsilon\mu</math> measure), where also <math>P</math> has positive <math>\mu</math>-measure. Conceptually, we are looking for a set <math>P</math> where <math>\nu_0 \geq \varepsilon\mu</math> in every part of <math>P</math>. A convenient approach is to use the Hahn decomposition <math>(P, N)</math> for the signed measure <math>\nu_0 - \varepsilon\mu</math>.
Note then that for every <math>A \in \Sigma</math> one has <math>\nu_0(A \cap P) \geq \varepsilon\mu(A \cap P)</math>, and hence,
- <math>\begin{align}
\nu(A) &= \int_A g\,d\mu + \nu_0(A) \\ &\geq \int_A g\,d\mu + \nu_0(A\cap P)\\ &\geq \int_A g\,d\mu + \varepsilon\mu(A\cap P) = \int_A\left(g + \varepsilon 1_P\right)\,d\mu,
\end{align}</math>
where <math>1_P</math> is the indicator function of <math>P</math>. Also, note that <math>\mu(P) > 0</math> as desired; for if <math>\mu(P) = 0</math>, then (since <math>\nu</math> is absolutely continuous with respect to <math>\mu</math>) <math>\nu_0(P) \leq \nu(P) = 0</math>, so <math>\nu_0(P) = 0</math> and
- <math>\nu_0(X) - \varepsilon\mu(X) = \left(\nu_0 - \varepsilon\mu\right)(N) \leq 0,</math>
contradicting the fact that <math>\nu_0(X) > \varepsilon\mu(X)</math>.
Then, since also
- <math>\int_X\left(g + \varepsilon1_P\right)\,d\mu \leq \nu(X) < +\infty,</math>
<math>g + \varepsilon 1_P \in F</math> and satisfies
- <math>\int_X\left(g + \varepsilon 1_P\right)\,d\mu > \int_X g\,d\mu = \sup_{f\in F}\int_X f\,d\mu.</math>
This is impossible because it violates the definition of a supremum; therefore, the initial assumption that <math>\nu_0 \neq 0</math> must be false. Hence, <math>\nu_0 = 0</math>, as desired.
Restricting to finite values
Now, since <math>g</math> is <math>\mu</math>-integrable, the set <math>\{x \in X : g(x) = \infty\}</math> is <math>\mu</math>-null. Therefore, if <math>f</math> is defined as
- <math>f(x) = \begin{cases}
g(x) & \text{if }g(x) < \infty \\ 0 & \text{otherwise,}
\end{cases}</math>
then <math>f</math> has the desired properties.
Uniqueness
As for the uniqueness, let <math>f, g : X \to [0, \infty)</math> be measurable functions satisfying
- <math>\nu(A) = \int_A f\,d\mu = \int_A g\,d\mu</math>
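for every measurable set <math>A</math>. Then, <math>g - f</math> is <math>\mu</math>-integrable, and
- <math>\int_A(g - f)\,d\mu = 0.</math> (The integral of the difference can be split into the difference of the integrals because <math>f</math> and <math>g</math> are non-negative and <math>\mu</math>-integrable.)
In particular, this holds for <math>A = \{x \in X : f(x) > g(x)\}</math> and for <math>A = \{x \in X : f(x) < g(x)\}</math>. It follows that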
- <math>\int_X(g - f)^+\,d\mu = 0 = \int_X(g - f)^-\,d\mu,</math>
and so, that <math>(g - f)^+ = 0</math> <math>\mu</math>-almost everywhere; the same is true for <math>(g - f)^-</math>, and thus, <math>f = g</math> <math>\mu</math>-almost everywhere, as desired.
For σ-finite positive measures
If <math>\mu</math> and <math>\nu</math> are σ-finite, then <math>X</math> can be written as the union of a sequence <math>\{B_n\}_n</math> of disjoint sets in <math>\Sigma</math>, each of which has finite measure under both <math>\mu</math> and <math>\nu</math>. For each <math>n</math>, by the finite case, there is a <math>\Sigma</math>-measurable function <math>f_n : B_n \to [0, \infty)</math> such that
- <math>\nu_n(A) = \int_A f_n\,d\mu</math>
for each <math>\Sigma</math>-measurable subset <math>A</math> of <math>B_n</math>, where <math>\nu_n</math> denotes the restriction of <math>\nu</math> to <math>B_n</math>. The sum <math display="inline">\left(\sum_n f_n 1_{B_n}\right) := f</math> of those functions is then the required function such that <math display="inline">\nu(A) = \int_A f \, d\mu</math>.
As for the uniqueness, since each of the <math>f_n</math> is <math>\mu</math>-almost everywhere unique, so is <math>f</math>.
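The gluing step can be written out as a chain of equalities. The sketch below assumes, as in the construction, that the sets <math>B_n</math> are disjoint with union <math>X</math> and that each <math>f_n</math> is extended by zero outside <math>B_n</math>. The first equality is countable additivity of <math>\nu</math>, and the interchange of sum and integral is justified by the monotone convergence theorem because all terms are non-negative:
<math display="block">\nu(A) = \sum_n \nu(A \cap B_n) = \sum_n \int_{A \cap B_n} f_n \, d\mu = \sum_n \int_A f_n 1_{B_n} \, d\mu = \int_A \sum_n f_n 1_{B_n} \, d\mu = \int_A f \, d\mu.</math>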
For signed and complex measures
If <math>\nu</math> is a σ-finite signed measure, then it can be Hahn–Jordan decomposed as <math>\nu = \nu^+ - \nu^-</math> where one of the measures is finite. Applying the previous result to those two measures, one obtains two functions, <math>g, h : X \to [0, \infty)</math>, satisfying the Radon–Nikodym theorem for <math>\nu^+</math> and <math>\nu^-</math> respectively, at least one of which is <math>\mu</math>-integrable (i.e., its integral with respect to <math>\mu</math> is finite). It is clear then that <math>f = g - h</math> satisfies the required properties, including uniqueness, since both <math>g</math> and <math>h</math> are unique up to <math>\mu</math>-almost everywhere equality.
If <math>\nu</math> is a complex measure, it can be decomposed as <math>\nu = \nu_1 + i\nu_2</math>, where both <math>\nu_1</math> and <math>\nu_2</math> are finite-valued signed measures. Applying the above argument, one obtains two functions, <math>g, h : X \to \R</math>, satisfying the required properties for <math>\nu_1</math> and <math>\nu_2</math>, respectively. Clearly, <math>f = g + ih</math> is the required function.
The Lebesgue decomposition theorem
Lebesgue's decomposition theorem shows that the Radon–Nikodym theorem can be applied even in a situation that is seemingly more general. Consider a σ-finite positive measure <math>\mu</math> on the measurable space <math>(X,\Sigma)</math> and a σ-finite signed measure <math>\nu</math> on <math>\Sigma</math>, without assuming any absolute continuity. Then there exist unique signed measures <math>\nu_a</math> and <math>\nu_s</math> on <math>\Sigma</math> such that <math>\nu=\nu_a+\nu_s</math>, <math>\nu_a\ll\mu</math>, and <math>\nu_s\perp\mu</math>. The Radon–Nikodym theorem can then be applied to the pair <math>\nu_a,\mu</math>.
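On a finite set the decomposition can be read off directly: the absolutely continuous part keeps the mass of <math>\nu</math> on points that <math>\mu</math> charges, and the singular part keeps the rest. The following is a minimal sketch under that finite-space assumption; `lebesgue_decomposition` and the sample measures are hypothetical.

```python
from fractions import Fraction


def lebesgue_decomposition(nu, mu):
    """Split a signed measure nu on a finite set into (nu_a, nu_s, density),
    where nu_a << mu, nu_s is singular to mu, and density = d(nu_a)/d(mu)
    on the points charged by mu. Measures are dicts point -> integer mass.
    """
    points = set(nu) | set(mu)
    charged = {x for x in points if mu.get(x, 0) != 0}
    nu_a = {x: (nu.get(x, 0) if x in charged else 0) for x in points}
    nu_s = {x: (0 if x in charged else nu.get(x, 0)) for x in points}
    density = {x: Fraction(nu_a[x], mu[x]) for x in charged}
    return nu_a, nu_s, density


# Illustrative data: mu ignores the point "c", and nu is signed.
mu = {"a": 1, "b": 2, "c": 0}
nu = {"a": 3, "b": -4, "c": 5}
nu_a, nu_s, density = lebesgue_decomposition(nu, mu)
```

In this example the singular part lives entirely on the point that <math>\mu</math> ignores, so <math>\nu_s</math> and <math>\mu</math> are concentrated on disjoint sets, and the density of <math>\nu_a</math> is the pointwise ratio on the remaining points.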
This article incorporates material from Radon–Nikodym theorem on PlanetMath, which is licensed under the Creative Commons Attribution/Share-Alike License.