Wald's equation
In probability theory, Wald's equation, Wald's identity, or Wald's lemma is an important identity that simplifies the calculation of the expected value of the sum of a random number of random quantities. In its simplest form, it relates the expectation of a sum of randomly many finite-mean, independent and identically distributed random variables to the expected number of terms in the sum and the random variables' common expectation, under the condition that the number of terms in the sum is independent of the summands.
The equation is named after the mathematician Abraham Wald. An identity for the second moment is given by the Blackwell–Girshick equation.
Basic version
Let (X_n)_{n∈ℕ} be a sequence of real-valued, independent and identically distributed random variables and let N be a nonnegative integer-valued random variable that is independent of the sequence (X_n)_{n∈ℕ}. Suppose that N and the X_n have finite expectations. Then
- <math>\operatorname{E}[X_1+\dots+X_N]=\operatorname{E}[N] \operatorname{E}[X_1]\,.</math>
Example
Roll a six-sided die. Take the number shown on the die (call it N) and roll that number of six-sided dice to get the numbers X_1, ..., X_N, and add up their values. By Wald's equation, the resulting value on average is
- <math>\operatorname{E}[N] \operatorname{E}[X] = \frac{1+2+3+4+5+6}6\cdot\frac{1+2+3+4+5+6}6 = \frac{441}{36} = \frac{49}{4} = 12.25\,.</math>
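This average is easy to check numerically. The following is a minimal Monte Carlo sketch of the dice experiment (the function name simulate_round, the seed, and the trial count are illustrative, not part of the example):

```python
import random

def simulate_round(rng: random.Random) -> int:
    """Roll one die to get N, then roll N further dice and return their sum."""
    n = rng.randint(1, 6)                             # the random number of terms N
    return sum(rng.randint(1, 6) for _ in range(n))   # X_1 + ... + X_N

rng = random.Random(0)
trials = 100_000
average = sum(simulate_round(rng) for _ in range(trials)) / trials
print(average)  # should be close to E[N] * E[X] = 3.5 * 3.5 = 12.25
```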
General version
Let (X_n)_{n∈ℕ} be an infinite sequence of real-valued random variables and let N be a nonnegative integer-valued random variable.
Assume that:
- (1) (X_n)_{n∈ℕ} are all integrable (finite-mean) random variables,
- (2) E[X_n 1_{N≥n}] = E[X_n] P(N ≥ n) for every natural number n, and
- (3) the infinite series satisfies
- <math>\sum_{n=1}^\infty\operatorname{E}\!\bigl[|X_n| 1_{\{N\ge n\}}\bigr]<\infty.</math>
Then the random sums
- <math>S_N:=\sum_{n=1}^NX_n,\qquad T_N:=\sum_{n=1}^N\operatorname{E}[X_n]</math>
are integrable and
- <math>\operatorname{E}[S_N]=\operatorname{E}[T_N].</math>
If, in addition,
- (4) X_1, X_2, ... all have the same expectation, and
- (5) N has finite expectation,
then
- <math>\operatorname{E}[S_N]=\operatorname{E}[N]\, \operatorname{E}[X_1].</math>
Remark: Usually, the name Wald's equation refers to this last equality.
Discussion of assumptions
Clearly, assumption (1) is needed to formulate assumption (2) and Wald's equation. Assumption (2) controls the amount of dependence allowed between the sequence (X_n)_{n∈ℕ} and the number N of terms; see the counterexample below for the necessity. Note that assumption (2) is satisfied when N is a stopping time for a sequence of independent random variables (X_n)_{n∈ℕ}. Assumption (3) is of a more technical nature, implying absolute convergence and therefore allowing arbitrary rearrangement of an infinite series in the proof.
If assumption (5) is satisfied, then assumption (3) can be strengthened to the simpler condition
- (6) there exists a real constant C such that E[|X_n| 1_{N≥n}] ≤ C P(N ≥ n) for all natural numbers n.
Indeed, using assumption (6),
- <math>\sum_{n=1}^\infty\operatorname{E}\!\bigl[|X_n|1_{\{N\ge n\}}\bigr]\le
C\sum_{n=1}^\infty\operatorname{P}(N\ge n),</math>
and the last series equals the expectation of N (see the derivation below), which is finite by assumption (5). Therefore, (5) and (6) imply assumption (3).
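The tail-sum identity used in the last step follows by interchanging the order of summation, which is permitted because all terms are non-negative:
- <math>\sum_{n=1}^\infty\operatorname{P}(N\ge n)=\sum_{n=1}^\infty\sum_{i=n}^\infty\operatorname{P}(N=i)=\sum_{i=1}^\infty\sum_{n=1}^i\operatorname{P}(N=i)=\sum_{i=1}^\infty i\,\operatorname{P}(N=i)=\operatorname{E}[N].</math>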
Assume in addition to (1) and (5) that
- (7) N is independent of the sequence (X_n)_{n∈ℕ}, and
- (8) there exists a real constant C such that E[|X_n|] ≤ C for all natural numbers n.
Then all the assumptions (1), (2), (5) and (6), hence also (3), are satisfied. In particular, the conditions (4) and (8) are satisfied if
- (9) the random variables (X_n)_{n∈ℕ} all have the same distribution.
Note that the random variables of the sequence (X_n)_{n∈ℕ} don't need to be independent.
The interesting point is to admit some dependence between the random number N of terms and the sequence (X_n)_{n∈ℕ}. A standard version is to assume (1), (5), (8) and the existence of a filtration (F_n)_{n∈ℕ₀} such that
- (10) N is a stopping time with respect to the filtration (F_n)_{n∈ℕ₀}, and
- (11) X_n and F_{n−1} are independent for every n ∈ ℕ.
Then (10) implies that the event {N ≥ n} = {N ≤ n − 1}^c lies in F_{n−1}, hence by (11) it is independent of X_n. This implies (2), and together with (8) it implies (6).
For convenience (see the proof below using the optional stopping theorem) and to specify the relation of the sequence (X_n)_{n∈ℕ} and the filtration (F_n)_{n∈ℕ₀}, the following additional assumption is often imposed:
- (12) the sequence (X_n)_{n∈ℕ} is adapted to the filtration (F_n)_{n∈ℕ}, meaning that X_n is F_n-measurable for every n ∈ ℕ.
Note that (11) and (12) together imply that the random variables (X_n)_{n∈ℕ} are independent.
Application
An application arises in actuarial science when the total claim amount within a certain time period, say one year, follows a compound Poisson process
- <math>S_N=\sum_{n=1}^NX_n</math>
arising from a random number N of individual insurance claims, whose sizes are described by the random variables (X_n)_{n∈ℕ}. Under the above assumptions, Wald's equation can be used to calculate the expected total claim amount when information about the average claim number per year and the average claim size is available. Under stronger assumptions, and with more information about the underlying distributions, Panjer's recursion can be used to calculate the distribution of S_N.
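As a numerical sketch of this use, suppose (purely for illustration, these are assumptions of the sketch rather than part of the statement above) that the number of claims per year is Poisson with mean 100 and that claim sizes are exponential with mean 2000:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
lam, mean_size = 100.0, 2000.0   # hypothetical mean claim count per year and mean claim size

# Wald's equation: E[S_N] = E[N] * E[X_1]
print("Wald's equation:", lam * mean_size)   # 200000.0

# Monte Carlo check: Poisson number of claims, exponentially distributed claim sizes
years = 10_000
totals = [rng.exponential(mean_size, rng.poisson(lam)).sum() for _ in range(years)]
print("Simulated mean:", np.mean(totals))    # close to 200000
```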
Examples
Example with dependent terms
Let N be an integrable, ℕ₀-valued random variable that is independent of an integrable, real-valued random variable Z with E[Z] = 0. Define X_n := (−1)^n Z for all n ∈ ℕ. Then assumptions (1), (5), (7), and (8) with C := E[|Z|] are satisfied, hence also (2) and (6), and Wald's equation applies. If the distribution of Z is not symmetric, then (9) does not hold. Note that, when Z is not almost surely equal to the zero random variable, then (11) and (12) cannot hold simultaneously for any filtration (F_n)_{n∈ℕ}, because Z cannot be independent of itself: independence would imply E[Z²] = (E[Z])² = 0, which is impossible when Z is not almost surely zero.
Example where the number of terms depends on the sequence
Let (X_n)_{n∈ℕ} be a sequence of independent, symmetric, {−1, +1}-valued random variables. For every n ∈ ℕ, let F_n be the σ-algebra generated by X_1, ..., X_n, and define N := n when X_n is the first random variable taking the value +1. Note that P(N = n) = 1/2^n, hence E[N] < ∞ by the ratio test. The assumptions (1), (5) and (9), hence (4) and (8) with C = 1, as well as (10), (11), and (12) hold, hence also (2) and (6), and Wald's equation applies. However, (7) does not hold, because N is defined in terms of the sequence (X_n)_{n∈ℕ}. Intuitively, one might expect to have E[S_N] > 0 in this example, because the summation stops right after a +1, thereby apparently creating a positive bias. However, Wald's equation shows that this intuition is misleading, as the simulation sketch below also illustrates.
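A minimal simulation sketch of this example (the seed and trial count are arbitrary choices of the sketch) confirms that the average of S_N is close to 0 rather than positive:

```python
import random

def stopped_sum(rng: random.Random) -> int:
    """Add symmetric +/-1 steps, stopping right after the first +1 (that index is N)."""
    total = 0
    while True:
        x = rng.choice((-1, 1))   # symmetric, {-1, +1}-valued X_n
        total += x
        if x == 1:                # N is the index of the first +1
            return total          # S_N

rng = random.Random(1)
trials = 100_000
print(sum(stopped_sum(rng) for _ in range(trials)) / trials)  # close to E[N] * E[X_1] = 2 * 0 = 0
```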
Counterexamples
A counterexample illustrating the necessity of assumption (2)
Consider a sequence (X_n)_{n∈ℕ} of i.i.d. (independent and identically distributed) random variables, taking each of the two values 0 and 1 with probability 1/2 (actually, only X_1 is needed in the following). Define N := 1 − X_1. Then S_N is identically equal to zero, hence E[S_N] = 0, but E[X_1] = 1/2 and E[N] = 1/2, and therefore Wald's equation does not hold. Indeed, the assumptions (1), (3), (4) and (5) are satisfied; however, the equation in assumption (2) holds for all n ∈ ℕ except for n = 1.
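Explicitly, at n = 1 the two sides of assumption (2) differ, since {N ≥ 1} = {X_1 = 0}:
- <math>\operatorname{E}\bigl[X_1 1_{\{N\ge 1\}}\bigr]=\operatorname{E}\bigl[X_1 1_{\{X_1=0\}}\bigr]=0\ne\tfrac14=\operatorname{E}[X_1]\,\operatorname{P}(N\ge 1).</math>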
A counterexample illustrating the necessity of assumption (3)
Very similar to the second example above, let (X_n)_{n∈ℕ} be a sequence of independent, symmetric random variables, where X_n takes each of the values 2^n and −2^n with probability 1/2. Let N be the first n ∈ ℕ such that X_n = 2^n. Then, as above, N has finite expectation, hence assumption (5) holds. Since E[X_n] = 0 for all n ∈ ℕ, assumptions (1) and (4) hold. However, since S_N = 2 almost surely while E[N] E[X_1] = 0, Wald's equation cannot hold.
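Indeed, the almost-sure value of S_N follows from a geometric sum: the first N − 1 variables take their negative values −2^i and the N-th takes the value 2^N, hence
- <math>S_N=-\sum_{i=1}^{N-1}2^i+2^N=-(2^N-2)+2^N=2.</math>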
Since N is a stopping time with respect to the filtration generated by (X_n)_{n∈ℕ}, assumption (2) holds; see above. Therefore, only assumption (3) can fail, and indeed, since
- <math>\{N\ge n\}=\{X_i=-2^{i} \text{ for } i=1,\ldots,n-1\}</math>
and therefore P(N ≥ n) = 1/2^{n−1} for every n ∈ ℕ, it follows that
- <math>\sum_{n=1}^\infty\operatorname{E}\!\bigl[|X_n|1_{\{N\ge n\}}\bigr]
=\sum_{n=1}^\infty 2^n\,\operatorname{P}(N\ge n) =\sum_{n=1}^\infty 2=\infty.</math>
A proof using the optional stopping theorem
Assume (1), (5), (8), (10), (11) and (12). Using assumption (1), define the sequence of random variables
- <math>M_n = \sum_{i=1}^n (X_i - \operatorname{E}[X_i]),\quad n\in{\mathbb N}_0.</math>
Assumption (11) implies that the conditional expectation of X_n given F_{n−1} equals E[X_n] almost surely for every n ∈ ℕ, hence (M_n)_{n∈ℕ₀} is a martingale with respect to the filtration (F_n)_{n∈ℕ₀} by assumption (12). Assumptions (5), (8) and (10) make sure that we can apply the optional stopping theorem, hence M_N = S_N − T_N is integrable and
- <math>\operatorname{E}[S_N-T_N]=\operatorname{E}[M_0]=0.</math> (13)
Due to assumption (8),
- <math>|T_N|=\biggl|\sum_{i=1}^N\operatorname{E}[X_i]\biggr| \le \sum_{i=1}^N\operatorname{E}[|X_i|]\le CN,</math>
and due to assumption (5) this upper bound is integrable. Hence we can add the expectation of T_N to both sides of equation (13) and obtain by linearity
- <math>\operatorname{E}[S_N]
=\operatorname{E}[T_N].</math>
Remark: Note that this proof does not cover the above example with dependent terms.
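To illustrate the martingale M_n and the optional stopping argument concretely, here is a minimal simulation sketch; the fair-die increments and the stopping rule (stop at the first 6) are invented for this illustration and satisfy assumptions (1), (5), (8) and (10)–(12):

```python
import random

def stopped_martingale(rng: random.Random) -> float:
    """Run M_n = sum of (X_i - E[X_i]) for i <= n, up to the stopping time N."""
    m = 0.0
    while True:
        x = rng.randint(1, 6)   # X_i: a fair die roll, E[X_i] = 3.5
        m += x - 3.5            # martingale increment X_i - E[X_i]
        if x == 6:              # N: the first time a 6 appears (a stopping time)
            return m            # M_N = S_N - T_N

rng = random.Random(2)
trials = 200_000
print(sum(stopped_martingale(rng) for _ in range(trials)) / trials)  # close to E[M_N] = 0
```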
General proof
This proof uses only Lebesgue's monotone and dominated convergence theorems. We prove the statement as given above in three steps.
Step 1: Integrability of the random sum S_N
We first show that the random sum S_N is integrable. Define the partial sums
- <math>S_i=\sum_{n=1}^iX_n,\quad i\in{\mathbb N}_0.</math> (14)
Since N takes its values in ℕ₀ and since S_0 = 0, it follows that
- <math>|S_N|=\sum_{i=1}^\infty|S_i|\,1_{\{N=i\}}.</math>
The Lebesgue monotone convergence theorem implies that
- <math>\operatorname{E}[|S_N|]=\sum_{i=1}^\infty\operatorname{E}[|S_i|\,1_{\{N=i\}}].</math>
By the triangle inequality,
- <math>|S_i|\le\sum_{n=1}^i|X_n|,\quad i\in{\mathbb N}.</math>
Using this upper estimate and changing the order of summation (which is permitted because all terms are non-negative), we obtain
- <math>\operatorname{E}[|S_N|]\le\sum_{i=1}^\infty\sum_{n=1}^i\operatorname{E}[|X_n|\,1_{\{N=i\}}]=\sum_{n=1}^\infty\sum_{i=n}^\infty\operatorname{E}[|X_n|\,1_{\{N=i\}}]=\sum_{n=1}^\infty\operatorname{E}[|X_n|\,1_{\{N\ge n\}}],</math> (15)
where the last equality follows using the monotone convergence theorem. By assumption (3), the infinite series on the right-hand side of (15) converges, hence S_N is integrable.
Step 2: Integrability of the random sum T_N
We now show that the random sum T_N is integrable. Define the partial sums
- <math>T_i=\sum_{n=1}^i\operatorname{E}[X_n],\quad i\in{\mathbb N}_0,</math> (16)
of real numbers. Since N takes its values in ℕ₀ and since T_0 = 0, it follows that
- <math>|T_N|=\sum_{i=1}^\infty|T_i|\,1_{\{N=i\}}.</math>
As in step 1, the Lebesgue monotone convergence theorem implies that
- <math>\operatorname{E}[|T_N|]=\sum_{i=1}^\infty |T_i|\operatorname{P}(N=i).</math>
By the triangle inequality,
- <math>|T_i|\le\sum_{n=1}^i\bigl|\!\operatorname{E}[X_n]\bigr|,\quad i\in{\mathbb N}.</math>
Using this upper estimate and changing the order of summation (which is permitted because all terms are non-negative), we obtain
- <math>\operatorname{E}[|T_N|]\le\sum_{n=1}^\infty\bigl|\!\operatorname{E}[X_n]\bigr|\operatorname{P}(N\ge n).</math> (17)
By assumption (2),
- <math>\bigl|\!\operatorname{E}[X_n]\bigr|\operatorname{P}(N\ge n)
=\bigl|\!\operatorname{E}[X_n1_{\{N\ge n\}}]\bigr| \le\operatorname{E}[|X_n|1_{\{N\ge n\}}],\quad n\in{\mathbb N}.</math>
Substituting this into (17) yields
- <math>\operatorname{E}[|T_N|]\le\sum_{n=1}^\infty\operatorname{E}[|X_n|1_{\{N\ge n\}}],</math>
which is finite by assumption (3), hence T_N is integrable.
Step 3: Proof of the identity
To prove Wald's equation, we essentially go through the same steps again without the absolute value, making use of the integrability of the random sums S_N and T_N in order to show that they have the same expectation.
Using the dominated convergence theorem with dominating random variable |S_N| and the definition of the partial sums S_i given in (14), it follows that
- <math>\operatorname{E}[S_N]=\sum_{i=1}^\infty\operatorname{E}[S_i1_{\{N=i\}}]
=\sum_{i=1}^\infty\sum_{n=1}^i\operatorname{E}[X_n1_{\{N=i\}}].</math>
Due to the absolute convergence proved in (15) above using assumption (3), we may rearrange the summation and obtain
- <math>\operatorname{E}[S_N]=\sum_{n=1}^\infty\sum_{i=n}^\infty\operatorname{E}[X_n1_{\{N=i\}}]=\sum_{n=1}^\infty\operatorname{E}[X_n1_{\{N\ge n\}}],</math>
where we used assumption (1) and the dominated convergence theorem with dominating random variable |X_n| for the second equality. Due to assumption (2) and the σ-additivity of the probability measure,
- <math>\begin{align}\operatorname{E}[X_n1_{\{N\ge n\}}] &=\operatorname{E}[X_n]\operatorname{P}(N\ge n)\\
&=\operatorname{E}[X_n]\sum_{i=n}^\infty\operatorname{P}(N=i) =\sum_{i=n}^\infty\operatorname{E}\!\bigl[\operatorname{E}[X_n]1_{\{N=i\}}\bigr].\end{align}</math>
Substituting this result into the previous equation, rearranging the summation (which is permitted due to absolute convergence, see (15) above), and using linearity of expectation and the definition of the partial sums T_i of expectations given in (16), we obtain
- <math>\operatorname{E}[S_N]=\sum_{i=1}^\infty\sum_{n=1}^i\operatorname{E}\!\bigl[\operatorname{E}[X_n]1_{\{N=i\}}\bigr]=\sum_{i=1}^\infty\operatorname{E}[\underbrace{T_i1_{\{N=i\}}}_{=\,T_N1_{\{N=i\}}}].</math>
By using dominated convergence again, with dominating random variable |T_N|,
- <math>\operatorname{E}[S_N]=\operatorname{E}\!\biggl[T_N\underbrace{\sum_{i=1}^\infty1_{\{N=i\}}}_{=\,1_{\{N\ge1\}}}\biggr]=\operatorname{E}[T_N].</math>
If assumptions (4) and (5) are satisfied, then by linearity of expectation,
- <math>\operatorname{E}[T_N]=\operatorname{E}\!\biggl[\sum_{n=1}^N \operatorname{E}[X_n]\biggr]=\operatorname{E}[X_1]\operatorname{E}\!\biggl[\underbrace{\sum_{n=1}^N 1}_{=\,N}\biggr]=\operatorname{E}[N]\operatorname{E}[X_1].</math>
This completes the proof.
Further generalizations
- Wald's equation can be transferred to ℝ^d-valued random variables (X_n)_{n∈ℕ} by applying the one-dimensional version to every component.
- If (X_n)_{n∈ℕ} are Bochner-integrable random variables taking values in a Banach space, then the general proof above can be adjusted accordingly.