==== Variance of the weighted sum (''pwr''-estimator for totals) ====

If the population size ''N'' is known, we can estimate the population mean using <math>\hat{\bar Y}_{\text{known } N} = \frac{\hat Y_{pwr}}{N} \approx \frac{\sum_{i=1}^n w_i y'_i}{N} </math>.

If the [[sampling design]] is one that results in a fixed sample size ''n'' (such as in [[Probability-proportional-to-size sampling|pps sampling]]), then the variance of this estimator is:

: <math>
\operatorname{Var} \left( \hat{\bar Y}_{\text{known }N} \right) = \frac{1}{N^2} \frac{n}{n-1} \sum_{i=1}^n \left( w_i y_i - \overline{wy} \right)^2
</math>

{{math proof|proof=
The general formula can be developed as follows:
: <math>\hat{\bar Y}_{\text{known } N} = \frac{\hat Y_{pwr}}{N} = \frac{\frac{1}{n} \sum_{i=1}^n \frac{y'_i}{p_i} }{N} \approx \frac{\sum_{i=1}^n \frac{y'_i}{\pi_i}}{N} = \frac{\sum_{i=1}^n w_i y'_i}{N}. </math>

The population total is denoted as <math>Y = \sum_{i=1}^N y_i</math>, and it may be estimated by the (unbiased) [[Horvitz–Thompson estimator]], also called the ''<math>\pi</math>''-estimator. The total can also be estimated using the ''pwr''-estimator (i.e.: the <math>p</math>-expanded with replacement estimator, or "probability with replacement" estimator). With the above notation, it is: <math>\hat Y_{pwr} = \frac{1}{n} \sum_{i=1}^n \frac{y'_i}{p_i} = \sum_{i=1}^n \frac{y'_i}{n p_i} \approx \sum_{i=1}^n \frac{y'_i}{\pi_i} = \sum_{i=1}^n w_i y'_i</math>.<ref name = "sarndal1992" />{{rp|51}}

The estimated variance of the ''pwr''-estimator is given by:<ref name = "sarndal1992" />{{rp|52}}
<math display="block">\operatorname{Var}(\hat Y_{pwr}) = \frac{n}{n-1} \sum_{i=1}^n \left( w_i y_i - \overline{wy} \right)^2 </math>
where <math>\overline{wy} = \sum_{i=1}^n \frac{w_i y_i}{n} </math>.

This formula appears in Sarndal et al. (1992) (and is also presented in Cochran 1977), although written differently.<ref name = "sarndal1992" />{{rp|52}}<ref name = "Cochran1977" />{{rp|307 (11.35)}} The first line below is the variance as originally written; the remaining lines develop the weighted version, using <math>\pi_i \approx n p_i</math>:

<math display="block">\begin{align}
\operatorname{Var}(\hat Y_\text{pwr}) & = \frac{1}{n} \frac{1}{n-1} \sum_{i=1}^n \left( \frac{y_i}{p_i} - \hat Y_{pwr} \right)^2 \\
& = \frac{1}{n} \frac{1}{n-1} \sum_{i=1}^n \left( \frac{n}{n} \frac{y_i}{p_i} - \frac{n}{n} \sum_{j=1}^n w_j y_j \right)^2 = \frac{1}{n} \frac{1}{n-1} \sum_{i=1}^n \left( n \frac{y_i}{\pi_i} - n \frac{\sum_{j=1}^n w_j y_j}{n} \right)^2 \\
& = \frac{n^2}{n} \frac{1}{n-1} \sum_{i=1}^n \left( w_i y_i - \overline{wy} \right)^2 \\
& = \frac{n}{n-1} \sum_{i=1}^n \left( w_i y_i - \overline{wy} \right)^2,
\end{align}</math>

which recovers the formula stated above.
}}
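As an illustrative sketch only (not taken from the cited sources; the function name and the sample values are hypothetical, and it assumes the weights <math>w_i = 1/\pi_i</math> are supplied directly), the following Python code evaluates the estimators above: the ''pwr''-estimate of the total, the estimated mean for known ''N'', and the fixed-sample-size variance formula.

<syntaxhighlight lang="python">
import numpy as np

def pwr_mean_and_variance(y, w, N):
    """Illustrative sketch: pwr-estimate of the mean (known N) and its
    estimated variance under a fixed-sample-size design."""
    y = np.asarray(y, dtype=float)   # observed values y_i for the n sampled elements
    w = np.asarray(w, dtype=float)   # weights w_i = 1 / pi_i (inverse inclusion probabilities)
    n = y.size

    wy = w * y                       # terms w_i * y_i
    total_hat = wy.sum()             # pwr-estimate of the total Y
    mean_hat = total_hat / N         # estimate of the mean when N is known
    wy_bar = wy.mean()               # overline{wy}
    var_total = n / (n - 1) * ((wy - wy_bar) ** 2).sum()
    var_mean = var_total / N**2      # divide by N^2 for the variance of the mean
    return mean_hat, var_mean

# Hypothetical example values
print(pwr_mean_and_variance(y=[2.0, 5.0, 1.0, 7.0],
                            w=[100.0, 80.0, 120.0, 90.0], N=400))
</syntaxhighlight>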
An alternative formula, for when the sampling design yields a random sample size (as in [[Poisson sampling]]), is presented in Sarndal et al. (1992) as:<ref name = "sarndal1992" />{{rp|182}}

<math display="block">\operatorname{Var}(\hat{\bar Y}_{\text{pwr (known }N\text{)}}) = \frac{1}{N^2} \sum_{i=1}^n \sum_{j=1}^n \left( \check{\Delta}_{ij} \check{y}_i \check{y}_j \right) </math>

with <math>\check{y}_i = \frac{y_i}{\pi_i}</math>. Here <math>C(I_i, I_j) = \pi_{ij} - \pi_{i}\pi_{j} = \Delta_{ij} </math>, where <math>\pi_{ij}</math> is the probability of selecting both ''i'' and ''j'',<ref name = "sarndal1992" />{{rp|36}} and <math>\check{\Delta}_{ij} = 1 - \frac{\pi_{i}\pi_{j}}{\pi_{ij}}</math>, which for <math>i=j</math> reduces to <math>\check{\Delta}_{ii} = 1 - \frac{\pi_{i}\pi_{i}}{\pi_{i}} = 1- \pi_{i}</math>.<ref name = "sarndal1992" />{{rp|43}}

If the selection probabilities are uncorrelated (i.e.: <math>\forall i \neq j: C(I_i, I_j) = 0</math>), and the selection probability of each element is very small, then:

: <math>\operatorname{Var}(\hat{\bar Y}_{\text{pwr (known }N\text{)}}) = \frac{1}{N^2} \sum_{i=1}^n \left( w_i y_i \right)^2 </math>

{{math proof|proof=
Because the selection probabilities are uncorrelated, <math>\check{\Delta}_{ij} = 0</math> for <math>i \neq j</math>, so only the diagonal terms remain; and because each <math>\pi_i</math> is very small, <math>(1- \pi_i) \approx 1</math>. Therefore:
<math display="block">\begin{align}
\operatorname{Var}(\hat{\bar Y}_{\text{pwr (known } N\text{)}}) & = \frac{1}{N^2} \sum_{i=1}^n \sum_{j=1}^n \left( \check{\Delta}_{ij} \check{y}_i \check{y}_j \right) \\
& = \frac{1}{N^2} \sum_{i=1}^n \left( \check{\Delta}_{ii} \check{y}_i \check{y}_i \right) \\
& = \frac{1}{N^2} \sum_{i=1}^n \left( (1- \pi_i) \frac{y_i}{\pi_i} \frac{y_i}{\pi_i} \right) \\
& \approx \frac{1}{N^2} \sum_{i=1}^n \left( w_i y_i \right)^2
\end{align}</math>
}}
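For the random-sample-size case, a similar illustrative Python sketch (again not from the cited sources; the inclusion probabilities and ''N'' below are hypothetical) keeps only the diagonal <math>\check{\Delta}_{ii}</math> terms and compares them with the small-<math>\pi_i</math> approximation above.

<syntaxhighlight lang="python">
import numpy as np

def poisson_mean_variance(y, pi, N):
    """Illustrative sketch: variance of the estimated mean when the sample
    size is random and the selection indicators are uncorrelated."""
    y = np.asarray(y, dtype=float)    # observed values y_i
    pi = np.asarray(pi, dtype=float)  # inclusion probabilities pi_i
    w = 1.0 / pi                      # weights w_i = 1 / pi_i

    # Diagonal terms only: (1/N^2) * sum_i (1 - pi_i) * (y_i / pi_i)^2
    var_diagonal = ((1.0 - pi) * (y / pi) ** 2).sum() / N**2

    # Small-pi approximation: (1/N^2) * sum_i (w_i * y_i)^2
    var_approx = ((w * y) ** 2).sum() / N**2
    return var_diagonal, var_approx

# Hypothetical example: with small pi_i the two values nearly coincide
y = np.array([2.0, 5.0, 1.0, 7.0])
pi = np.array([0.010, 0.0125, 0.008, 0.011])
print(poisson_mean_variance(y, pi, N=400))
</syntaxhighlight>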