==More on cluster sampling==

===Two-stage cluster sampling===
Two-stage cluster sampling, a simple case of [[multistage sampling]], is obtained by selecting cluster samples in the first stage and then selecting a sample of elements from every sampled cluster. Consider a population of ''N'' clusters in total. In the first stage, ''n'' clusters are selected using the ordinary cluster sampling method. In the second stage, [[simple random sampling]] is usually used.<ref>{{cite book|last=Ahmed|first=Saifuddin|title=Methods in Sample Surveys|year=2009|publisher=The Johns Hopkins University and Saifuddin Ahmed|url=http://ocw.jhsph.edu/courses/statmethodsforsamplesurveys/PDFs/Lecture5.pdf |archive-url=https://web.archive.org/web/20130928180152/http://ocw.jhsph.edu/courses/statmethodsforsamplesurveys/PDFs/Lecture5.pdf |archive-date=2013-09-28 |url-status=live}}</ref> It is applied separately within every cluster, and the numbers of elements selected from different clusters need not be equal. The total number of clusters ''N'', the number of clusters selected ''n'', and the numbers of elements drawn from the selected clusters must be fixed in advance by the survey designer. Two-stage cluster sampling aims to minimize survey costs while controlling the uncertainty of the estimates of interest.<ref>{{cite book| author = Daniel Pfeffermann|author2=C. Radhakrishna Rao| title = Handbook of Statistics Vol.29A Sample Surveys: Theory, Methods and Inference| url = https://books.google.com/books?id=waHxMgEACAAJ| year = 2009| publisher = Elsevier B.V.| isbn = 978-0-444-53124-7 }}</ref> This method can be used in health and social sciences.
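The two-stage procedure described above can be sketched as follows (a minimal illustration, assuming the second stage uses simple random sampling without replacement; all names and data are hypothetical, not from the cited sources):

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, sizes):
    """Draw a two-stage cluster sample.

    clusters: dict mapping cluster id -> list of population elements
              (a population of N = len(clusters) clusters).
    n_clusters: n, the number of clusters selected in the first stage.
    sizes: dict mapping cluster id -> number of elements to draw in the
           second stage; fixed in advance and not necessarily equal.
    """
    # Stage 1: ordinary cluster sampling -- a simple random sample of clusters.
    chosen = random.sample(sorted(clusters), n_clusters)
    # Stage 2: simple random sampling applied separately within each cluster.
    return {c: random.sample(clusters[c], sizes[c]) for c in chosen}

# Hypothetical population: N = 4 clusters of 5 elements each.
# Select n = 2 clusters, then unequal numbers of elements per cluster.
population = {c: [f"{c}{i}" for i in range(5)] for c in "ABCD"}
sample = two_stage_cluster_sample(population, 2, {"A": 2, "B": 3, "C": 2, "D": 3})
```

Note that every second-stage sample size is specified before sampling begins, matching the requirement that the design be fixed in advance.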
For instance, researchers used two-stage cluster sampling to generate a representative sample of the Iraqi population for mortality surveys.<ref>{{cite journal|author=LP Galway |author2=Nathaniel Bell |author3=Al S SAE |author4=Amy Hagopian |author5=Gilbert Burnham |author6=Abraham Flaxman |author7=Wiliam M Weiss |author8=Julie Rajaratnam |author9=Tim K Takaro|title=A two-stage cluster sampling method using gridded population data, a GIS, and Google EarthTM imagery in a population-based mortality survey in Iraq|journal=International Journal of Health Geographics|volume=11 |pages=12 |date=27 April 2012|issue=1 |doi=10.1186/1476-072X-11-12 |pmid=22540266 |pmc=3490933 |doi-access=free |bibcode=2012IJHGg..11...12G }}</ref> Sampling by this method can be quicker and more reliable than other methods, which is why it is now used frequently.

===Inference when the number of clusters is small===
Cluster sampling methods can lead to significant bias when working with a small number of clusters. For instance, it can be necessary to cluster at the state or city level, units that may be small and fixed in number. Microeconometric methods for panel data often use short panels, which is analogous to having few observations per cluster and many clusters. The small-cluster problem can be viewed as an incidental parameter problem.<ref>Cameron A. C. and P. K. Trivedi (2005): Microeconometrics: Methods and Applications. Cambridge University Press, New York.</ref> While the point estimates can be estimated reasonably precisely when the number of observations per cluster is sufficiently high, we need the number of clusters <math>G\rightarrow \infty</math> for the asymptotics to kick in. If the number of clusters is low, the estimated covariance matrix can be downward biased.<ref name="CameronMiller">Cameron, C. and D. L. Miller (2015): A Practitioner's Guide to Cluster-Robust Inference. Journal of Human Resources 50(2), pp. 317–372.</ref>

Small numbers of clusters are a risk when there is serial correlation or when there is intraclass correlation, as in the Moulton context. With few clusters, we tend to underestimate the serial correlation across observations when a random shock occurs, or the intraclass correlation in a Moulton setting.<ref name="AngristPischke">Angrist, J.D. and J.-S. Pischke (2009): Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press, New Jersey.</ref> Several studies have highlighted the consequences of serial correlation and the small-cluster problem.<ref>Bertrand, M., E. Duflo and S. Mullainathan (2004): How Much Should We Trust Differences-in-Differences Estimates? Quarterly Journal of Economics 119(1), pp. 249–275.</ref><ref>Kezdi, G. (2004): Robust Standard Error Estimation in Fixed-Effect Panel Models. Hungarian Statistical Review 9, pp. 95–116.</ref>

In the framework of the Moulton factor, an intuitive explanation of the small-cluster problem can be derived from the formula for the Moulton factor. Assume for simplicity that the number of observations per cluster is fixed at ''n''. Below, <math>V_{c}(\hat\beta)</math> stands for the covariance matrix adjusted for clustering, <math>V(\hat\beta)</math> stands for the covariance matrix not adjusted for clustering, and <math>\rho</math> stands for the intraclass correlation:

: <math>\frac{V_{c}(\hat\beta)}{V(\hat\beta)}=1+(n-1)\rho</math>

The ratio on the left-hand side indicates how much the unadjusted scenario overestimates the precision; a high value therefore means a strong downward bias of the estimated covariance matrix. A small-cluster problem can be interpreted as a large ''n'': when the total sample size is fixed and the number of clusters is low, the number of observations within each cluster can be high. It follows that inference, when the number of clusters is small, will not have the correct coverage.<ref name="AngristPischke"/> Several solutions for the small-cluster problem have been proposed.
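Before turning to those solutions, the size of the bias implied by the Moulton factor formula above is easy to compute directly (a short illustrative sketch; the function name and example values are hypothetical):

```python
def moulton_factor(n, rho):
    """Ratio of the cluster-adjusted to the unadjusted variance,
    V_c(beta) / V(beta) = 1 + (n - 1) * rho, for a fixed cluster
    size n and intraclass correlation rho."""
    return 1 + (n - 1) * rho

# Illustrative values: 30 observations per cluster and a modest
# intraclass correlation of 0.1 already inflate the variance ratio to
# 1 + 29 * 0.1 = 3.9, i.e. conventional standard errors are understated
# by a factor of about sqrt(3.9), roughly 2.
factor = moulton_factor(30, 0.1)
```

With ''n'' = 1 (every observation its own cluster) the factor reduces to 1, so the bias grows with the within-cluster sample size, as the text explains.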
One can use a bias-corrected cluster-robust variance matrix, T-distribution adjustments, or bootstrap methods with asymptotic refinements, such as the percentile-t or wild bootstrap, which can lead to improved finite-sample inference.<ref name="CameronMiller"/> Cameron, Gelbach and Miller (2008) provide microsimulations for different methods and find that the wild bootstrap performs well in the face of a small number of clusters.<ref>Cameron, C., J. Gelbach and D. L. Miller (2008): Bootstrap-Based Improvements for Inference with Clustered Errors. The Review of Economics and Statistics 90, pp. 414–427.</ref>
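A minimal sketch of the wild cluster bootstrap mentioned above, in its simple percentile form rather than the percentile-t refinement (all data and names are hypothetical, assuming a single regressor with intercept):

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_fit(x, y):
    """Intercept and slope of a one-regressor OLS fit."""
    slope = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    return y.mean() - slope * x.mean(), slope

def wild_cluster_bootstrap(x, y, cluster, reps=999):
    """Wild cluster bootstrap for the OLS slope, using Rademacher weights.

    One +/-1 weight is drawn per cluster and applied to all residuals in
    that cluster, preserving the within-cluster correlation structure in
    each resample. Returns the original slope and its bootstrap draws.
    """
    intercept, slope = ols_fit(x, y)
    fitted = intercept + slope * x
    resid = y - fitted
    ids = np.unique(cluster)
    draws = np.empty(reps)
    for b in range(reps):
        w = dict(zip(ids, rng.choice([-1.0, 1.0], size=len(ids))))
        y_star = fitted + resid * np.array([w[g] for g in cluster])
        draws[b] = ols_fit(x, y_star)[1]
    return slope, draws

# Hypothetical data: G = 5 clusters of 10 observations with a shared
# cluster-level shock, so errors are correlated within clusters.
cluster = np.repeat(np.arange(5), 10)
x = rng.normal(size=50) + cluster
y = 2.0 * x + rng.normal(size=5)[cluster] + rng.normal(size=50)
slope, draws = wild_cluster_bootstrap(x, y, cluster)
ci = np.percentile(draws, [2.5, 97.5])  # simple percentile interval
```

A percentile-t version would instead bootstrap a studentized statistic (the slope divided by its cluster-robust standard error in each resample), which is the variant with asymptotic refinements discussed above.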