=== Importance of valid models/assumptions ===
{{See also|Statistical model validation}}
[[File:Normality_Histogram.png|thumb|A histogram used to assess the assumption of normality: data consistent with normality spread evenly under the superimposed bell curve.]]

Whatever level of assumption is made, correctly calibrated inference, in general, requires these assumptions to be correct; i.e., that the data-generating mechanisms really have been correctly specified.

Incorrect assumptions of [[Simple random sample|'simple' random sampling]] can invalidate statistical inference.<ref>Kruskal 1988</ref> More complex semi- and fully parametric assumptions are also cause for concern. For example, incorrectly assuming the Cox model can in some cases lead to faulty conclusions.<ref>[[David A. Freedman|Freedman, D.A.]] (2008) "Survival analysis: An epidemiological hazard?". ''The American Statistician'', 62: 110–119. (Reprinted as Chapter 11 (pages 169–192) of Freedman (2010)).</ref> Incorrect assumptions of normality in the population also invalidate some forms of regression-based inference.<ref>Berk, R. (2003) ''Regression Analysis: A Constructive Critique (Advanced Quantitative Techniques in the Social Sciences) (v. 11)'' Sage Publications. {{ISBN|0-7619-2904-5}}</ref> The use of '''any''' parametric model is viewed skeptically by most experts in sampling human populations: "most sampling statisticians, when they deal with confidence intervals at all, limit themselves to statements about [estimators] based on very large samples, where the central limit theorem ensures that these [estimators] will have distributions that are nearly normal."<ref name="Brewer2">{{cite book |last=Brewer |first=Ken |title=Combined Survey Sampling Inference: Weighing of Basu's Elephants |publisher=Hodder Arnold |year=2002 |isbn=978-0340692295 |page=6}}</ref> In particular, a normal distribution "would be a totally unrealistic and catastrophically unwise assumption to make if we were dealing with any kind of economic population."<ref name="Brewer2" /> Here, the central limit theorem states that the distribution of the sample mean "for very large samples" is approximately normally distributed, if the population distribution is not heavy-tailed.

====Approximate distributions====
{{Main|Statistical distance|Asymptotic theory (statistics)|Approximation theory}}

Given the difficulty in specifying exact distributions of sample statistics, many methods have been developed for approximating these.

With finite samples, [[Approximation theory|approximation results]] measure how close a limiting distribution approaches the statistic's [[sample distribution]]: for example, with 10,000 independent samples the [[normal distribution]] approximates (to two digits of accuracy) the distribution of the [[sample mean]] for many population distributions, by the [[Berry–Esseen theorem]].<ref name="JHJ2">Jörgen Hoffman-Jörgensen's ''Probability With a View Towards Statistics'', Volume I. Page 399 {{full citation needed|date=November 2012}}</ref> Yet for many practical purposes, the normal approximation to the sample mean's distribution is adequate with as few as 10 independent samples, according to simulation studies and statisticians' experience.<ref name="JHJ2" />
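The adequacy of the normal approximation for small samples can be checked directly by simulation. The following sketch (an illustration only, not drawn from the cited sources) compares simulated tail probabilities of the standardized sample mean with the standard normal tail probabilities suggested by the central limit theorem; the exponential population and the sample size n = 10 are assumptions chosen for the example.

<syntaxhighlight lang="python">
import math
import numpy as np

rng = np.random.default_rng(0)  # seeded generator for reproducibility

# Illustrative assumptions (not from the article): a skewed exponential(1)
# population and samples of size n = 10, echoing the "10 or more
# independent samples" heuristic discussed above.
n, reps = 10, 200_000
means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

# Standardize the simulated sample means; the central limit theorem
# suggests these are roughly standard normal (exponential(1) has
# population mean 1 and standard deviation 1).
z = (means - 1.0) * math.sqrt(n)

for c in (1.0, 1.645, 1.96):
    empirical = (z > c).mean()
    normal = 0.5 * math.erfc(c / math.sqrt(2))  # standard normal P(Z > c)
    print(f"P(Z > {c:5.3f}):  simulated {empirical:.4f}   normal {normal:.4f}")
</syntaxhighlight>

Under these assumptions the simulated and normal tail probabilities agree to roughly two decimal places, despite the skewness of the population.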
Following Kolmogorov's work in the 1950s, advanced statistics uses [[approximation theory]] and [[functional analysis]] to quantify the error of approximation. In this approach, the [[metric geometry]] of [[probability distribution]]s is studied; approximation error is quantified with, for example, the [[Kullback–Leibler divergence]], [[Bregman divergence]], and the [[Hellinger distance]].<ref>Le Cam (1986) {{page needed|date=June 2011}}</ref><ref>Erik Torgerson (1991) ''Comparison of Statistical Experiments'', volume 36 of Encyclopedia of Mathematics. Cambridge University Press. {{full citation needed|date=November 2012}}</ref><ref>{{cite book |author1=Liese, Friedrich |title=Statistical Decision Theory: Estimation, Testing, and Selection |author2=Miescke, Klaus-J. |publisher=Springer |year=2008 |isbn=978-0-387-73193-3 |name-list-style=amp}}</ref>

With indefinitely large samples, [[Asymptotic theory (statistics)|limiting results]] like the [[central limit theorem]] describe the sample statistic's limiting distribution, if one exists. Limiting results are not statements about finite samples, and indeed are irrelevant to finite samples.<ref>Kolmogorov (1963, p.369): "The frequency concept, <!-- comma missing in original --> based on the notion of limiting frequency as the number of trials increases to infinity, does not contribute anything to substantiate the applicability of the results of probability theory to real practical problems where we have always to deal with a finite number of trials".</ref><ref>"Indeed, limit theorems 'as <math>n</math> tends to infinity' are logically devoid of content about what happens at any particular <math>n</math>. All they can do is suggest certain approaches whose performance must then be checked on the case at hand." — Le Cam (1986) (page xiv)</ref><ref>Pfanzagl (1994): "The crucial drawback of asymptotic theory: What we expect from asymptotic theory are results which hold approximately . . . . What asymptotic theory has to offer are limit theorems." (page ix) "What counts for applications are approximations, not limits." (page 188)</ref> However, the asymptotic theory of limiting distributions is often invoked for work with finite samples. For example, limiting results are often invoked to justify the [[generalized method of moments]] and the use of [[generalized estimating equation]]s, which are popular in [[econometrics]] and [[biostatistics]]. The magnitude of the difference between the limiting distribution and the true distribution (formally, the 'error' of the approximation) can be assessed using simulation<!-- and approximation results -->.<ref>Pfanzagl (1994): "By taking a limit theorem as being approximately true for large sample sizes, we commit an error the size of which is unknown. [. . .] Realistic information about the remaining errors may be obtained by simulations." (page ix)</ref> The heuristic application of limiting results to finite samples is common practice in many applications, especially with low-dimensional [[Statistical model|models]] with [[Logarithmically concave function|log-concave]] [[Likelihood function|likelihoods]] (such as with one-parameter [[exponential families]]).
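One way to carry out such an assessment is to simulate the finite-sample distribution of a statistic and measure its statistical distance from the limiting distribution. The sketch below (an illustration, not taken from the cited sources) computes a coarse, discretized estimate of the [[Hellinger distance]] between the standardized sample mean of an exponential population and its standard normal limit; the exponential population, the binning over [−4, 4], and the sample sizes are assumptions made for the example.

<syntaxhighlight lang="python">
import math
import numpy as np

rng = np.random.default_rng(1)

def normal_cdf(x):
    """Standard normal CDF via the error function (no SciPy needed)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def hellinger_vs_normal(samples, bins=60, lo=-4.0, hi=4.0):
    """Coarse histogram-based estimate of the Hellinger distance between
    the empirical distribution of `samples` and the standard normal;
    mass outside [lo, hi] is ignored, so this is only an approximation."""
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(samples, bins=edges)
    p = p / len(samples)                          # empirical bin probabilities
    q = np.diff([normal_cdf(e) for e in edges])   # normal bin probabilities
    # For discretized distributions, H(p, q) = sqrt(1 - sum_i sqrt(p_i q_i)).
    return math.sqrt(max(0.0, 1.0 - float(np.sum(np.sqrt(p * q)))))

# Illustrative assumption: exponential(1) population as before; the
# estimated distance to the normal limit should shrink as n grows.
for n in (5, 10, 50, 200):
    means = rng.exponential(1.0, size=(100_000, n)).mean(axis=1)
    z = (means - 1.0) * math.sqrt(n)              # standardized sample means
    print(f"n = {n:4d}:  Hellinger distance (discretized) {hellinger_vs_normal(z):.4f}")
</syntaxhighlight>

Under these assumptions, the printed distances decrease with the sample size, giving a concrete, if rough, sense of how quickly the limiting approximation becomes trustworthy.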