Statistical inference
====Approximate distributions====
{{Main|Statistical distance|Asymptotic theory (statistics)|Approximation theory}}
Given the difficulty of specifying exact distributions of sample statistics, many methods have been developed for approximating them.

With finite samples, [[Approximation theory|approximation results]] measure how closely a limiting distribution approaches the statistic's [[sampling distribution]]. For example, with 10,000 independent samples the [[normal distribution]] approximates (to two digits of accuracy) the distribution of the [[sample mean]] for many population distributions, by the [[Berry–Esseen theorem]].<ref name="JHJ2">Jörgen Hoffman-Jörgensen's ''Probability With a View Towards Statistics'', Volume I. Page 399 {{full citation needed|date=November 2012}}</ref> Yet for many practical purposes, the normal distribution provides a good approximation to the sample mean's distribution when there are 10 (or more) independent samples, according to simulation studies and statisticians' experience.<ref name="JHJ2" />

Following Kolmogorov's work in the 1950s, advanced statistics uses [[approximation theory]] and [[functional analysis]] to quantify the error of approximation. In this approach, the [[metric geometry]] of [[probability distribution]]s is studied; approximation error is quantified with, for example, the [[Kullback–Leibler divergence]], the [[Bregman divergence]], and the [[Hellinger distance]].<ref>Le Cam (1986) {{page needed|date=June 2011}}</ref><ref>Erik Torgerson (1991) ''Comparison of Statistical Experiments'', volume 36 of Encyclopedia of Mathematics. Cambridge University Press. {{full citation needed|date=November 2012}}</ref><ref>{{cite book |author1=Liese, Friedrich |title=Statistical Decision Theory: Estimation, Testing, and Selection |author2=Miescke, Klaus-J. |publisher=Springer |year=2008 |isbn=978-0-387-73193-3 |name-list-style=amp}}</ref>

With indefinitely large samples, [[Asymptotic theory (statistics)|limiting results]] such as the [[central limit theorem]] describe the sample statistic's limiting distribution, if one exists. Limiting results are not statements about finite samples, and indeed are irrelevant to finite samples.<ref>Kolmogorov (1963, p.369): "The frequency concept, <!-- comma missing in original --> based on the notion of limiting frequency as the number of trials increases to infinity, does not contribute anything to substantiate the applicability of the results of probability theory to real practical problems where we have always to deal with a finite number of trials".</ref><ref>"Indeed, limit theorems 'as <math>n</math> tends to infinity' are logically devoid of content about what happens at any particular <math>n</math>. All they can do is suggest certain approaches whose performance must then be checked on the case at hand." — Le Cam (1986) (page xiv)</ref><ref>Pfanzagl (1994): "The crucial drawback of asymptotic theory: What we expect from asymptotic theory are results which hold approximately . . . . What asymptotic theory has to offer are limit theorems."(page ix) "What counts for applications are approximations, not limits." (page 188)</ref> However, the asymptotic theory of limiting distributions is often invoked for work with finite samples. For example, limiting results are often invoked to justify the [[generalized method of moments]] and the use of [[generalized estimating equation]]s, which are popular in [[econometrics]] and [[biostatistics]].
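The divergences mentioned above can be illustrated with a short sketch for discrete distributions; the function names and the example distributions are purely illustrative assumptions, not drawn from the cited sources.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) between two discrete
    distributions given as lists of probabilities over the same support."""
    # Terms with p_i = 0 contribute nothing (0 * log 0 is taken as 0).
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hellinger(p, q):
    """Hellinger distance H(p, q); a true metric, bounded between 0 and 1."""
    s = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(s / 2)

# Two nearby discrete distributions (illustrative values).
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
print(kl_divergence(p, q))
print(hellinger(p, q))
```

Both quantities are zero exactly when the two distributions coincide; the Hellinger distance is symmetric in its arguments, while the Kullback–Leibler divergence in general is not.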
The magnitude of the difference between the limiting distribution and the true distribution (formally, the 'error' of the approximation) can be assessed using simulation<!-- and approximation results -->.<ref>Pfanzagl (1994): "By taking a limit theorem as being approximately true for large sample sizes, we commit an error the size of which is unknown. [. . .] Realistic information about the remaining errors may be obtained by simulations." (page ix)</ref> The heuristic application of limiting results to finite samples is common practice in many applications, especially with low-dimensional [[Statistical model|models]] with [[Logarithmically concave function|log-concave]] [[Likelihood function|likelihoods]] (such as one-parameter [[exponential families]]).
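A minimal sketch of such a simulation-based error assessment: it compares the empirical distribution of standardized sample means from a skewed population with the standard normal limit promised by the central limit theorem. The Exponential(1) population, the sample sizes, and the replication count are illustrative choices, not taken from the cited sources.

```python
import math
import random

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def max_cdf_error(n, reps=20000, seed=0):
    """Largest gap between the empirical CDF of standardized sample means
    from an Exponential(1) population and the standard normal CDF."""
    rng = random.Random(seed)
    # Exponential(1) has mean 1 and standard deviation 1, so the
    # standardized sample mean is (mean - 1) * sqrt(n).
    means = sorted(sum(rng.expovariate(1.0) for _ in range(n)) / n
                   for _ in range(reps))
    z = [(m - 1.0) * math.sqrt(n) for m in means]
    return max(abs((i + 1) / reps - normal_cdf(zi))
               for i, zi in enumerate(z))

# The approximation error typically shrinks as the sample size grows.
print(max_cdf_error(5))
print(max_cdf_error(50))
```

The computed gap is exactly the kind of "remaining error" the quoted passage says simulation can reveal: the limit theorem itself gives no finite-sample bound, but the simulated discrepancy does.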