===Interpretations under different foundations===
Among statisticians, there is no consensus about what the [[Foundations of statistics|foundation of statistics]] should be. Four main paradigms have been proposed for the foundation: [[frequentism]], [[Bayesianism]], [[likelihoodism]], and [[Akaike information criterion|AIC-based]] statistics.<ref name="BF11">{{Citation |editor1-last= Bandyopadhyay |editor1-first= P. S. |editor2-first= M. R. |editor2-last= Forster | title = Philosophy of Statistics | publisher= [[North-Holland Publishing]] | year = 2011 |mode=cs1 }}</ref> Each proposed foundation interprets likelihood differently, as described in the subsections below.

====Frequentist interpretation====
{{empty section|date=March 2019}}

====Bayesian interpretation====
In [[Bayesian inference]], one can speak about the likelihood of any proposition or [[random variable]] given another random variable: for example, the likelihood of a parameter value or of a [[statistical model]] (see [[marginal likelihood]]), given specified data or other evidence.<ref name='good1950'>I. J. Good: ''Probability and the Weighing of Evidence'' (Griffin 1950), §6.1</ref><ref name='jeffreys1983'>H. Jeffreys: ''Theory of Probability'' (3rd ed., Oxford University Press 1983), §1.22</ref><ref name='jaynes2003'>E. T. Jaynes: ''Probability Theory: The Logic of Science'' (Cambridge University Press 2003), §4.1</ref><ref name='lindley1980'>D. V. Lindley: ''Introduction to Probability and Statistics from a Bayesian Viewpoint. Part 1: Probability'' (Cambridge University Press 1980), §1.6</ref> The likelihood function nevertheless remains the same entity, with the additional interpretations of (i) a [[Conditional probability distribution|conditional density]] of the data given the parameter (since the parameter is then a random variable) and (ii) a measure or amount of information brought by the data about the parameter value or even the model.<ref name='good1950'/><ref name='jeffreys1983'/><ref name='jaynes2003'/><ref name='lindley1980'/><ref name='gelmanetal2014'>A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, D. B. Rubin: ''Bayesian Data Analysis'' (3rd ed., Chapman & Hall/CRC 2014), §1.3</ref> Because a probability structure is introduced on the parameter space or on the collection of models, a parameter value or a statistical model may have a large likelihood value for given data and yet a low ''probability'', or vice versa.<ref name='jaynes2003'/><ref name='gelmanetal2014'/> This is often the case in medical contexts.<ref>{{citation |first1=H. C. |last1=Sox |first2=M. C. |last2=Higgins |first3=D. K. |last3=Owens |title=Medical Decision Making |edition=2nd |publisher=Wiley |year=2013 |doi=10.1002/9781118341544 |isbn=9781118341544 |at=chapters 3–4 }}</ref> Following [[Bayes' rule]], the likelihood, when seen as a conditional density, can be multiplied by the [[prior probability]] density of the parameter and then normalized to give a [[posterior probability]] density.<ref name='good1950'/><ref name='jeffreys1983'/><ref name='jaynes2003'/><ref name='lindley1980'/><ref name="gelmanetal2014"/> More generally, the likelihood of an unknown quantity <math display="inline">X</math> given another unknown quantity <math display="inline">Y</math> is proportional to the ''probability of <math display="inline">Y</math> given <math display="inline">X</math>''.<ref name='good1950'/><ref name='jeffreys1983'/><ref name='jaynes2003'/><ref name='lindley1980'/><ref name='gelmanetal2014'/>
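As a minimal illustration of this normalization step, the following Python sketch computes a posterior density on a discrete grid of parameter values; the binomial model, the flat prior, and the data are illustrative assumptions rather than anything drawn from the cited sources.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import binom

# Illustrative data: 7 heads in 10 tosses of a coin with unknown bias.
heads, tosses = 7, 10

# Discrete grid over the parameter theta = P(heads).
theta = np.linspace(0.001, 0.999, 999)

likelihood = binom.pmf(heads, tosses, theta)  # conditional density of the data given theta
prior = np.ones_like(theta)                   # flat prior density (an arbitrary choice)

# Bayes' rule: posterior is proportional to likelihood times prior;
# normalize so the posterior integrates to 1 over the grid.
unnormalized = likelihood * prior
posterior = unnormalized / np.trapz(unnormalized, theta)

print(theta[np.argmax(posterior)])  # posterior mode; equals the MLE (0.7) under a flat prior
</syntaxhighlight>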
====Likelihoodist interpretation====
{{more footnotes needed|date=April 2019}}
In frequentist statistics, the likelihood function is itself a [[statistic]] that summarizes a single sample from a population. Its calculated value depends on a choice of several parameters ''θ''<sub>1</sub> ... ''θ''<sub>''p''</sub>, where ''p'' is the count of parameters in some already-selected [[statistical model]]. The value of the likelihood serves as a figure of merit for the choice of parameters, and the parameter set with maximum likelihood is the best choice, given the data available. The specific calculation of the likelihood is the probability that the observed sample would be assigned, assuming that the chosen model and the values of the several parameters '''''θ''''' give an accurate approximation of the [[frequency distribution]] of the population that the observed sample was drawn from. Heuristically, a good choice of parameters is one that renders the sample actually observed as probable as possible, ''post hoc''.

[[Wilks' theorem]] quantifies this heuristic by showing that the difference between the logarithm of the likelihood at the estimated parameter values and the logarithm of the likelihood at the population's "true" (but unknown) parameter values is asymptotically [[chi-squared distribution|χ<sup>2</sup>-distributed]]. Each independent sample's maximum likelihood estimate is a separate estimate of the "true" parameter set describing the population sampled. Successive estimates from many independent samples will cluster together, with the population's "true" set of parameter values hidden somewhere in their midst. The difference between the logarithm of the maximum likelihood and the logarithms of adjacent parameter sets' likelihoods may be used to draw a [[confidence region]] on a plot whose coordinates are the parameters ''θ''<sub>1</sub> ... ''θ''<sub>''p''</sub>. The region surrounds the maximum-likelihood estimate, and all points (parameter sets) within it differ from the maximum in log-likelihood by at most some fixed value. The [[chi-squared distribution|χ<sup>2</sup> distribution]] given by [[Wilks' theorem]] converts the region's log-likelihood differences into the "confidence" that the population's "true" parameter set lies inside. The art of choosing the fixed log-likelihood difference is to make the confidence acceptably high while keeping the region acceptably small (a narrow range of estimates).
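As a sketch of how such a region is computed in the one-parameter case (''p'' = 1), the following Python snippet inverts Wilks' theorem to obtain an approximate 95% confidence interval; the binomial model and the data are illustrative assumptions, and the cutoff is the 0.95 quantile of the χ<sup>2</sup> distribution with one degree of freedom (about 3.84).

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import binom, chi2

# Illustrative data: 7 heads in 10 tosses, as in the sketch above.
heads, tosses = 7, 10
theta = np.linspace(0.001, 0.999, 999)

loglik = binom.logpmf(heads, tosses, theta)
max_loglik = loglik.max()  # log-likelihood at the maximum likelihood estimate

# Wilks' theorem: 2 * (max log-likelihood - log-likelihood) is asymptotically
# chi-squared with df = p = 1, so a 95% region uses the 0.95 quantile.
cutoff = chi2.ppf(0.95, df=1)
region = theta[2 * (max_loglik - loglik) <= cutoff]

print(region.min(), region.max())  # approximate 95% confidence interval for theta
</syntaxhighlight>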
As more data are observed, instead of being used to make independent estimates, they can be combined with the previous samples to make a single combined sample, and that large sample may be used for a new maximum likelihood estimate. As the size of the combined sample increases, the size of the likelihood region with the same confidence shrinks. Eventually, either the confidence region has shrunk to very nearly a single point, or the entire population has been sampled; in both cases, the estimated parameter set is essentially the same as the population parameter set.

====AIC-based interpretation====
{{expand section|date=March 2019}}
Under the [[Akaike information criterion|AIC]] paradigm, likelihood is interpreted within the context of [[information theory]].<ref>{{Citation | first=H. |last=Akaike |author-link=Hirotugu Akaike | contribution = Prediction and entropy | pages=1–24 | title= A Celebration of Statistics | editor1-first= A. C. | editor1-last= Atkinson | editor2-first= S. E. | editor2-last= Fienberg | editor2-link= Stephen Fienberg | year = 1985 | publisher= Springer |mode=cs1 }}</ref><ref>{{Citation | author1-first= Y. | author1-last= Sakamoto | author2-first= M. | author2-last= Ishiguro | author3-first= G. | author3-last= Kitagawa | title= Akaike Information Criterion Statistics | year= 1986 | publisher= [[D. Reidel]] | at= Part I |mode=cs1 }}</ref><ref>{{Citation |last1=Burnham |first1=K. P. |last2=Anderson |first2=D. R. |year=2002 |title=Model Selection and Multimodel Inference: A practical information-theoretic approach |edition=2nd |publisher= [[Springer-Verlag]] | at= chap. 7 |mode=cs1 }}</ref>
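Concretely, the criterion scores a fitted model as <math display="inline">\mathrm{AIC} = 2k - 2\ln(\hat{L})</math>, where <math display="inline">k</math> is the number of estimated parameters and <math display="inline">\hat{L}</math> is the maximized likelihood, so that lower scores indicate less estimated information loss. The following Python sketch compares two models of the same data this way; the normal models and the simulated sample are illustrative assumptions, not examples from the cited sources.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=0)
data = rng.normal(loc=1.0, scale=2.0, size=100)  # simulated sample for illustration

def aic(log_likelihood, k):
    """AIC = 2k - 2 ln(L-hat), with k the number of fitted parameters."""
    return 2 * k - 2 * log_likelihood

# Model A: a fixed standard normal distribution, no fitted parameters (k = 0).
loglik_a = norm.logpdf(data, loc=0.0, scale=1.0).sum()

# Model B: a normal distribution with mean and scale fit by maximum likelihood (k = 2).
mu_hat, sigma_hat = data.mean(), data.std()  # the MLEs for a normal model
loglik_b = norm.logpdf(data, loc=mu_hat, scale=sigma_hat).sum()

print(aic(loglik_a, k=0), aic(loglik_b, k=2))  # the model with the lower AIC is preferred
</syntaxhighlight>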