Prior probability
== Improper priors ==
Let events <math>A_1, A_2, \ldots, A_n</math> be mutually exclusive and exhaustive. If Bayes' theorem is written as <math display="block">P(A_i\mid B) = \frac{P(B \mid A_i) P(A_i)}{\sum_j P(B\mid A_j)P(A_j)}\, ,</math> then it is clear that the same result would be obtained if all the prior probabilities ''P''(''A''<sub>''i''</sub>) and ''P''(''A''<sub>''j''</sub>) were multiplied by a given constant; the same would be true for a [[continuous random variable]]. If the summation in the denominator converges, the posterior probabilities will still sum (or integrate) to 1 even if the prior values do not, and so the priors may only need to be specified in the correct proportion. Taking this idea further, in many cases the sum or integral of the prior values may not even need to be finite to get sensible answers for the posterior probabilities. When this is the case, the prior is called an '''improper prior'''. However, the posterior distribution need not be a proper distribution if the prior is improper.<ref>{{cite journal |first1=A. P. |last1=Dawid |first2=M. |last2=Stone |first3=J. V. |last3=Zidek |title=Marginalization Paradoxes in Bayesian and Structural Inference |journal=Journal of the Royal Statistical Society |series=Series B (Methodological) |volume=35 |issue=2 |year=1973 |pages=189–233 |doi=10.1111/j.2517-6161.1973.tb00952.x |jstor=2984907 }}</ref> This is clear from the case where event ''B'' is independent of all of the ''A''<sub>''j''</sub>.
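The cancellation of the constant can be checked numerically. The sketch below uses made-up likelihoods and priors for three hypotheses (the values are illustrative, not from the article) and verifies that scaling the prior by any positive constant leaves the posterior unchanged:

```python
import numpy as np

# Hypothetical discrete example: three mutually exclusive, exhaustive
# hypotheses A_1..A_3; the likelihoods P(B | A_i) are chosen arbitrarily.
likelihood = np.array([0.7, 0.2, 0.1])
prior = np.array([0.5, 0.3, 0.2])  # a proper prior (sums to 1)

def posterior(prior, likelihood):
    """Bayes' theorem: P(A_i | B) = P(B|A_i) P(A_i) / sum_j P(B|A_j) P(A_j)."""
    unnorm = likelihood * prior
    return unnorm / unnorm.sum()

p1 = posterior(prior, likelihood)
p2 = posterior(1000.0 * prior, likelihood)  # same prior scaled by a constant

# The constant multiplies numerator and denominator alike, so it cancels.
assert np.allclose(p1, p2)
```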
Statisticians sometimes use improper priors as [[uninformative prior]]s.<ref>{{cite book|last1=Christensen|first1=Ronald|last2=Johnson|first2=Wesley|last3=Branscum|first3=Adam|last4=Hanson|first4=Timothy E.|title=Bayesian Ideas and Data Analysis : An Introduction for Scientists and Statisticians|date=2010|publisher=CRC Press|location=Hoboken|isbn=9781439894798|page=69}}</ref> For example, if they need a prior distribution for the mean and variance of a random variable, they may assume ''p''(''m'', ''v'') ~ 1/''v'' (for ''v'' > 0) which would suggest that any value for the mean is "equally likely" and that a value for the positive variance becomes "less likely" in inverse proportion to its value. Many authors (Lindley, 1973; De Groot, 1937; Kass and Wasserman, 1996){{Citation needed|date=December 2008}} warn against the danger of over-interpreting those priors since they are not probability densities. The only relevance they have is found in the corresponding posterior, as long as it is well-defined for all observations. (The [[Beta distribution#Haldane.27s prior probability .28Beta.280.2C0.29.29|Haldane prior]] is a typical counterexample.{{Clarify|reason=counterexample of what?|date=May 2011}}{{Citation needed|date=May 2011}}) By contrast, [[likelihood function]]s do not need to be integrated, and a likelihood function that is uniformly 1 corresponds to the absence of data (all models are equally likely, given no data): Bayes' rule multiplies a prior by the likelihood, and an empty product is just the constant likelihood 1. However, without starting with a prior probability distribution, one does not end up getting a [[posterior probability]] distribution, and thus cannot integrate or compute expected values or loss. See {{slink|Likelihood function|Non-integrability}} for details. 
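A minimal numerical sketch of this point, using simulated data (the sample and grid below are assumptions for illustration): with a flat improper prior ''p''(''m'') ~ 1 on the mean of a normal sample with known variance, the posterior is just the normalised likelihood, and it is a perfectly proper density once any data are observed.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=20)  # simulated observations, variance known (=1)

# Grid over the mean m. The improper flat prior p(m) ~ 1 contributes only a
# constant factor, so the posterior is the likelihood, normalised.
m = np.linspace(-10.0, 10.0, 4001)
dm = m[1] - m[0]
log_lik = -0.5 * ((data[None, :] - m[:, None]) ** 2).sum(axis=1)
post = np.exp(log_lik - log_lik.max())
post /= post.sum() * dm  # normalise: the posterior is proper

assert abs(post.sum() * dm - 1.0) < 1e-9
# Posterior mean equals the sample mean, as theory predicts for a flat prior.
assert abs((m * post).sum() * dm - data.mean()) < 1e-3
```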
=== Examples ===
Examples of improper priors include:
* The [[uniform distribution (continuous)|uniform distribution]] on an infinite interval (i.e., a half-line or the entire real line).
* Beta(0,0), the [[beta distribution]] for ''α''=0, ''β''=0 (uniform distribution on [[log-odds]] scale).
* The logarithmic prior on the [[positive reals]] (uniform distribution on [[log scale]]).{{Citation needed|date=October 2010}}
These functions, interpreted as uniform distributions, can also be interpreted as the [[likelihood function]] in the absence of data, but are not proper priors.
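The Beta(0,0) (Haldane) case can be probed directly. Combining it with ''s'' successes in ''n'' Bernoulli trials gives a posterior proportional to Beta(''s'', ''n''−''s''), whose normalising constant is the Beta function B(''s'', ''n''−''s''). A small sketch (the trial count is an arbitrary illustration) shows the posterior is proper only when 0 < ''s'' < ''n'':

```python
import math

n = 10  # hypothetical number of Bernoulli trials

def haldane_posterior_norm(s, n):
    """Normalising constant B(s, n-s) of the Beta(s, n-s) posterior obtained
    from the Haldane prior Beta(0,0) with s successes in n trials."""
    return math.gamma(s) * math.gamma(n - s) / math.gamma(n)

# Proper (finite normalising constant) whenever 0 < s < n:
assert all(haldane_posterior_norm(s, n) > 0 for s in range(1, n))

# Improper at the boundary: Gamma(0) has a pole, so for s = 0 (or s = n)
# there is no finite normalising constant and math.gamma raises.
try:
    haldane_posterior_norm(0, n)
    proper = True
except ValueError:
    proper = False
assert not proper
```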