===Bayesian prediction===
*The [[posterior predictive distribution]] is the distribution of a new data point, marginalized over the posterior: <math display="block">p(\tilde{x} \mid \mathbf{X},\alpha) = \int p(\tilde{x} \mid \theta) \, p(\theta \mid \mathbf{X},\alpha) \, d\theta</math>
*The [[prior predictive distribution]] is the distribution of a new data point, marginalized over the prior: <math display="block">p(\tilde{x} \mid \alpha) = \int p(\tilde{x} \mid \theta) \, p(\theta \mid \alpha) \, d\theta</math>

Bayesian theory calls for the use of the posterior predictive distribution to do [[predictive inference]], i.e., to [[prediction|predict]] the distribution of a new, unobserved data point. That is, instead of a fixed point as a prediction, a distribution over possible points is returned. Only in this way is the entire posterior distribution of the parameter(s) used. By comparison, prediction in [[frequentist statistics]] often involves finding an optimum point estimate of the parameter(s)—e.g., by [[maximum likelihood]] or [[maximum a posteriori estimation]] (MAP)—and then plugging this estimate into the formula for the distribution of a data point. This has the disadvantage that it does not account for any uncertainty in the value of the parameter, and hence will underestimate the [[variance]] of the predictive distribution.

In some instances, frequentist statistics can work around this problem. For example, [[confidence interval]]s and [[prediction interval]]s for a [[normal distribution]] with unknown [[mean]] and [[variance]] are constructed using a [[Student's t-distribution]]. This correctly estimates the variance, because (1) the average of normally distributed random variables is also normally distributed, and (2) the predictive distribution of a normally distributed data point with unknown mean and variance, using conjugate or uninformative priors, has a Student's t-distribution. In Bayesian statistics, however, the posterior predictive distribution can always be determined exactly—or at least to an arbitrary level of precision when numerical methods are used.

Both types of predictive distributions have the form of a [[compound probability distribution]] (as does the [[marginal likelihood]]). In fact, if the prior distribution is a [[conjugate prior]], such that the prior and posterior distributions come from the same family, it can be seen that both prior and posterior predictive distributions also come from the same family of compound distributions. The only difference is that the posterior predictive distribution uses the updated values of the hyperparameters (applying the Bayesian update rules given in the [[conjugate prior]] article), while the prior predictive distribution uses the values of the hyperparameters that appear in the prior distribution.
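As a concrete illustration, consider a Beta–Bernoulli model (a standard conjugate pair, chosen here for definiteness): each data point is a [[Bernoulli distribution|Bernoulli]] trial with success probability <math>\theta</math>, and the prior on <math>\theta</math> is a [[beta distribution]] with hyperparameters <math>a</math> and <math>b</math>. The prior predictive probability of a success is
<math display="block">p(\tilde{x}=1 \mid a,b) = \int_0^1 \theta \, p(\theta \mid a,b) \, d\theta = \frac{a}{a+b}.</math>
After observing <math>s</math> successes in <math>n</math> trials, the posterior is a beta distribution with updated hyperparameters <math>a+s</math> and <math>b+n-s</math>, so the posterior predictive probability of a success is
<math display="block">p(\tilde{x}=1 \mid \mathbf{X},a,b) = \frac{a+s}{a+b+n}.</math>
Both predictive distributions here are Bernoulli; they differ only in that the posterior predictive uses the updated hyperparameters, as described above.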