Conjugate prior
=== When likelihood function is a continuous distribution ===
{| class="wikitable"
! Likelihood <br> <math>p(x_i|\theta)</math>!! Model parameters <br> <math>\theta</math>!! Conjugate prior (and posterior) distribution <math>p(\theta|\Theta), p(\theta|\mathbf{x},\Theta) = p(\theta|\Theta') </math>!! Prior hyperparameters <br><math>\Theta</math>!! Posterior hyperparameters<ref name="posterior-hyperparameters" group="note" /><br><math>\Theta'</math>!!Interpretation of hyperparameters!!Posterior predictive<ref name="ppredNt" group="note" /><br><math>p(\tilde{x}|\mathbf{x}, \Theta) = p(\tilde{x}|\Theta')</math>
|-
| [[normal distribution|Normal]]<br>with known variance ''σ''<sup>2</sup> || ''μ'' (mean) || [[normal distribution|Normal]] || <math>\mu_0,\, \sigma_0^2\!</math>|| <math>\frac{1}{\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}}\left(\frac{\mu_0}{\sigma_0^2} + \frac{\sum_{i=1}^n x_i}{\sigma^2}\right), \left(\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}\right)^{-1}</math>
| mean was estimated from observations with total precision (sum of all individual precisions) <math>1/\sigma_0^2</math> and with sample mean <math>\mu_0</math>
| <math>\mathcal{N}(\tilde{x}|\mu_0', {\sigma_0^2}' +\sigma^2)</math><ref name="murphy">{{citation |last=Murphy |first=Kevin P. |title=Conjugate Bayesian analysis of the Gaussian distribution |url=http://www.cs.ubc.ca/~murphyk/Papers/bayesGauss.pdf |year=2007}}</ref>
|-
| [[normal distribution|Normal]]<br>with known precision ''τ'' || ''μ'' (mean) || [[normal distribution|Normal]] || <math>\mu_0,\, \tau_0^{-1}\!</math>|| <math> \frac{\tau_0 \mu_0 + \tau \sum_{i=1}^n x_i}{\tau_0 + n \tau},\, \left(\tau_0 + n \tau\right)^{-1}</math>
| mean was estimated from observations with total precision (sum of all individual precisions) <math>\tau_0</math> and with sample mean <math>\mu_0</math>
| <math>\mathcal{N}\left(\tilde{x}\mid\mu_0', \frac{1}{\tau_0'} +\frac{1}{\tau}\right)</math><ref name="murphy" />
|-
| [[Normal distribution|Normal]]<br>with known mean ''μ'' || ''σ''<sup>2</sup> (variance) || [[Inverse gamma distribution|Inverse gamma]] || <math> \mathbf{\alpha,\, \beta} </math> <ref name="beta_scale" group="note" />|| <math> \mathbf{\alpha}+\frac{n}{2},\, \mathbf{\beta} + \frac{\sum_{i=1}^n{(x_i-\mu)^2}}{2} </math>
| variance was estimated from <math>2\alpha</math> observations with sample variance <math>\beta/\alpha</math> (i.e. with sum of [[squared deviations]] <math>2\beta</math>, where deviations are from known mean <math>\mu</math>)
| <math>t_{2\alpha'}(\tilde{x}|\mu,\sigma^2 = \beta'/\alpha')</math><ref name="murphy" />
|-
| [[normal distribution|Normal]]<br>with known mean ''μ'' || ''σ''<sup>2</sup> (variance) || [[Scaled inverse chi-squared distribution|Scaled inverse chi-squared]] || <math>\nu,\, \sigma_0^2\!</math>|| <math>\nu+n,\, \frac{\nu\sigma_0^2 + \sum_{i=1}^n (x_i-\mu)^2}{\nu+n}\!</math>
| variance was estimated from <math>\nu</math> observations with sample variance <math>\sigma_0^2</math>
| <math>t_{\nu'}(\tilde{x}|\mu,{\sigma_0^2}')</math><ref name="murphy" />
|-
| [[normal distribution|Normal]]<br>with known mean ''μ'' || ''τ'' (precision) || [[Gamma distribution|Gamma]] || <math>\alpha,\, \beta\!</math> <ref name="beta_rate" group="note" />|| <math>\alpha + \frac{n}{2},\, \beta + \frac{\sum_{i=1}^n (x_i-\mu)^2}{2}\!</math>
| precision was estimated from <math>2\alpha</math> observations with sample variance <math>\beta/\alpha</math> (i.e. with sum of [[squared deviations]] <math>2\beta</math>, where deviations are from known mean <math>\mu</math>)
| <math>t_{2\alpha'}(\tilde{x}\mid\mu,\sigma^2 = \beta'/\alpha')</math><ref name="murphy" />
|-
| [[Normal distribution|Normal]]<ref group="note">A different conjugate prior for unknown mean and variance, but with a fixed, linear relationship between them, is found in the [[normal variance-mean mixture]], with the [[Generalized inverse Gaussian distribution|generalized inverse Gaussian]] as conjugate mixing distribution.</ref>|| ''μ'' and ''σ<sup>2</sup>''<br>Assuming [[Exchangeable random variables|exchangeability]]|| [[Normal-inverse gamma distribution|Normal-inverse gamma]]
| <math> \mu_0 ,\, \nu ,\, \alpha ,\, \beta</math>|| <math>\frac{\nu\mu_0+n\bar{x}}{\nu+n} ,\, \nu+n,\, \alpha+\frac{n}{2} ,\, </math><br/><math> \beta + \tfrac{1}{2} \sum_{i=1}^n (x_i - \bar{x})^2 + \frac{n\nu}{\nu+n}\frac{(\bar{x}-\mu_0)^2}{2} </math>
*<math> \bar{x} </math> is the sample mean
| mean was estimated from <math>\nu</math> observations with sample mean <math>\mu_0</math>; variance was estimated from <math>2\alpha</math> observations with sample mean <math>\mu_0</math> and sum of [[squared deviations]] <math>2\beta</math>
| <math>t_{2\alpha'}\left(\tilde{x}\mid\mu_0',\frac{\beta'(\nu'+1)}{\nu' \alpha'}\right)</math><ref name="murphy" />
|-
| [[Normal distribution|Normal]] || ''μ'' and ''τ''<br>Assuming [[Exchangeable random variables|exchangeability]]|| [[Normal-gamma distribution|Normal-gamma]]
| <math> \mu_0 ,\, \nu ,\, \alpha ,\, \beta</math>|| <math>\frac{\nu\mu_0+n\bar{x}}{\nu+n} ,\, \nu+n,\, \alpha+\frac{n}{2} ,\, </math><br/><math> \beta + \tfrac{1}{2} \sum_{i=1}^n (x_i - \bar{x})^2 + \frac{n\nu}{\nu+n}\frac{(\bar{x}-\mu_0)^2}{2} </math>
*<math> \bar{x} </math> is the sample mean
| mean was estimated from <math>\nu</math> observations with sample mean <math>\mu_0</math>, and precision was estimated from <math>2\alpha</math> observations with sample mean <math>\mu_0</math> and sum of [[squared deviations]] <math>2\beta</math>
| <math>t_{2\alpha'}\left(\tilde{x}\mid\mu_0',\frac{\beta'(\nu'+1)}{\alpha'\nu'}\right)</math><ref name="murphy" />
|-
| [[multivariate normal distribution|Multivariate normal]] with known covariance matrix '''''Σ''''' || '''''μ''''' (mean vector) || [[multivariate normal distribution|Multivariate normal]] || <math>\boldsymbol{\boldsymbol\mu}_0,\, \boldsymbol\Sigma_0</math>|| <math>\left(\boldsymbol\Sigma_0^{-1} + n\boldsymbol\Sigma^{-1}\right)^{-1}\left( \boldsymbol\Sigma_0^{-1}\boldsymbol\mu_0 + n \boldsymbol\Sigma^{-1} \mathbf{\bar{x}} \right),</math><br/><math>\left(\boldsymbol\Sigma_0^{-1} + n\boldsymbol\Sigma^{-1}\right)^{-1}</math>
*<math>\mathbf{\bar{x}}</math> is the sample mean
| mean was estimated from observations with total precision (sum of all individual precisions) <math>\boldsymbol\Sigma_0^{-1}</math> and with sample mean <math>\boldsymbol\mu_0</math>
| <math>\mathcal{N}(\tilde{\mathbf{x}}\mid{\boldsymbol\mu_0}', {\boldsymbol\Sigma_0}' +\boldsymbol\Sigma)</math><ref name="murphy" />
|-
| [[multivariate normal distribution|Multivariate normal]] with known precision matrix '''''Λ''''' || '''''μ''''' (mean vector) || [[multivariate normal distribution|Multivariate normal]] || <math>\mathbf{\boldsymbol\mu}_0,\, \boldsymbol\Lambda_0</math>|| <math>\left(\boldsymbol\Lambda_0 + n\boldsymbol\Lambda\right)^{-1}\left( \boldsymbol\Lambda_0\boldsymbol\mu_0 + n \boldsymbol\Lambda \mathbf{\bar{x}} \right),\, \left(\boldsymbol\Lambda_0 + n\boldsymbol\Lambda\right)</math>
*<math>\mathbf{\bar{x}}</math> is the sample mean
| mean was estimated from observations with total precision (sum of all individual precisions) <math>\boldsymbol\Lambda_0</math> and with sample mean <math>\boldsymbol\mu_0</math>
| <math>\mathcal{N}\left(\tilde{\mathbf{x}}\mid{\boldsymbol\mu_0}', {{\boldsymbol\Lambda_0}'}^{-1} + \boldsymbol\Lambda^{-1}\right)</math><ref name="murphy" />
|-
| [[multivariate normal distribution|Multivariate normal]] with known mean '''''μ''''' || '''''Σ''''' (covariance matrix) || [[Inverse-Wishart distribution|Inverse-Wishart]] || <math>\nu ,\, \boldsymbol\Psi</math>|| <math>n+\nu ,\, \boldsymbol\Psi + \sum_{i=1}^n (\mathbf{x_i} - \boldsymbol\mu) (\mathbf{x_i} - \boldsymbol\mu)^T </math>
| covariance matrix was estimated from <math>\nu</math> observations with sum of pairwise deviation products <math>\boldsymbol\Psi</math>
| <math>t_{\nu'-p+1}\left(\tilde{\mathbf{x}}|\boldsymbol\mu,\frac{1}{\nu'-p+1}\boldsymbol\Psi'\right)</math><ref name="murphy" />
|-
| [[multivariate normal distribution|Multivariate normal]] with known mean '''''μ''''' || '''''Λ''''' (precision matrix) || [[Wishart distribution|Wishart]] || <math>\nu ,\, \mathbf{V}</math>|| <math>n+\nu ,\, \left(\mathbf{V}^{-1} + \sum_{i=1}^n (\mathbf{x_i} - \boldsymbol\mu) (\mathbf{x_i} - \boldsymbol\mu)^T\right)^{-1} </math>
| covariance matrix was estimated from <math>\nu</math> observations with sum of pairwise deviation products <math>\mathbf{V}^{-1}</math>
| <math>t_{\nu'-p+1}\left(\tilde{\mathbf{x}}\mid\boldsymbol\mu,\frac{1}{\nu'-p+1}{\mathbf{V}'}^{-1}\right)</math><ref name="murphy" />
|-
| [[multivariate normal distribution|Multivariate normal]] || '''''μ''''' (mean vector) and '''''Σ''''' (covariance matrix) || [[normal-inverse-Wishart distribution|normal-inverse-Wishart]] || <math>\boldsymbol\mu_0 ,\, \kappa_0 ,\, \nu_0 ,\, \boldsymbol\Psi</math>|| <math>\frac{\kappa_0\boldsymbol\mu_0+n\mathbf{\bar{x}}}{\kappa_0+n} ,\, \kappa_0+n,\, \nu_0+n ,\,</math><br/><math> \boldsymbol\Psi + \mathbf{C} + \frac{\kappa_0 n}{\kappa_0+n}(\mathbf{\bar{x}}-\boldsymbol\mu_0)(\mathbf{\bar{x}}-\boldsymbol\mu_0)^T </math>
*<math> \mathbf{\bar{x}} </math> is the sample mean
*<math>\mathbf{C} = \sum_{i=1}^n (\mathbf{x_i} - \mathbf{\bar{x}}) (\mathbf{x_i} - \mathbf{\bar{x}})^T</math>
| mean was estimated from <math>\kappa_0</math> observations with sample mean <math>\boldsymbol\mu_0</math>; covariance matrix was estimated from <math>\nu_0</math> observations with sample mean <math>\boldsymbol\mu_0</math> and with sum of pairwise deviation products <math>\boldsymbol\Psi=\nu_0\boldsymbol\Sigma_0</math>
| <math>t_{{\nu_0}'-p+1}\left(\tilde{\mathbf{x}}|{\boldsymbol\mu_0}',\frac{{\kappa_0}'+1}{{\kappa_0}'({\nu_0}'-p+1)}\boldsymbol\Psi'\right)</math><ref name="murphy" />
|-
| [[multivariate normal distribution|Multivariate normal]] || '''''μ''''' (mean vector) and '''''Λ''''' (precision matrix)|| [[normal-Wishart distribution|normal-Wishart]] || <math>\boldsymbol\mu_0 ,\, \kappa_0 ,\, \nu_0 ,\, \mathbf{V}</math>|| <math>\frac{\kappa_0\boldsymbol\mu_0+n\mathbf{\bar{x}}}{\kappa_0+n} ,\, \kappa_0+n,\, \nu_0+n ,\,</math><br/><math> \left(\mathbf{V}^{-1} + \mathbf{C} + \frac{\kappa_0 n}{\kappa_0+n}(\mathbf{\bar{x}}-\boldsymbol\mu_0)(\mathbf{\bar{x}}-\boldsymbol\mu_0)^T\right)^{-1} </math>
*<math> \mathbf{\bar{x}} </math> is the sample mean
*<math>\mathbf{C} = \sum_{i=1}^n (\mathbf{x_i} - \mathbf{\bar{x}}) (\mathbf{x_i} - \mathbf{\bar{x}})^T</math>
| mean was estimated from <math>\kappa_0</math> observations with sample mean <math>\boldsymbol\mu_0</math>; covariance matrix was estimated from <math>\nu_0</math> observations with sample mean <math>\boldsymbol\mu_0</math> and with sum of pairwise deviation products <math>\mathbf{V}^{-1}</math>
| <math>t_{{\nu_0}'-p+1}\left(\tilde{\mathbf{x}}\mid {\boldsymbol\mu_0}', \frac{{\kappa_0}'+1}{{\kappa_0}'({\nu_0}'-p+1)}{\mathbf{V}'}^{-1}\right)</math><ref name="murphy" />
|-
| [[Uniform distribution (continuous)|Uniform]] || <math> U(0,\theta)\!</math>|| [[Pareto distribution|Pareto]] || <math> x_{m},\, k\!</math>|| <math> \max\{\,x_1,\ldots,x_n,x_\mathrm{m}\},\, k+n\!</math>
| <math>k</math> observations with maximum value <math>x_m</math>
|
|-
| [[Pareto distribution|Pareto]] <br/>with known minimum ''x''<sub>''m''</sub> || ''k'' (shape) || [[Gamma distribution|Gamma]] || <math>\alpha,\, \beta\!</math>|| <math>\alpha+n,\, \beta+\sum_{i=1}^n \ln\frac{x_i}{x_{\mathrm{m}}}\!</math>
| <math>\alpha</math> observations with sum <math>\beta</math> of the [[order of magnitude]] of each observation (i.e. the logarithm of the ratio of each observation to the minimum <math>x_m</math>)
|
|-
| [[Weibull distribution|Weibull]] <br/>with known shape ''β'' || ''θ'' (scale) || [[inverse-gamma distribution|Inverse gamma]]<ref name="Fink" />|| <math>a, b\!</math>|| <math>a+n,\, b+\sum_{i=1}^n x_i^{\beta}\!</math>
| <math>a</math> observations with sum <math>b</math> of the ''β'''th power of each observation
|
|-
| [[log-normal distribution|Log-normal]]
| colspan="6" | Same as for the normal distribution, with the posterior hyperparameters computed after applying the natural logarithm to the data. See {{harvtxt|Fink|1997|pp=21–22}} for details.
|-
| [[exponential distribution|Exponential]] || ''λ'' (rate) || [[Gamma distribution|Gamma]] || <math>\alpha,\, \beta\!</math> <ref name="beta_rate" group="note" />|| <math>\alpha+n,\, \beta+\sum_{i=1}^n x_i\!</math>
| <math>\alpha</math> observations that sum to <math>\beta</math> <ref>{{cite book |last1=Liu |first1=Han |url=https://www.stat.cmu.edu/~larry/=sml/Bayes.pdf#page=16 |title=Statistical Machine Learning |last2=Wasserman |first2=Larry |year=2014 |page=314}}</ref>
| <math>\operatorname{Lomax}(\tilde{x}\mid\beta',\alpha')</math><br />([[Lomax distribution]])
|-
| [[Gamma Distribution|Gamma]] <br>with known shape ''α''|| ''β'' (rate) || [[Gamma Distribution|Gamma]] || <math>\alpha_0,\, \beta_0\!</math>||<math>\alpha_0+n\alpha,\, \beta_0+\sum_{i=1}^n x_i\!</math>
| <math>\alpha_0/\alpha</math> observations with sum <math>\beta_0</math>
| <math>\operatorname{CG}(\tilde{\mathbf{x}}\mid\alpha,{\alpha_0}',{\beta_0}')=\operatorname{\beta'}(\tilde{\mathbf{x}}|\alpha,{\alpha_0}',1,{\beta_0}')</math> <ref name="CG" group="note" />
|-
| [[Inverse-gamma distribution|Inverse Gamma]] <br>with known shape ''α''|| ''β'' (inverse scale) || [[Gamma Distribution|Gamma]] || <math>\alpha_0,\, \beta_0\!</math>||<math>\alpha_0+n\alpha,\, \beta_0+\sum_{i=1}^n \frac{1}{x_i}\!</math>
| <math>\alpha_0/\alpha</math> observations with sum <math>\beta_0</math>
|
|-
| [[Gamma Distribution|Gamma]] <br>with known rate ''β''|| ''α'' (shape)
| <math>\propto \frac{a^{\alpha-1} \beta^{\alpha c}}{\Gamma(\alpha)^b}</math>
| <math>a,\, b,\, c\!</math>||<math>a \prod_{i=1}^n x_i,\, b + n,\, c + n\!</math>
| <math>b</math> or <math>c</math> observations (<math>b</math> for estimating <math>\alpha</math>, <math>c</math> for estimating <math>\beta</math>) with product <math>a</math>
|
|-
| [[Gamma Distribution|Gamma]]<ref name="Fink" />|| ''α'' (shape), ''β'' (inverse scale) || <math>\propto \frac{p^{\alpha-1} e^{-\beta q}}{\Gamma(\alpha)^r \beta^{-\alpha s}}</math>|| <math>p,\, q,\, r,\, s \!</math>|| <math>p \prod_{i=1}^n x_i,\, q + \sum_{i=1}^n x_i,\, r + n,\, s + n \!</math>
| <math>\alpha</math> was estimated from <math>r</math> observations with product <math>p</math>; <math>\beta</math> was estimated from <math>s</math> observations with sum <math>q</math>
|
|-
| [[Beta Distribution|Beta]]|| ''α'', ''β'' || <math>\propto \frac{\Gamma(\alpha+\beta)^k \, p^\alpha \, q^\beta}{\Gamma(\alpha)^k\,\Gamma(\beta)^k}</math>|| <math>p,\, q,\, k \!</math>|| <math>p \prod_{i=1}^n x_i,\, q \prod_{i=1}^n (1-x_i),\, k + n \!</math>
| <math>\alpha</math> and <math>\beta</math> were estimated from <math>k</math> observations with product <math>p</math> and product of the complements <math>q</math>
|
|}
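
The posterior-hyperparameter column can be read as a set of update rules. As a minimal sketch, the following Python functions transcribe three rows of the table (normal with known variance, normal with a normal-inverse gamma prior, and exponential with a gamma prior). The function names and argument order are illustrative assumptions, not part of any library, and the code uses only the standard library.

```python
def normal_known_variance(mu0, sigma0_sq, sigma_sq, xs):
    """Normal likelihood with known variance sigma_sq, normal prior on the mean.

    Returns the posterior hyperparameters (mu0', sigma0^2') from the table.
    Function name and signature are hypothetical, chosen for illustration.
    """
    n = len(xs)
    # Posterior precision is the sum of prior and data precisions.
    post_var = 1.0 / (1.0 / sigma0_sq + n / sigma_sq)
    post_mean = post_var * (mu0 / sigma0_sq + sum(xs) / sigma_sq)
    return post_mean, post_var


def normal_inverse_gamma(mu0, nu, alpha, beta, xs):
    """Normal likelihood with unknown mean and variance, normal-inverse gamma prior.

    Returns (mu0', nu', alpha', beta') following the table row.
    """
    n = len(xs)
    if n == 0:
        # No data: the posterior equals the prior.
        return mu0, nu, alpha, beta
    xbar = sum(xs) / n
    # Sum of squared deviations from the sample mean.
    ss = sum((x - xbar) ** 2 for x in xs)
    mu_post = (nu * mu0 + n * xbar) / (nu + n)
    nu_post = nu + n
    alpha_post = alpha + n / 2
    beta_post = beta + 0.5 * ss + (n * nu / (nu + n)) * (xbar - mu0) ** 2 / 2
    return mu_post, nu_post, alpha_post, beta_post


def exponential_gamma(alpha, beta, xs):
    """Exponential likelihood with rate lambda, gamma(alpha, beta) prior (beta is a rate).

    Returns (alpha', beta') following the table row.
    """
    return alpha + len(xs), beta + sum(xs)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    print(normal_known_variance(0.0, 1.0, 1.0, data))
    print(normal_inverse_gamma(0.0, 1.0, 1.0, 1.0, data))
    print(exponential_gamma(2.0, 1.0, data))
```

Because each update returns hyperparameters of the same family as the prior, the output of one call can be passed back in as the prior for the next batch of observations, which is the practical appeal of conjugacy.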