
== Examples ==

=== Random walk Monte Carlo methods ===
* [[Metropolis–Hastings algorithm]]: This method generates a Markov chain using a proposal density for new steps and a rule for rejecting some of the proposed moves. It is a general framework that includes as special cases the original, simpler Metropolis algorithm and many of the more recent variants listed below.
** [[Gibbs sampling]]: When the target distribution is multi-dimensional, the Gibbs sampling algorithm<ref>{{Cite journal |last1=Geman |first1=Stuart |last2=Geman |first2=Donald |date=November 1984 |title=Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images |url=https://ieeexplore.ieee.org/document/4767596 |journal=IEEE Transactions on Pattern Analysis and Machine Intelligence |volume=PAMI-6 |issue=6 |pages=721–741 |doi=10.1109/TPAMI.1984.4767596 |pmid=22499653 |s2cid=5837272 |issn=0162-8828}}</ref> updates each coordinate from its full [[conditional distribution]] given the other coordinates. Gibbs sampling can be viewed as a special case of the Metropolis–Hastings algorithm in which the acceptance probability is always equal to 1. When drawing from the full conditional distributions is not straightforward, other samplers-within-Gibbs are used instead.<ref>{{Cite journal|title = Adaptive Rejection Sampling for Gibbs Sampling|journal = Journal of the Royal Statistical Society. Series C (Applied Statistics)|date = 1992-01-01|pages = 337–348|volume = 41|issue = 2|doi = 10.2307/2347565|first1 = W. R.|last1 = Gilks|first2 = P.|last2 = Wild|jstor=2347565}}</ref><ref>{{Cite journal|title = Adaptive Rejection Metropolis Sampling within Gibbs Sampling|journal = Journal of the Royal Statistical Society. Series C (Applied Statistics)|date = 1995-01-01|pages = 455–472|volume = 44|issue = 4|doi = 10.2307/2986138|first1 = W. R.|last1 = Gilks|first2 = N. G.|last2 = Best|author2-link= Nicky Best |first3 = K. K. C.|last3 = Tan|jstor=2986138}}</ref> Gibbs sampling is popular partly because it does not require any 'tuning'. The algorithmic structure of Gibbs sampling closely resembles that of coordinate ascent variational inference, in that both algorithms use the full conditional distributions in their updating procedures.<ref>{{Cite journal |last=Lee|first=Se Yoon|title = Gibbs sampler and coordinate ascent variational inference: A set-theoretical review|journal=Communications in Statistics - Theory and Methods|year=2021|volume=51 |issue=6 |pages=1–21|doi=10.1080/03610926.2021.1921214|arxiv=2008.01006|s2cid=220935477}}</ref>
** [[Metropolis-adjusted Langevin algorithm]] and other gradient-based methods: These methods use the gradient (and possibly second derivatives) of the log target density to propose steps that are more likely to move in the direction of higher probability density.<ref>See Stramer 1999.</ref>
** [[Hamiltonian Monte Carlo|Hamiltonian (or hybrid) Monte Carlo]] (HMC): Tries to avoid random walk behaviour by introducing an auxiliary [[momentum]] vector and implementing [[Hamiltonian dynamics]], so that the potential energy function is the negative logarithm of the target density. The momentum samples are discarded after sampling. As a result, proposals move across the sample space in larger steps; successive samples are therefore less correlated, and the chain converges to the target distribution more rapidly.
** [[Pseudo-marginal Metropolis–Hastings algorithm|Pseudo-marginal Metropolis–Hastings]]: This method replaces the evaluation of the density of the target distribution with an unbiased estimate and is useful when the target density is not available analytically, e.g. in [[latent variable model]]s.
* [[Slice sampling]]: This method depends on the principle that one can sample from a distribution by sampling uniformly from the region under the plot of its density function. It alternates uniform sampling in the vertical direction with uniform sampling from the horizontal 'slice' defined by the current vertical position.
* [[Multiple-try Metropolis]]: This method is a variation of the Metropolis–Hastings algorithm that allows multiple trials at each point. By making it possible to take larger steps at each iteration, it helps address the curse of dimensionality.
* [[Reversible-jump]]: This method is a variant of the Metropolis–Hastings algorithm that allows proposals that change the dimensionality of the space.<ref>See Green 1995.</ref> Markov chain Monte Carlo methods that change dimensionality have long been used in [[statistical physics]] applications, where for some problems a distribution that is a [[grand canonical ensemble]] is used (e.g., when the number of molecules in a box is variable). The reversible-jump variant is also useful when running Markov chain Monte Carlo or Gibbs sampling over [[nonparametric]] Bayesian models such as those involving the [[Dirichlet process]] or [[Chinese restaurant process]], where the number of mixture components, clusters, etc. is automatically inferred from the data.
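
Two of the samplers listed above can be illustrated with a minimal Python sketch. It rests on assumptions made only for this example: the random-walk Metropolis sampler targets a standard normal density, and the Gibbs sampler targets a standard bivariate normal with correlation <code>rho</code>, whose full conditionals are themselves normal. The function names, step size and seeds are hypothetical choices and do not come from any particular library.

```python
import math
import random


def log_target(x):
    """Unnormalised log-density of a standard normal (illustrative target)."""
    return -0.5 * x * x


def random_walk_metropolis(log_density, x0, n_samples, step=1.0, seed=0):
    """Metropolis-Hastings with a symmetric Gaussian random-walk proposal."""
    rng = random.Random(seed)
    x, log_p = x0, log_density(x0)
    samples, accepted = [], 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # The proposal is symmetric, so the acceptance probability reduces
        # to min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
            accepted += 1
        samples.append(x)  # a rejected move repeats the current state
    return samples, accepted / n_samples


def gibbs_bivariate_normal(rho, n_samples, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Each coordinate is drawn from its full conditional,
    x | y ~ N(rho * y, 1 - rho**2), so no move is ever rejected.
    """
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)
    x = y = 0.0
    samples = []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, sd)
        y = rng.gauss(rho * x, sd)
        samples.append((x, y))
    return samples


if __name__ == "__main__":
    draws, rate = random_walk_metropolis(log_target, 0.0, 5000)
    print("acceptance rate:", round(rate, 2))
    print("sample mean:", sum(draws) / len(draws))
    pairs = gibbs_bivariate_normal(0.8, 5000)
    print("first Gibbs draw:", pairs[0])
```

The <code>step</code> argument is the kind of tuning parameter that the Gibbs sampler does not need: a very small step is accepted often but explores slowly, while a very large step is rejected often.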

=== Interacting particle methods ===
Interacting MCMC methodologies are a class of [[mean-field particle methods]] for obtaining [[Pseudo-random number sampling|random samples]] from a sequence of probability distributions with an increasing level of sampling complexity.<ref name="dp13">{{cite book|last = Del Moral|first = Pierre|title = Mean field simulation for Monte Carlo integration|year = 2013|publisher = Chapman & Hall/CRC Press |url = http://www.crcpress.com/product/isbn/9781466504059|pages = 626}}</ref> These probabilistic models include path space state models with increasing time horizon, posterior distributions with respect to a sequence of partial observations, increasing constraint level sets for conditional distributions, decreasing temperature schedules associated with some Boltzmann–Gibbs distributions, and many others. In principle, any Markov chain Monte Carlo sampler can be turned into an interacting Markov chain Monte Carlo sampler. These interacting Markov chain Monte Carlo samplers can be interpreted as a way to run a sequence of Markov chain Monte Carlo samplers in parallel. For instance, interacting [[simulated annealing]] algorithms are based on independent Metropolis–Hastings moves interacting sequentially with a selection-resampling type mechanism. In contrast to traditional Markov chain Monte Carlo methods, the precision parameter of this class of interacting Markov chain Monte Carlo samplers is ''only'' related to the number of interacting Markov chain Monte Carlo samplers. These advanced particle methodologies belong to the class of Feynman–Kac particle models,<ref name="dp04">{{cite book|last = Del Moral|first = Pierre|title = Feynman–Kac formulae. Genealogical and interacting particle approximations|year = 2004|publisher = Springer |url = https://www.springer.com/mathematics/probability/book/978-0-387-20268-6|pages = 575}}</ref><ref name="dmm002">{{cite book|last1 = Del Moral|first1 = Pierre|last2 = Miclo|first2 = Laurent|contribution = Branching and Interacting Particle Systems Approximations of Feynman-Kac Formulae with Applications to Non-Linear Filtering|title=Séminaire de Probabilités XXXIV |editor=Jacques Azéma |editor2=Michel Ledoux |editor3=Michel Émery |editor4=Marc Yor|series = Lecture Notes in Mathematics|date = 2000|volume = 1729|pages = 1–145|url = http://archive.numdam.org/ARCHIVE/SPS/SPS_2000__34_/SPS_2000__34__1_0/SPS_2000__34__1_0.pdf|doi = 10.1007/bfb0103798|isbn = 978-3-540-67314-9}}</ref> also known as sequential Monte Carlo or [[particle filter]] methods in the [[Bayesian inference]] and [[signal processing]] communities.<ref name=":3">{{Cite journal|title = Sequential Monte Carlo samplers | doi=10.1111/j.1467-9868.2006.00553.x|volume=68|issue = 3|year=2006|journal=Journal of the Royal Statistical Society. Series B (Statistical Methodology)|pages=411–436 | last1 = Del Moral | first1 = Pierre|arxiv=cond-mat/0212648| s2cid=12074789}}</ref> Interacting Markov chain Monte Carlo methods can also be interpreted as a mutation-selection [[Genetic algorithm|genetic particle algorithm]] with Markov chain Monte Carlo mutations.
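
The selection-mutation structure described above can be sketched in Python as a simple interacting simulated annealing scheme. Everything specific here is an assumption made for illustration: the double-well potential <code>potential</code>, the bounded interval, the linear inverse-temperature schedule <code>betas</code>, the use of multinomial resampling, and the function name <code>interacting_annealing</code>. It is one possible instance of the idea, not a reference implementation.

```python
import math
import random


def potential(x):
    """Illustrative double-well potential with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def interacting_annealing(n_particles=500, betas=None, n_mcmc=5,
                          step=0.5, bound=3.0, seed=0):
    """Particles follow Boltzmann-Gibbs targets proportional to
    exp(-beta * potential(x)) on [-bound, bound] for increasing beta."""
    rng = random.Random(seed)
    if betas is None:
        betas = [float(k) for k in range(1, 51)]  # assumed cooling schedule
    # beta = 0 corresponds to the uniform distribution on the interval.
    particles = [rng.uniform(-bound, bound) for _ in range(n_particles)]
    beta_prev = 0.0
    for beta in betas:
        # Selection: weight each particle by the ratio of successive targets,
        # then resample (multinomial resampling with replacement).
        weights = [math.exp(-(beta - beta_prev) * potential(x))
                   for x in particles]
        particles = rng.choices(particles, weights=weights, k=n_particles)
        # Mutation: independent Metropolis moves that leave the current
        # target exp(-beta * potential) invariant.
        mutated = []
        for x in particles:
            for _ in range(n_mcmc):
                proposal = x + rng.gauss(0.0, step)
                if abs(proposal) > bound:
                    continue  # zero target density outside the interval
                log_ratio = -beta * (potential(proposal) - potential(x))
                if rng.random() < math.exp(min(0.0, log_ratio)):
                    x = proposal
            mutated.append(x)
        particles = mutated
        beta_prev = beta
    return particles


if __name__ == "__main__":
    final = interacting_annealing()
    spread = sum(abs(abs(x) - 1.0) for x in final) / len(final)
    print("mean distance to the nearest minimum:", spread)
```

In this sketch the accuracy of the particle approximation is governed by <code>n_particles</code>, while <code>n_mcmc</code> only controls how much each particle is refreshed between selection steps.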

=== Quasi-Monte Carlo ===

The [[quasi-Monte Carlo method]] is an analogue of the ordinary Monte Carlo method that uses [[low-discrepancy sequence]]s instead of random numbers.<ref name="beating">{{cite journal |last1=Papageorgiou |first1=Anargyros |first2=Joseph |last2=Traub |title=Beating Monte Carlo |journal=Risk |volume=9 |issue=6 |year=1996 |pages=63–65 | url=https://iiif.library.cmu.edu/file/Traub_box00030_fld00008_bdl0001_doc0001/Traub_box00030_fld00008_bdl0001_doc0001.pdf}}</ref><ref>{{cite journal | last1 = Sobol | first1 = Ilya M | year = 1998 | title = On quasi-monte carlo integrations | journal = Mathematics and Computers in Simulation | volume = 47 | issue = 2| pages = 103–112 | doi=10.1016/s0378-4754(98)00096-2 }}</ref> It yields an integration error that decays faster than that of true random sampling, as quantified by the [[Low-discrepancy sequence#The Koksma–Hlawka inequality|Koksma–Hlawka inequality]]. Empirically, it can reduce both estimation error and convergence time by an order of magnitude.<ref name="beating" /> Markov chain quasi-Monte Carlo methods<ref>{{cite journal |last1=Chen |first1=S. |first2=Josef |last2=Dick |first3=Art B. |last3=Owen |title=Consistency of Markov chain quasi-Monte Carlo on continuous state spaces |journal=[[Annals of Statistics]] |volume=39 |issue=2 |year=2011 |pages=673–701 |doi=10.1214/10-AOS831 |arxiv=1105.1896 |doi-access=free }}</ref><ref>{{cite thesis |last=Tribble |first=Seth D. |title=Markov chain Monte Carlo algorithms using completely uniformly distributed driving sequences |type=Diss. |publisher=Stanford University |year=2007 |id={{ProQuest|304808879}} }}</ref> such as the Array–RQMC method combine randomized quasi–Monte Carlo and Markov chain simulation by simulating <math>n</math> chains simultaneously in a way that approximates the true distribution of the chain better than ordinary MCMC does.<ref>{{cite journal |last1=L'Ecuyer |first1=P. |first2=C. |last2=Lécot |first3=B. |last3=Tuffin |title=A Randomized Quasi-Monte Carlo Simulation Method for Markov Chains |journal=[[Operations Research (journal)|Operations Research]] |volume=56 |issue=4 |year=2008 |pages=958–975 |doi=10.1287/opre.1080.0556 |url=https://hal.inria.fr/inria-00070462/file/RR-5545.pdf }}</ref> In empirical experiments, the variance of the average of a function of the state sometimes converges at rate <math>O(n^{-2})</math> or even faster, instead of the <math>O(n^{-1})</math> Monte Carlo rate.<ref>{{cite journal |last1=L'Ecuyer |first1=P. |first2=D. |last2=Munger |first3=C. |last3=Lécot |first4=B. |last4=Tuffin |title=Sorting Methods and Convergence Rates for Array-RQMC: Some Empirical Comparisons |journal=Mathematics and Computers in Simulation |volume=143 |year=2018 |pages=191–201 |doi=10.1016/j.matcom.2016.07.010 }}</ref>
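
The difference between low-discrepancy and pseudo-random points can be illustrated with a small Python sketch of plain (non-Markov-chain) quasi-Monte Carlo integration. The choices here are assumptions for the example only: a two-dimensional Halton sequence with bases 2 and 3, the smooth integrand <math>f(x, y) = xy</math> on the unit square (exact integral 1/4), and the hypothetical function names. It does not implement Array–RQMC.

```python
import random


def radical_inverse(index, base):
    """Van der Corput radical inverse: mirror the base-b digits of index
    about the radix point to obtain a number in [0, 1)."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(n, bases=(2, 3)):
    """First n points of the Halton sequence (indices 1..n)."""
    return [tuple(radical_inverse(i, b) for b in bases)
            for i in range(1, n + 1)]


def integrand(x, y):
    """Smooth test function; its integral over the unit square is 1/4."""
    return x * y


def quasi_monte_carlo(n):
    """Average the integrand over deterministic low-discrepancy points."""
    return sum(integrand(x, y) for x, y in halton_points(n)) / n


def plain_monte_carlo(n, seed=0):
    """Average the integrand over pseudo-random uniform points."""
    rng = random.Random(seed)
    return sum(integrand(rng.random(), rng.random()) for _ in range(n)) / n


if __name__ == "__main__":
    exact = 0.25
    for n in (64, 512, 4096):
        qmc_error = abs(quasi_monte_carlo(n) - exact)
        mc_error = abs(plain_monte_carlo(n) - exact)
        print(n, "QMC error:", qmc_error, "MC error:", mc_error)
```

For a single seed the pseudo-random estimate can occasionally be closer to the exact value, so a comparison of this kind is best read across several sample sizes or seeds.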