Expectation–maximization algorithm
== Properties ==
Although an EM iteration does not decrease the observed-data (i.e., marginal) likelihood function, no guarantee exists that the sequence converges to a [[maximum likelihood estimator]]. For [[bimodal distribution|multimodal distributions]], this means that an EM algorithm may converge to a [[local maximum]] of the observed-data likelihood function, depending on starting values. A variety of heuristic or [[metaheuristic]] approaches exist for escaping such a local maximum, such as random-restart [[hill climbing]] (starting with several different random initial estimates <math>\boldsymbol\theta^{(t)}</math>) or applying [[simulated annealing]] methods (a sketch of the random-restart strategy appears at the end of this section).

EM is especially useful when the likelihood is an [[exponential family]] (see Sundberg (2019, Ch. 8) for a comprehensive treatment):<ref>{{cite book |last1=Sundberg |first1=Rolf |title=Statistical Modelling by Exponential Families |date=2019 |publisher=Cambridge University Press |isbn=9781108701112}}</ref> the E step becomes the sum of expectations of [[sufficient statistic]]s, and the M step involves maximizing a linear function. In such a case, it is usually possible to derive [[closed-form expression]] updates for each step, using the Sundberg formula<ref>{{cite book |last1=Laird |first1=Nan |title=Encyclopedia of Statistical Sciences |chapter=Sundberg formulas |chapter-url=https://doi.org/10.1002/0471667196.ess2643.pub2 |publisher=Wiley |date=2006 |doi=10.1002/0471667196.ess2643.pub2 |isbn=0471667196}}</ref> (proved and published by Rolf Sundberg, based on unpublished results of [[Per Martin-Löf]] and [[Anders Martin-Löf]]).<ref name="Sundberg1971"/><ref name="Sundberg1976"/><ref name="Martin-Löf1966"/><ref name="Martin-Löf1970"/><ref name="Martin-Löf1974a"/><ref name="Martin-Löf1974b"/> A sketch of such closed-form updates also appears at the end of this section.

The EM method was modified to compute [[maximum a posteriori]] (MAP) estimates for [[Bayesian inference]] in the original paper by Dempster, Laird, and Rubin.

Other methods exist for finding maximum likelihood estimates, such as [[gradient descent]], [[conjugate gradient]], or variants of the [[Gauss–Newton algorithm]]. Unlike EM, such methods typically require the evaluation of first and/or second derivatives of the likelihood function.
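The random-restart strategy mentioned above can be made concrete with a short sketch. The following minimal illustration (not taken from the cited references; the two-component Gaussian mixture model, the helper names, and the parameter choices are assumptions made for the example) runs EM from several random initial estimates and keeps the run with the highest observed-data log-likelihood.

<syntaxhighlight lang="python">
import numpy as np

def em_gmm_1d(x, n_iter=200, rng=None):
    """EM for a two-component 1-D Gaussian mixture; returns (log-likelihood, parameters)."""
    rng = np.random.default_rng() if rng is None else rng
    # Random initial estimates: equal weights, two data points as means, pooled variance.
    w = np.array([0.5, 0.5])
    mu = rng.choice(x, size=2, replace=False)
    var = np.array([x.var(), x.var()])
    for _ in range(n_iter):
        # E step: responsibilities (posterior probability of each component).
        dens = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        resp = (w * dens) / (w * dens).sum(axis=1, keepdims=True)
        # M step: weighted maximum-likelihood updates of weights, means, variances.
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    # Observed-data log-likelihood at the final parameter values.
    dens = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    loglik = np.log((w * dens).sum(axis=1)).sum()
    return loglik, (w, mu, var)

def random_restart_em(x, n_restarts=10, seed=0):
    """Run EM from several random initialisations and keep the best local maximum found."""
    rng = np.random.default_rng(seed)
    return max((em_gmm_1d(x, rng=rng) for _ in range(n_restarts)),
               key=lambda run: run[0])

# Synthetic data from a mixture of N(-2, 1) and N(3, 1).
rng = np.random.default_rng(42)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.0, 700)])
best_loglik, (w, mu, var) = random_restart_em(x)
</syntaxhighlight>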
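For the exponential-family case, the following minimal sketch (again an assumed illustration, not drawn from the cited references) shows closed-form updates for i.i.d. exponential lifetimes with right-censoring: the complete-data sufficient statistic is the total lifetime, the E step replaces each censored value by its conditional expectation <math>c + 1/\lambda</math> (memorylessness), and the M step is the closed-form complete-data maximum likelihood estimate.

<syntaxhighlight lang="python">
import numpy as np

def em_censored_exponential(obs, cens, n_iter=100):
    """EM for exponential lifetimes (rate lam); obs are observed lifetimes, cens are censoring times."""
    n = len(obs) + len(cens)
    lam = 1.0 / np.concatenate([obs, cens]).mean()  # crude starting value
    for _ in range(n_iter):
        # E step: expected complete-data sufficient statistic (total lifetime);
        # by memorylessness, E[X | X > c] = c + 1/lam for each censored point.
        expected_total = obs.sum() + (cens + 1.0 / lam).sum()
        # M step: closed-form maximiser of the expected complete-data log-likelihood.
        lam = n / expected_total
    return lam

rng = np.random.default_rng(0)
lifetimes = rng.exponential(scale=2.0, size=1000)  # true rate is 0.5
c = 3.0                                            # fixed censoring time
obs = lifetimes[lifetimes <= c]
cens = np.full(int((lifetimes > c).sum()), c)
print(em_censored_exponential(obs, cens))          # estimate should be near 0.5
</syntaxhighlight>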