==Applications==
The principle of maximum entropy is commonly applied in two ways to inferential problems:

===Prior probabilities===
The principle of maximum entropy is often used to obtain [[prior probability|prior probability distributions]] for [[Bayesian inference]]. Jaynes was a strong advocate of this approach, claiming the maximum entropy distribution represented the least informative distribution.<ref>{{cite journal |last=Jaynes |first=E. T. |author-link=Edwin Thompson Jaynes |year=1968 |url=http://bayes.wustl.edu/etj/articles/brandeis.pdf |title=Prior Probabilities |journal=IEEE Transactions on Systems Science and Cybernetics |volume=4 |issue=3 |pages=227–241 |doi=10.1109/TSSC.1968.300117 }}</ref> A large amount of literature is now dedicated to the elicitation of maximum entropy priors and links with [[channel coding]].<ref>{{cite journal |last=Clarke |first=B. |year=2006 |title=Information optimality and Bayesian modelling |journal=[[Journal of Econometrics]] |volume=138 |issue=2 |pages=405–429 |doi=10.1016/j.jeconom.2006.05.003 }}</ref><ref>{{cite journal |last=Soofi |first=E. S. |year=2000 |title=Principal Information Theoretic Approaches |journal=[[Journal of the American Statistical Association]] |volume=95 |issue=452 |pages=1349–1353 |doi=10.2307/2669786 |mr=1825292 |jstor=2669786 }}</ref><ref>{{cite journal |last=Bousquet |first=N. |year=2008 |title=Eliciting vague but proper maximal entropy priors in Bayesian experiments |journal=Statistical Papers |volume=51 |issue=3 |pages=613–628 |doi=10.1007/s00362-008-0149-9 |s2cid=119657859 }}</ref><ref>{{cite journal |last1=Palmieri |first1=Francesco A. N. |last2=Ciuonzo |first2=Domenico |year=2013 |title=Objective priors from maximum entropy in data classification |journal=Information Fusion |volume=14 |issue=2 |pages=186–198 |doi=10.1016/j.inffus.2012.01.012 |citeseerx=10.1.1.387.4515 }}</ref>
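A minimal numerical sketch of such a prior (an illustration only; the six-sided die and the prescribed mean of 4.5 are assumptions in the spirit of Jaynes's Brandeis example, not values taken from the cited sources): under a single mean constraint the maximum entropy distribution has exponential form, and its Lagrange multiplier can be found with a one-dimensional root solve.

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import brentq

# Illustrative setup (assumption, not from the article): a six-sided die
# whose only testable information is a prescribed mean of 4.5.
x = np.arange(1, 7)
target_mean = 4.5

# The maximum entropy pmf under a mean constraint has exponential form
# p_i proportional to exp(lam * x_i); find lam so that E[X] = 4.5 holds.
def mean_error(lam):
    w = np.exp(lam * x)
    p = w / w.sum()
    return p @ x - target_mean

lam = brentq(mean_error, -10.0, 10.0)   # bracketing interval is an assumption
p = np.exp(lam * x)
p /= p.sum()

print(p)        # maximum entropy prior over the six faces
print(p @ x)    # recovers the prescribed mean, approximately 4.5
</syntaxhighlight>

The resulting prior shifts weight toward the higher faces just enough to meet the constraint while otherwise remaining as close to uniform as possible.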
===Posterior probabilities===
Maximum entropy is a sufficient updating rule for [[radical probabilism]]. [[Richard Jeffrey]]'s [[probability kinematics]] is a special case of [[maximum entropy inference]]. However, maximum entropy is not a generalisation of all such sufficient updating rules.<ref>{{cite journal |last=Skyrms |first=B. |author-link=Brian Skyrms |year=1987 |title=Updating, supposing and MAXENT |journal=Theory and Decision |volume=22 |issue=3 |pages=225–46 |doi=10.1007/BF00134086 |s2cid=121847242 }}</ref>

===Maximum entropy models===
Alternatively, the principle is often invoked for model specification: in this case the observed data itself is assumed to be the testable information. Such models are widely used in [[natural language processing]]. An example of such a model is [[logistic regression]], which corresponds to the [[maximum entropy classifier]] for independent observations.

===Probability density estimation===
One of the main applications of the maximum entropy principle is in discrete and continuous [[density estimation]].<ref name="BK08">{{cite journal |last1=Botev |first1=Z. I. |last2=Kroese |first2=D. P. |year=2008 |title=Non-asymptotic Bandwidth Selection for Density Estimation of Discrete Data |journal=Methodology and Computing in Applied Probability |volume=10 |issue=3 |pages=435 |doi=10.1007/s11009-007-9057-z |s2cid=122047337 }}</ref><ref name="BK11">{{cite journal |last1=Botev |first1=Z. I. |last2=Kroese |first2=D. P. |year=2011 |title=The Generalized Cross Entropy Method, with Applications to Probability Density Estimation |journal=Methodology and Computing in Applied Probability |volume=13 |issue=1 |pages=1–27 |doi=10.1007/s11009-009-9133-7 |s2cid=18155189 |url=http://espace.library.uq.edu.au/view/UQ:200564/UQ200564_preprint.pdf }}</ref> Similar to [[support vector machine]] estimators, the maximum entropy principle may require the solution to a [[quadratic programming]] problem, and thus provide a sparse mixture model as the optimal density estimator. One important advantage of the method is its ability to incorporate prior information in the density estimation.<ref>{{cite book |last1=Kesavan |first1=H. K. |last2=Kapur |first2=J. N. |year=1990 |contribution=Maximum Entropy and Minimum Cross-Entropy Principles |title=Maximum Entropy and Bayesian Methods |url=https://archive.org/details/maximumentropyba00jayn_552 |url-access=limited |editor-last=Fougère |editor-first=P. F. |pages=[https://archive.org/details/maximumentropyba00jayn_552/page/n418 419]–432 |doi=10.1007/978-94-009-0683-9_29 |isbn=978-94-010-6792-8 }}</ref>
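As a rough illustration of moment-constrained maximum entropy density estimation (a sketch of the classical formulation, not the generalized cross-entropy or quadratic-programming estimators of the references above; the data, support grid, and feature choices are assumptions): maximizing entropy subject to matching the first two sample moments is equivalent to minimizing a convex dual over the Lagrange multipliers of an exponential-family density.

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

# Hypothetical data whose first two sample moments the estimate must match.
rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=0.5, size=500)

# Discretize a support grid and define the constraint features f(x) = (x, x^2).
grid = np.linspace(data.min() - 1.0, data.max() + 1.0, 400)
dx = grid[1] - grid[0]
feats = np.vstack([grid, grid**2])                 # shape (2, 400)
targets = np.array([data.mean(), (data**2).mean()])

# Convex dual of the maximum entropy problem: minimizing log Z(lam) - lam.targets
# yields the multipliers of the density p(x) proportional to exp(lam_1 x + lam_2 x^2)
# that satisfies the moment constraints.
def dual(lam):
    log_z = logsumexp(lam @ feats) + np.log(dx)    # Riemann-sum approximation of log Z
    return log_z - lam @ targets

res = minimize(dual, x0=np.zeros(2))
density = np.exp(res.x @ feats)
density /= density.sum() * dx                      # normalized max-ent density on the grid

print(density @ feats.T * dx)                      # approximately equal to targets
</syntaxhighlight>

With only the first two moments constrained, the recovered density is a discretized Gaussian; adding further feature constraints enlarges the exponential family in the same way.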