==== Objective razor ====

The minimum instruction set of a [[universal Turing machine]] requires approximately the same length description across different formulations, and is small compared to the [[Kolmogorov complexity]] of most practical theories. [[Marcus Hutter]] has used this consistency to define a "natural" Turing machine of small size as the proper basis for excluding arbitrarily complex instruction sets in the formulation of razors.<ref>{{Cite web |url=http://www.hutter1.net/ait.htm |title=Algorithmic Information Theory |url-status=live |archive-url=https://web.archive.org/web/20071224043538/http://www.hutter1.net/ait.htm |archive-date=24 December 2007}}</ref> Taking the program run on the universal Turing machine as the "hypothesis", and the representation of the evidence as the program's data, it has been formally proven under [[Zermelo–Fraenkel set theory]] that "the sum of the log universal probability of the model plus the log of the probability of the data given the model should be minimized."<ref>{{Cite journal |last1=Vitanyi |first1=P.M.B. |last2=Ming Li |date=March 2000 |title=Minimum description length induction, Bayesianism, and Kolmogorov complexity |url=https://ieeexplore.ieee.org/document/825807 |journal=IEEE Transactions on Information Theory |volume=46 |issue=2 |pages=446–464 |doi=10.1109/18.825807|arxiv=cs/9901014 }}</ref> Interpreting this as minimising the total length of a two-part message, encoding the model followed by the data given the model, yields the [[minimum message length]] (MML) principle.<ref name="ReferenceC" /><ref name="auto" /> One possible conclusion from combining the concepts of Kolmogorov complexity and Occam's razor is that an ideal data compressor would also be a scientific explanation/formulation generator.
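The two-part scheme can be made concrete with a toy sketch (an illustration, not drawn from the cited sources): to choose the degree of a polynomial fitted to noisy data, charge each hypothesis the bits needed to state its parameters (first part) plus the bits needed to encode the data given the hypothesis (second part, the negative log-likelihood under a Gaussian noise model), and pick the degree minimising the sum. All names and the fixed 32-bits-per-parameter cost are simplifying assumptions; proper MML would set the parameter precision from the Fisher information.

```python
# Toy two-part message-length model selection (illustrative sketch only).
# First part: bits to state the model's parameters.
# Second part: -log2 P(data | model), i.e. bits to encode the residuals.
import math
import random

random.seed(0)
xs = [i / 10 for i in range(-20, 21)]
# Data generated by a quadratic plus Gaussian noise (assumed example).
ys = [1.0 + 0.5 * x - 0.8 * x * x + random.gauss(0, 0.3) for x in xs]

def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations."""
    n = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, n))) / A[r][r]
    return coef

def message_length(xs, ys, degree, bits_per_param=32):
    """Total bits: encode the parameters, then the data given the model."""
    coef = fit_poly(xs, ys, degree)
    preds = [sum(c * x ** i for i, c in enumerate(coef)) for x in xs]
    sigma2 = max(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(xs), 1e-12)
    # Bits for the data under a Gaussian noise model at the MLE variance.
    data_bits = 0.5 * len(xs) * math.log2(2 * math.pi * math.e * sigma2)
    model_bits = (degree + 1) * bits_per_param
    return model_bits + data_bits

# The quadratic wins: higher degrees barely shorten the second part
# but each extra coefficient costs a full 32 bits in the first part.
best = min(range(6), key=lambda d: message_length(xs, ys, d))
```

A degree-5 fit compresses the residuals slightly better than the quadratic, but its longer first part makes the total message longer, which is the razor at work.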
Some attempts have been made to re-derive known laws from considerations of simplicity or compressibility.<ref name="ReferenceB" /><ref>{{Cite journal |last=Standish |first=Russell K |year=2000 |title=Why Occam's Razor |journal=Foundations of Physics Letters |volume=17 |issue=3 |pages=255–266 |arxiv=physics/0001020 |bibcode=2004FoPhL..17..255S |doi=10.1023/B:FOPL.0000032475.18334.0e|s2cid=17143230 }}</ref> According to [[Jürgen Schmidhuber]], the appropriate mathematical theory of Occam's razor already exists, namely, [[Ray Solomonoff|Solomonoff's]] [[Solomonoff's theory of inductive inference|theory of optimal inductive inference]]<ref>{{Cite journal |last=Solomonoff |first=Ray |author-link=Ray Solomonoff |year=1964 |title=A formal theory of inductive inference. Part I. |journal=Information and Control |volume=7 |issue=1–22 |page=1964 |doi=10.1016/s0019-9958(64)90223-2|doi-access=free }}</ref> and its extensions.<ref>{{Cite book |title=Artificial General Intelligence |last=Schmidhuber |first=J. |year=2006 |editor-last=Goertzel |editor-first=B. |pages=177–200 |chapter=The New AI: General & Sound & Relevant for Physics |arxiv=cs.AI/0302012 |author-link=Jürgen Schmidhuber |editor-last2=Pennachin |editor-first2=C.}}</ref> See discussions in David L. Dowe's "Foreword re C. S. Wallace"<ref>{{Cite journal |last=Dowe |first=David L. |year=2008 |title=Foreword re C. S. Wallace |journal=Computer Journal |volume=51 |issue=5 |pages=523–560 |doi=10.1093/comjnl/bxm117|s2cid=5387092 }}</ref> for the subtle distinctions between the [[algorithmic probability]] work of Solomonoff and the MML work of [[Chris Wallace (computer scientist)|Chris Wallace]], and see Dowe's "MML, hybrid Bayesian network graphical models, statistical consistency, invariance and uniqueness"<ref>David L. Dowe (2010): "MML, hybrid Bayesian network graphical models, statistical consistency, invariance and uniqueness. A formal theory of inductive inference." 
''Handbook of the Philosophy of Science''{{spaced ndash}}(HPS Volume 7) Philosophy of Statistics, Elsevier 2010, pp. 901–982. https://web.archive.org/web/20140204001435/http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.185.709&rep=rep1&type=pdf</ref> both for such discussions and for (in section 4) discussions of MML and Occam's razor. For a specific example of MML as Occam's razor in the problem of decision tree induction, see Needham and Dowe's "Message Length as an Effective Ockham's Razor in Decision Tree Induction".<ref>Scott Needham and David L. Dowe (2001): "Message Length as an Effective Ockham's Razor in Decision Tree Induction." Proc. 8th International Workshop on Artificial Intelligence and Statistics (AI+STATS 2001), Key West, Florida, U.S.A., January 2001, pp. 253–260 {{Cite web |url=http://www.csse.monash.edu.au/~dld/Publications/2001/Needham+Dowe2001_Ockham.pdf |title=2001 Ockham.pdf |url-status=live |archive-url=https://web.archive.org/web/20150923211645/http://www.csse.monash.edu.au/~dld/Publications/2001/Needham+Dowe2001_Ockham.pdf |archive-date=23 September 2015 |access-date=2 September 2015}}</ref>