
== Justifications ==

=== Aesthetic ===
Prior to the 20th century, it was a commonly held belief that nature itself was simple and that simpler hypotheses about nature were thus more likely to be true. {{Clarify | text =This notion was deeply rooted in the aesthetic value that simplicity holds for human thought and the justifications presented for it often drew from [[theology]]. | date = February 2021 | reason = The example that follows doesn't connect with 'human thoughts' nor explicitly 'theology'.}} [[Thomas Aquinas]] made this argument in the 13th century, writing, "If a thing can be done adequately by means of one, it is superfluous to do it by means of several; for we observe that nature does not employ two instruments [if] one suffices."<ref>Pegis 1945.</ref>

Beginning in the 20th century, [[epistemology|epistemological]] justifications based on [[Inductive reasoning|induction]], [[logic]], [[pragmatism]], and especially [[probability theory]] have become more popular among philosophers.<ref name="Sober 2015 4"/>

=== Empirical ===
Occam's razor has gained strong empirical support in helping to converge on better theories (see [[#Uses|Uses]] section below for some examples).

In the related concept of [[overfitting]], excessively complex models fit the [[statistical noise]] in the data as well as the signal (the high-variance side of the [[bias–variance tradeoff]]), whereas simpler models may capture the underlying structure better and may thus have better [[predictive inference|predictive]] performance. It is, however, often difficult to deduce which part of the data is noise (cf. [[model selection]], [[test set]], [[minimum description length]], [[Bayesian inference]], etc.).
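
The effect of overfitting can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration and comes from none of the cited sources: the underlying relationship <code>truth</code>, the hand-picked <code>noise</code> values, and the two hypothetical models (a least-squares straight line, and a degree-9 polynomial forced through all ten training points).

```python
# Illustrative sketch: a straight line versus a degree-9 interpolating polynomial.
# All data values below are invented for this example.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    """Lagrange polynomial that passes exactly through every training point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def truth(x):
    """Assumed noise-free relationship behind the data."""
    return 2.0 * x + 1.0


# Fixed, hand-picked "noise" so that the example is reproducible.
noise = [0.3, -0.2, 0.1, -0.4, 0.25, -0.15, 0.35, -0.3, 0.2, -0.1]

train_x = [float(i) for i in range(10)]
train_y = [truth(x) + e for x, e in zip(train_x, noise)]

# Held-out points halfway between the training points, scored against the truth.
test_x = [i + 0.5 for i in range(9)]
test_y = [truth(x) for x in test_x]

line = fit_line(train_x, train_y)
poly = fit_interpolant(train_x, train_y)

line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
poly_train, poly_test = mse(poly, train_x, train_y), mse(poly, test_x, test_y)

print("straight line: train MSE", line_train, "held-out MSE", line_test)
print("degree-9 poly: train MSE", poly_train, "held-out MSE", poly_test)
```

In this constructed case the polynomial reproduces the training data exactly, noise included, yet its held-out error is far larger than that of the straight line. The sketch shows only that a perfect fit need not predict well; it does not show that the simpler model is always preferable.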

==== Testing the razor ====
{{Original research section|reason=Author of this section cites very few reliable sources, and also consistently conflates simplicity with (logical) truth. Occam's razor is not built to differentiate true hypotheses from false ones.|date=January 2023}}
The razor's statement that "other things being equal, simpler explanations are generally better than more complex ones" is amenable to empirical testing. A second, broader interpretation drops the qualifier: "simpler hypotheses are generally better than the complex ones". To test the first interpretation, one would compare the track records of simple and comparatively complex explanations. Under that interpretation, Occam's razor would have to be rejected as a tool if the more complex explanations turned out to be correct more often than the simpler ones, while the converse would lend support to its use. Under the second interpretation, the razor could be accepted as a tool if the simpler hypotheses led to correct conclusions more often than not.

Even if some increases in complexity are sometimes necessary, there still remains a justified general bias toward the simpler of two competing explanations. To understand why, consider that for each accepted explanation of a phenomenon, there is always an infinite number of possible, more complex, and ultimately incorrect, alternatives. This is so because one can always burden a failing explanation with an [[ad hoc hypothesis]]. Ad hoc hypotheses are justifications that prevent theories from being falsified.

[[File:Celtic Fairy Tales-1892-048-1.jpg|thumb|Possible explanations can become needlessly complex. It might be coherent, for instance, to add the involvement of [[leprechaun]]s to any explanation, but Occam's razor would prevent such additions unless they were necessary.]]

For example, if a man, accused of breaking a vase, makes [[supernatural]] claims that [[leprechauns]] were responsible for the breakage, a simple explanation might be that the man did it, but ongoing ad hoc justifications (e.g., "... and that's not me breaking it on the film; they tampered with that, too") could successfully prevent complete disproof. This endless supply of elaborate competing explanations, called saving hypotheses, cannot be technically ruled out – except by using Occam's razor.<ref name="Stanovich2007">Stanovich, Keith E. (2007). ''How to Think Straight About Psychology''. Boston: Pearson Education, pp. 19–33.</ref><ref>{{Cite web |url=http://skepdic.com/adhoc.html |title=ad hoc hypothesis - The Skeptic's Dictionary - Skepdic.com |website=skepdic.com |url-status=dead |archive-url=https://web.archive.org/web/20090427010136/http://www.skepdic.com/adhoc.html |archive-date=27 April 2009}}</ref><ref>Swinburne 1997 and Williams, Gareth T, 2008.</ref>

A more complex theory may, of course, still be true. A study of the predictive validity of Occam's razor found 32 published papers that included 97 comparisons of economic forecasts from simple and complex forecasting methods. None of the papers provided a balance of evidence that complexity of method improved forecast accuracy. In the 25 papers with quantitative comparisons, complexity increased forecast errors by an average of 27 percent.<ref>{{Cite journal |last1=Green |first1=K. C. |last2=Armstrong |first2=J. S. |year=2015 |title=Simple versus complex forecasting: The evidence |url=https://repository.upenn.edu/marketing_papers/366 |journal=Journal of Business Research |volume=68 |issue=8 |pages=1678–1685 |doi=10.1016/j.jbusres.2015.03.026 |access-date=22 January 2019 |archive-date=8 June 2020 |archive-url=https://web.archive.org/web/20200608134337/https://repository.upenn.edu/marketing_papers/366/ |url-status=live }}{{subscription required}}</ref>

=== Practical considerations and pragmatism ===
{{See also|Pragmatism|Problem of induction}}

=== Mathematical ===
{{Main| Akaike information criterion}}

One justification of Occam's razor is a direct result of basic [[probability theory]]. By definition, all assumptions introduce possibilities for error; if an assumption does not improve the accuracy of a theory, its only effect is to increase the probability that the overall theory is wrong.
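
As a minimal sketch of this argument, let ''T'' stand for a core theory and ''A'' for an additional assumption (both symbols are introduced here for illustration and do not come from the cited sources). The product rule of probability gives

<math display="block">P(T \land A) = P(T)\,P(A \mid T) \le P(T),</math>

with equality only when <math>P(A \mid T) = 1</math>. The same inequality holds after conditioning on any evidence ''E'', so the theory with the extra assumption is never more probable than the theory without it.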

There have also been other attempts to derive Occam's razor from probability theory, including notable attempts made by [[Harold Jeffreys]] and [[Edwin Thompson Jaynes|E. T. Jaynes]]. The probabilistic (Bayesian) basis for Occam's razor is elaborated by [[David J. C. MacKay]] in chapter 28 of his book ''Information Theory, Inference, and Learning Algorithms'',<ref>{{Cite book |url=http://www.inference.phy.cam.ac.uk/itprnn/book.pdf |title=Information Theory, Inference, and Learning Algorithms |last=MacKay |first=David J. C. |year=2003 |bibcode=2003itil.book.....M |archive-url=https://web.archive.org/web/20120915043535/http://www.inference.phy.cam.ac.uk/itprnn/book.pdf |archive-date=15 September 2012 |url-status=live }}</ref> where he emphasizes that a prior bias in favor of simpler models is not required.

[[William H. Jefferys]] and [[James Berger (statistician)|James O. Berger]] (1991) generalize and quantify the original formulation's "assumptions" concept as the degree to which a proposition is unnecessarily accommodating to possible observable data.<ref name="Jefferys">{{Cite journal |last1=Jefferys |first1=William H. |last2=Berger |first2=James O. |year=1991 |title=Ockham's Razor and Bayesian Statistics |url=http://quasar.as.utexas.edu/papers/ockham.pdf |url-status=live |journal=[[American Scientist]] |volume=80 |issue=1 |pages=64–72 |jstor=29774559 |archive-url=https://web.archive.org/web/20050304065538/http://quasar.as.utexas.edu/papers/ockham.pdf |archive-date=4 March 2005}} (preprint available as "Sharpening Occam's Razor on a Bayesian Strop").</ref> They state, "A hypothesis with fewer adjustable parameters will automatically have an enhanced posterior probability, due to the fact that the predictions it makes are sharp."<ref name="Jefferys" /> The use of "sharp" here is not only a tongue-in-cheek reference to the idea of a razor, but also indicates that such predictions are more [[Accuracy and precision|precise]] than competing predictions. The model they propose balances the accuracy of a theory's predictions against their sharpness, preferring theories that sharply make correct predictions over theories that accommodate a wide range of other possible results. This, again, reflects the mathematical relationship between key concepts in [[Bayesian inference]] (namely [[marginal probability]], [[conditional probability]], and [[posterior probability]]).
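
The way a sharp prediction can raise a hypothesis's posterior probability can be sketched with a coin-flipping example in Python. The example is an illustrative assumption and is not taken from Jefferys and Berger: a hypothetical "sharp" hypothesis fixes the probability of heads at 1/2, while a hypothetical "flexible" hypothesis has one adjustable parameter, the probability of heads, with a uniform prior on [0, 1]. The function names are invented for the sketch.

```python
from fractions import Fraction
from math import factorial


def likelihood_sharp(heads, tails, p=Fraction(1, 2)):
    """Probability of one particular sequence when the chance of heads is fixed at p."""
    return p ** heads * (1 - p) ** tails


def likelihood_flexible(heads, tails):
    """Marginal probability of the same sequence when p is uniform on [0, 1].

    Uses the identity: integral of p^h (1-p)^t dp over [0, 1] = h! t! / (h + t + 1)!
    """
    return Fraction(factorial(heads) * factorial(tails),
                    factorial(heads + tails + 1))


def bayes_factor(heads, tails):
    """Evidence ratio, sharp over flexible; values above 1 favour the sharp hypothesis."""
    return likelihood_sharp(heads, tails) / likelihood_flexible(heads, tails)


balanced = bayes_factor(5, 5)   # data consistent with the sharp prediction
lopsided = bayes_factor(9, 1)   # data that the sharp prediction fits poorly

print("5 heads, 5 tails:", balanced, "=", float(balanced))
print("9 heads, 1 tail: ", lopsided, "=", float(lopsided))
```

With equal prior odds, the balanced data favour the sharp hypothesis even though the flexible one can fit them just as well, because the flexible hypothesis spreads its probability over many outcomes that did not occur. The lopsided data reverse the preference, so the penalty on the extra parameter does not rule the more complex hypothesis out.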

The [[bias–variance tradeoff]] is a framework that incorporates the Occam's razor principle in its balance between overfitting (associated with lower bias but higher variance) and underfitting (associated with lower variance but higher bias).<ref>{{Cite book |title=An Introduction to Statistical Learning|last1=James |first1=Gareth |last2=Witten |first2 = Daniela |last3 = Hastie |first3 = Trevor|last4 = Tibshirani |first4 = Robert |display-authors = 1| date=2013 |publisher=Springer |isbn=9781461471370 |pages=105, 203–204}}</ref>

=== Other philosophers ===

==== Karl Popper ====
[[Karl Popper]] argues that a preference for simple theories need not appeal to practical or aesthetic considerations. Our preference for simplicity may be justified by his [[falsifiability]] criterion: we prefer simpler theories to more complex ones "because their empirical content is greater; and because they are better testable".<ref>{{cite book |last=Popper |first=Karl |author-link=Karl Popper |orig-year=1934 |year=1992 |title=Logik der Forschung |trans-title=The Logic of Scientific Discovery |edition=2nd |location=London |publisher=Routledge |pages=121–132 |isbn=978-84-309-0711-3 }}</ref> The idea here is that a simple theory applies to more cases than a more complex one, and is thus more easily falsifiable. This is again comparing a simple theory to a more complex theory where both explain the data equally well.

==== Elliott Sober ====
The philosopher of science [[Elliott Sober]] once argued along the same lines as Popper, tying simplicity with "informativeness": the simplest theory is the most informative, in the sense that it requires less information to answer a question.<ref name="Sober975">{{Cite book |url=https://archive.org/details/simplicity0000sobe |title=Simplicity |last=Sober |first=Elliott |publisher=[[Clarendon Press]] |year=1975 |isbn=978-0-19-824407-3 |location=Oxford |author-link=Elliott Sober |url-access=registration}}</ref> He has since rejected this account of simplicity, purportedly because it fails to provide an [[epistemology|epistemic]] justification for simplicity. He now believes that simplicity considerations (and considerations of parsimony in particular) do not count unless they reflect something more fundamental. Philosophers, he suggests, may have made the error of hypostatizing simplicity (i.e., endowed it with a ''[[sui generis]]'' existence), when it has meaning only when embedded in a specific context (Sober 1992). If we fail to justify simplicity considerations on the basis of the context in which we use them, we may have no non-circular justification: "Just as the question 'why be rational?' may have no non-circular answer, the same may be true of the question 'why should simplicity be considered in evaluating the plausibility of hypotheses?{{'"}}<ref name="Sober2002">{{Cite book |url=https://books.google.com/books?id=-YdbBN-O-JAC&q=zellner+simplicity |title=Simplicity, Inference and Modeling: Keeping it Sophisticatedly Simple |last=Sober |first=Elliott |publisher=Cambridge University Press |year=2004 |isbn=978-0-521-80361-8 |editor-last=Zellner |editor-first=Arnold |location=Cambridge, U.K. |pages=13–31 |chapter=What is the Problem of Simplicity? |access-date=4 August 2012 |editor-last2=Keuzenkamp |editor-first2=Hugo A. |editor-link2=Hugo A. Keuzenkamp |editor-last3=McAleer |editor-first3=Michael |chapter-url=https://books.google.com/books?id=J_CDXu24qZUC&q=sober+rival+hypotheses&pg=RA1-PA13 |archive-date=28 October 2023 |archive-url=https://web.archive.org/web/20231028141247/https://books.google.com/books?id=-YdbBN-O-JAC&q=zellner+simplicity#v=snippet&q=zellner%20simplicity&f=false |url-status=live }} [https://web.archive.org/web/20060901082031/http://philosophy.wisc.edu/sober/TILBURG.pdf Paper as PDF.]</ref>

==== Richard Swinburne ====

[[Richard Swinburne]] argues for simplicity on logical grounds:

{{blockquote|... the simplest hypothesis proposed as an explanation of phenomena is more likely to be the true one than is any other available hypothesis, that its predictions are more likely to be true than those of any other available hypothesis, and that it is an ultimate ''a priori'' epistemic principle that simplicity is evidence for truth.|Swinburne 1997}}

According to Swinburne, since our choice of theory cannot be determined by data (see [[Underdetermination]] and [[Duhem–Quine thesis]]), we must rely on some criterion to determine which theory to use. Since it is absurd to have no logical method for settling on one hypothesis amongst an infinite number of equally data-compliant hypotheses, we should choose the simplest theory: "Either science is irrational [in the way it judges theories and predictions probable] or the principle of simplicity is a fundamental synthetic a priori truth."<ref>Swinburne, Richard (1997). Simplicity as Evidence for Truth. Milwaukee, Wisconsin: Marquette University Press. {{ISBN|978-0-87462-164-8}}.</ref>

==== Ludwig Wittgenstein ====
From the ''[[Tractatus Logico-Philosophicus]]'':
* 3.328 "If a sign is not necessary then it is meaningless. That is the meaning of Occam's Razor."
: (If everything in the symbolism works as though a sign had meaning, then it has meaning.)
* 4.04 "In the proposition, there must be exactly as many things distinguishable as there are in the state of affairs, which it represents. They must both possess the same logical (mathematical) multiplicity (cf. Hertz's Mechanics, on Dynamic Models)."
* 5.47321 "Occam's Razor is, of course, not an arbitrary rule nor one justified by its practical success. It simply says that unnecessary elements in a symbolism mean nothing. Signs which serve one purpose are logically equivalent; signs which serve no purpose are logically meaningless."
and on the related concept of "simplicity":
* 6.363 "The procedure of induction consists in accepting as true the simplest law that can be reconciled with our experiences."