== Uses ==
{{original research |section|date=May 2021}}

=== Science and the scientific method ===
[[File:Heliocentric.jpg|thumb|250px|right|[[Andreas Cellarius]]'s illustration of the Copernican system, from the ''[[Harmonia Macrocosmica]]'' (1660). Future positions of the sun, moon and other solar system bodies can be calculated using a [[Geocentrism|geocentric]] model (the earth is at the centre) or using a [[Heliocentrism#Modern science|heliocentric model]] (the sun is at the centre). Both work, but the geocentric model requires a much more complex system of calculations than the heliocentric model. This was pointed out in a preface to [[Copernicus]]'s first edition of ''[[De revolutionibus orbium coelestium]]''.]]
In [[science]], Occam's razor is used as a [[heuristic]] to guide scientists in developing theoretical models rather than as an arbiter between published models.<ref name="fn_(100)" /><ref name="fn_(101)" />

In [[physics]], parsimony was an important heuristic in the development and application of the [[principle of least action]] by [[Pierre Louis Maupertuis]] and [[Leonhard Euler]],<ref name="fn_(104)">{{Cite book |title=Mémoires de l'Académie Royale |last=de Maupertuis |first=P. L. M. |year=1744 |page=423 |language=fr}}</ref> in [[Albert Einstein]]'s formulation of [[special relativity]],<ref name="fn_(102)">{{Cite journal |last=Einstein |first=Albert |author-link=Albert Einstein |year=1905 |title=Does the Inertia of a Body Depend Upon Its Energy Content? |url=https://zenodo.org/record/1424057 |journal=Annalen der Physik |language=de |issue=18 |pages=639–41 |doi=10.1002/andp.19053231314 |bibcode=1905AnP...323..639E |volume=323 |doi-access=free |access-date=21 October 2019 |archive-date=21 October 2019 |archive-url=https://web.archive.org/web/20191021050723/https://zenodo.org/record/1424057 |url-status=live }}</ref><ref name="fn_(103)">L.
Nash, The Nature of the Natural Sciences, Boston: Little, Brown (1963).</ref> and in the development of [[quantum mechanics]] by [[Max Planck]], [[Werner Heisenberg]] and [[Louis de Broglie]].<ref name="fn_(101)" /><ref name="fn_(105)">{{Cite book |title=Annales de Physique |last=de Broglie |first=L. |year=1925 |pages=22–128 |language=fr |issue=3/10}}</ref> In [[chemistry]], Occam's razor is often an important heuristic when developing a model of a [[reaction mechanism]].<ref name="fn_(107)">RA Jackson, Mechanism: An Introduction to the Study of Organic Reactions, Clarendon, Oxford, 1972.</ref><ref name="fn_(108)">Carpenter, B. K. (1984). ''Determination of Organic Reaction Mechanism'', New York: Wiley-Interscience.</ref> Although it is useful as a heuristic in developing models of reaction mechanisms, it has been shown to fail as a criterion for selecting among some selected published models.<ref name="fn_(101)" /> In this context, Einstein himself expressed caution when he formulated Einstein's [[Constraint counting|Constraint]]: "It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience."<ref>{{Cite journal |last=Einstein |first=Albert |date=1934 |title=On the Method of Theoretical Physics |url=https://www.jstor.org/stable/184387 |journal=Philosophy of Science |volume=1 |issue=2 |pages=165 [163–169] |doi=10.1086/286316 |jstor=184387 |s2cid=44787169 |access-date=22 January 2023 |archive-date=22 January 2023 |archive-url=https://web.archive.org/web/20230122233537/https://www.jstor.org/stable/184387 |url-status=live }}</ref><ref>{{Cite book |last=Mettenheim |first=Christoph von |url=https://books.google.com/books?id=hLSR2or-bGAC |title=Popper Versus Einstein: On the Philosophical Foundations of Physics |date=1998 |publisher=Mohr Siebeck |isbn=978-3-16-146910-7 |page=34 |language=en |access-date=22 
January 2023 |archive-date=22 January 2023 |archive-url=https://web.archive.org/web/20230122233538/https://books.google.com/books?id=hLSR2or-bGAC |url-status=live }}</ref><ref>{{Cite book |last1=Geis |first1=Gilbert |url=https://books.google.com/books?id=xdbQMywnrdwC&dq=%22the+supreme+goal+of+all+theory+is+to+make+the+irreducible+basic+elements+as+simple+and+as+few+as+possible+without+having+to+surrender+the+adequate+representation+of+a+single+datum+of%22&pg=PA39 |title=Crimes of the Century: From Leopold and Loeb to O.J. Simpson |last2=Geis |first2=Professor Emeritus of Criminology Law and & Society Gilbert |last3=Bienen |first3=Leigh B. |date=1998 |publisher=UPNE |isbn=978-1-55553-360-1 |page=39 |language=en |access-date=10 February 2023 |archive-date=5 April 2023 |archive-url=https://web.archive.org/web/20230405182025/https://books.google.com/books?id=xdbQMywnrdwC&dq=%22the+supreme+goal+of+all+theory+is+to+make+the+irreducible+basic+elements+as+simple+and+as+few+as+possible+without+having+to+surrender+the+adequate+representation+of+a+single+datum+of%22&pg=PA39 |url-status=live }}</ref> An often-quoted version of this constraint (which cannot be verified as posited by Einstein himself)<ref>{{Cite web |url=http://quoteinvestigator.com/2011/05/13/einstein-simple/ |title=Everything Should Be Made as Simple as Possible, But Not Simpler |date=13 May 2011 |url-status=live |archive-url=https://web.archive.org/web/20120529075018/http://quoteinvestigator.com/2011/05/13/einstein-simple/ |archive-date=29 May 2012}}</ref> reduces this to "Everything should be kept as simple as possible, but not simpler." In the [[scientific method]], Occam's razor is not considered an irrefutable principle of [[logic]] or a scientific result; the preference for simplicity in the scientific method is based on the [[falsifiability]] criterion. 
For each accepted explanation of a phenomenon, there may be an extremely large, perhaps even incomprehensible, number of possible and more complex alternatives. Since failing explanations can always be burdened with [[Ad hoc hypothesis|''ad hoc'' hypotheses]] to prevent them from being falsified, simpler theories are preferable to more complex ones because they tend to be more [[test method|testable]].<ref name="fn_(109)">{{Cite book |last=Alan Baker |title=Stanford Encyclopedia of Philosophy |publisher=Stanford University |year=2010 |location=California |chapter=Simplicity |chapter-url=http://plato.stanford.edu/entries/simplicity/ |orig-year=2004 |access-date=22 January 2005 |archive-date=26 March 2014 |archive-url=https://web.archive.org/web/20140326180129/http://plato.stanford.edu/entries/simplicity/ |url-status=live }}</ref><ref name="fn_(110)">{{Cite journal |last1=Courtney |first1=A. |last2=Courtney |first2=M. |year=2008 |title=Comments Regarding 'On the Nature of Science' |journal=Physics in Canada |volume=64 |issue=3 |pages=7–8 |arxiv=0812.4932 |bibcode=2008arXiv0812.4932C}}</ref><ref name="fn_(114)">{{Cite book |last=Sober |first=Elliott |title=Explanation and Its Limits |publisher=Cambridge University Press |year=1994 |editor-last=Knowles |editor-first=Dudley |pages=73–93 |chapter=Let's Razor Occam's Razor}}</ref> As a logical principle, Occam's razor would demand that scientists accept the simplest possible theoretical explanation for existing data. However, science has shown repeatedly that future data often support more complex theories than do existing data. 
Science prefers the simplest explanation that is consistent with the data available at a given time, but the simplest explanation may be ruled out as new data become available.<ref name="fn_(100)" /><ref name="fn_(110)" /> That is, science is open to the possibility that future experiments might support more complex theories than demanded by current data and is more interested in designing experiments to discriminate between competing theories than favoring one theory over another based merely on philosophical principles.<ref name="fn_(109)" /><ref name="fn_(110)" /><ref name="fn_(114)" /> When scientists use the idea of parsimony, it has meaning only in a very specific context of inquiry. Several background assumptions are required for parsimony to connect with plausibility in a particular research problem.{{Clarify | date = February 2021 | reason = This sentence is so vague/abstract that it seems to add very little to the discussion. For example, try dropping it, there doesn't seem to be much lost.}} The reasonableness of parsimony in one research context may have nothing to do with its reasonableness in another. It is a mistake to think that there is a single global principle that spans diverse subject matter.<ref name="fn_(114)" /> It has been suggested that Occam's razor is a widely accepted example of extraevidential consideration, even though it is entirely a metaphysical assumption. Most of the time, however, Occam's razor is a conservative tool, cutting out "crazy, complicated constructions" and assuring "that hypotheses are grounded in the science of the day", thus yielding "normal" science: models of explanation and prediction.<ref name="fn_(101)" /> There are, however, notable exceptions where Occam's razor turns a conservative scientist into a reluctant revolutionary. 
For example, [[Max Planck]] interpolated between the [[Wien approximation|Wien]] and [[Rayleigh–Jeans law|Jeans]] radiation laws and used Occam's razor logic to formulate the quantum hypothesis, even resisting that hypothesis as it became more obvious that it was correct.<ref name="fn_(101)" /> Appeals to simplicity were used to argue against the phenomena of meteorites, [[ball lightning]], [[continental drift]], and [[reverse transcriptase]].<ref>{{Cite journal |last1=Rabinowitz |first1=Matthew |last2=Myers |first2=Lance |last3=Banjevic |first3=Milena |last4=Chan |first4=Albert |last5=Sweetkind-Singer |first5=Joshua |last6=Haberer |first6=Jessica |last7=McCann |first7=Kelly |last8=Wolkowicz |first8=Roland |date=1 March 2006 |title=Accurate prediction of HIV-1 drug response from the reverse transcriptase and protease amino acid sequences using sparse models created by convex optimization |journal=Bioinformatics |language=en |volume=22 |issue=5 |pages=541–549 |doi=10.1093/bioinformatics/btk011 |pmid=16368772|doi-access=free }}</ref> One can argue for atomic building blocks for matter, because it provides a simpler explanation for the observed reversibility of both {{Clarify | text = mixing | date = February 2021 | reason = Mixing of what? }} and chemical reactions as simple separation and rearrangements of atomic building blocks. At the time, however, the [[atomic theory]] was considered more complex because it implied the existence of invisible particles that had not been directly detected. 
[[Ernst Mach]] and the logical positivists rejected [[John Dalton]]'s [[atomic theory]] until the reality of atoms was more evident in [[Brownian motion]], as shown by [[Albert Einstein]].<ref name="Pojman2009">{{Cite book |title=The Stanford Encyclopedia of Philosophy |last=Paul Pojman |publisher=Stanford University |year=2009 |location=California |chapter=Ernst Mach |chapter-url=http://plato.stanford.edu/entries/ernst-mach/ |access-date=4 October 2009 |archive-date=11 November 2020 |archive-url=https://web.archive.org/web/20201111231039/https://plato.stanford.edu/entries/ernst-mach/ |url-status=live }}</ref> In the same way, postulating the [[Luminiferous aether|aether]] is more complex than transmission of light through a [[vacuum]]. At the time, however, all known waves propagated through a physical medium, and it seemed simpler to postulate the existence of a medium than to theorize about wave propagation without a medium. Likewise, [[Isaac Newton]]'s idea of light particles seemed simpler than [[Christiaan Huygens]]'s idea of waves, so many favored it. In this case, as it turned out, neither the wave—nor the particle—explanation alone suffices, as [[wave–particle duality|light behaves like waves and like particles]]. Three axioms presupposed by the scientific method are realism (the existence of objective reality), the existence of natural laws, and the constancy of natural law. Rather than depend on provability of these axioms, science depends on the fact that they have not been objectively falsified. Occam's razor and parsimony support, but do not prove, these axioms of science. The general principle of science is that theories (or models) of natural law must be consistent with repeatable experimental observations. 
This ultimate arbiter (selection criterion) rests upon the axioms mentioned above.<ref name="fn_(110)" />

If multiple models of natural law make exactly the same testable predictions, they are equivalent and there is no need for parsimony to choose a preferred one. For example, [[Newtonian mechanics|Newtonian]], [[Hamiltonian mechanics|Hamiltonian]] and [[Lagrangian mechanics|Lagrangian]] classical mechanics are equivalent. Physicists have no interest in using Occam's razor to declare one of them correct and the other two wrong. Likewise, there is no demand for simplicity principles to arbitrate between wave and matrix formulations of quantum mechanics. Science often does not demand arbitration or selection criteria between models that make the same testable predictions.<ref name="fn_(110)" />

=== Biology ===
{{Citation style|date=January 2023|section}}
Biologists or philosophers of biology use Occam's razor in either of two contexts, both in [[evolution|evolutionary biology]]: the units of selection controversy and [[systematics]]. [[George C. Williams (biologist)|George C. Williams]], in his book ''[[Adaptation and Natural Selection]]'' (1966), argues that the best way to explain [[altruism]] among animals is based on low-level (i.e., individual) selection as opposed to high-level group selection. Altruism is defined by some evolutionary biologists (e.g., R. Alexander, 1987; W. D. Hamilton, 1964) as behavior that is beneficial to others (or to the group) at a cost to the individual, and many posit individual selection as the mechanism that explains altruism solely in terms of the behaviors of individual organisms acting in their own self-interest (or in the interest of their genes, via kin selection). Williams was arguing against the perspective of others who propose selection at the level of the group as an evolutionary mechanism that selects for altruistic traits (e.g., D. S. Wilson & E. O. Wilson, 2007).
The basis for Williams's contention is that, of the two, individual selection is the more parsimonious theory. In doing so he is invoking a variant of Occam's razor known as [[Morgan's Canon]]: "In no case is an animal activity to be interpreted in terms of higher psychological processes, if it can be fairly interpreted in terms of processes which stand lower in the scale of psychological evolution and development." (Morgan 1903).

However, more recent biological analyses, such as [[Richard Dawkins]]'s ''[[The Selfish Gene]]'', have contended that Morgan's Canon is not the simplest and most basic explanation. Dawkins argues that evolution works through the genes that are propagated in the most copies, which end up determining the development of the species; that is, natural selection selects specific genes, and this is the fundamental underlying principle from which individual and group selection follow automatically as [[Emergent evolution|emergent]] features of evolution.

[[Zoology]] provides an example. [[Muskox]]en, when threatened by [[Gray wolf|wolves]], form a circle with the males on the outside and the females and young on the inside. This is an example of a behavior by the males that seems to be altruistic. The behavior is disadvantageous to them individually but beneficial to the group as a whole; thus, it was seen by some to support the group selection theory. Another interpretation is kin selection: if the males are protecting their offspring, they are protecting copies of their own alleles. Engaging in this behavior would be favored by individual selection if the cost to the male muskox is less than half of the benefit received by his calf, which could easily be the case if wolves have an easier time killing calves than adult males.
It could also be the case that male muskoxen would be individually less likely to be killed by wolves if they stood in a circle with their horns pointing out, regardless of whether they were protecting the females and offspring. That would be an example of regular natural selection, a phenomenon called "the selfish herd".

[[Systematics]] is the branch of [[biology]] that attempts to establish patterns of relationship among biological taxa, today generally thought to reflect evolutionary history. It is also concerned with their classification. There are three primary camps in systematics: cladists, pheneticists, and evolutionary taxonomists. Cladists hold that classification should be based on [[synapomorphies]] (shared, derived character states), pheneticists contend that overall similarity (synapomorphies and complementary [[symplesiomorphies]]) is the determining criterion, while evolutionary taxonomists say that both genealogy and similarity count in classification (in a manner determined by the evolutionary taxonomist).<ref>{{Cite book |title=Reconstructing the Past: Parsimony, Evolution, and Inference |last=Sober |first=Elliot |date=1998 |publisher=The MIT Press |isbn=978-0-262-69144-4 |edition=2nd |location=Massachusetts Institute of Technology |page=7}}</ref><ref>{{Cite book |title=Phylogenetics: the theory and practice of phylogenetic systematics |last=Wiley |first=Edward O. |date=2011 |edition=2nd |publisher=Wiley-Blackwell |isbn=978-0-470-90596-8}}</ref>

It is among the cladists that Occam's razor is applied, through the method of ''cladistic parsimony''. Cladistic parsimony (or [[maximum parsimony]]) is a method of phylogenetic inference that yields [[phylogenetic tree]]s (more specifically, cladograms). [[Cladistics|Cladograms]] are branching diagrams used to represent hypotheses of relative degree of relationship, based on [[synapomorphies]].
Cladistic parsimony is used to select as the preferred hypothesis of relationships the cladogram that requires the fewest implied character state transformations (or smallest weight, if characters are differentially weighted). Critics of the cladistic approach often observe that for some types of data, parsimony could produce the wrong results, regardless of how much data is collected (this is called statistical inconsistency, or [[long branch attraction]]). However, this criticism is also potentially true for any type of phylogenetic inference, unless the model used to estimate the tree reflects the way that evolution actually happened. Because this information is not empirically accessible, the criticism of statistical inconsistency against parsimony holds no force.<ref>{{cite journal | last1 = Brower | first1 = AVZ | year = 2017 | title = Statistical consistency and phylogenetic inference: a brief review | journal = Cladistics | volume = 34| issue = 5| pages = 562–567| doi = 10.1111/cla.12216 | pmid = 34649374 | doi-access = free }}</ref> For a book-length treatment of cladistic parsimony, see [[Elliott Sober]]'s ''Reconstructing the Past: Parsimony, Evolution, and Inference'' (1988). For a discussion of both uses of Occam's razor in biology, see Sober's article "Let's Razor Ockham's Razor" (1990). Other methods for inferring evolutionary relationships use parsimony in a more general way. [[Likelihood function|Likelihood]] methods for phylogeny use parsimony as they do for all likelihood tests, with hypotheses requiring fewer differing parameters (i.e., numbers or different rates of character change or different frequencies of character state transitions) being treated as null hypotheses relative to hypotheses requiring more differing parameters. Thus, complex hypotheses must predict data much better than do simple hypotheses before researchers reject the simple hypotheses. 
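
The cladistic parsimony criterion described above (prefer the cladogram that implies the fewest character state transformations) can be illustrated with a small sketch. It counts changes with Fitch's algorithm, which assumes unordered, equally weighted characters on a rooted binary tree; the taxa, the character matrix and the function names are hypothetical and chosen only for illustration.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree   -- nested 2-tuples whose leaves are taxon names (strings)
    states -- dict mapping each taxon name to its observed character state
    """
    changes = 0

    def state_set(node):
        nonlocal changes
        if isinstance(node, str):      # leaf: the observed state
            return {states[node]}
        left, right = node
        a, b = state_set(left), state_set(right)
        if a & b:                      # children share a possible state
            return a & b
        changes += 1                   # no shared state: one transformation implied
        return a | b

    state_set(tree)
    return changes


def tree_length(tree, matrix):
    """Total implied changes over all characters; matrix maps taxon -> state string."""
    n_characters = len(next(iter(matrix.values())))
    return sum(
        fitch_changes(tree, {taxon: row[i] for taxon, row in matrix.items()})
        for i in range(n_characters)
    )


# Hypothetical data: four taxa scored for three binary characters.
matrix = {"A": "110", "B": "110", "C": "011", "D": "001"}
candidates = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
}
lengths = {name: tree_length(tree, matrix) for name, tree in candidates.items()}
preferred = min(lengths, key=lengths.get)
print(lengths, "->", preferred)
```

For these made-up data the first cladogram implies three transformations and the second five, so parsimony prefers the first. Real analyses search over far more trees and may weight characters differentially, which this sketch does not attempt.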
Recent advances employ [[information theory]], a close cousin of likelihood, which uses Occam's razor in the same way. The choice of the "shortest tree" relative to a not-so-short tree under any optimality criterion (smallest distance, fewest steps, or maximum likelihood) is always based on parsimony.<ref>{{cite book |title=Biological Systematics: Principles and Applications |edition=3rd |last1=Brower |first1=Andrew V. Z. |last2=Schuh |first2=Randall T. |date=2021 |publisher=Cornell University Press}}</ref>

[[Francis Crick]] has commented on potential limitations of Occam's razor in biology. He advances the argument that because biological systems are the products of (ongoing) natural selection, the mechanisms are not necessarily optimal in an obvious sense. He cautions: "While Ockham's razor is a useful tool in the physical sciences, it can be a very dangerous implement in biology. It is thus very rash to use simplicity and elegance as a guide in biological research."<ref>Crick 1988, p. 146.</ref> This is an ontological critique of parsimony.

In [[biogeography]], parsimony is used to infer ancient vicariant events or [[Historical migration|migrations]] of [[species]] or [[population]]s by observing the geographic distribution and relationships of existing [[organism]]s. Given the phylogenetic tree, ancestral population subdivisions are inferred to be those that require the minimum amount of change.{{citation needed|date=March 2024}}

=== Religion ===
{{Main|Existence of God}}
In the [[philosophy of religion]], Occam's razor is sometimes applied to the existence of God. William of Ockham himself was a [[Christianity|Christian]].
He believed in God, and in the [[Biblical authority|authority]] of [[Christian scripture]]; he writes that "nothing ought to be posited without a reason given, unless it is self-evident (literally, known through itself) or known by experience or proved by the authority of Sacred Scripture."<ref>{{Cite encyclopedia |title=Encyclopedia of Philosophy |publisher=Stanford |access-date=24 February 2016 |contribution-url=http://plato.stanford.edu/entries/ockham/ |contribution=William Ockham |archive-date=7 October 2019 |archive-url=https://web.archive.org/web/20191007132502/https://plato.stanford.edu/entries/ockham/ |url-status=live }}</ref> Ockham believed that an explanation has no sufficient basis in reality when it does not harmonize with reason, experience, or the [[Christian Bible|Bible]]. Unlike many theologians of his time, though, Ockham did not believe God could be logically proven with arguments. To Ockham, science was a matter of discovery; [[theology]] was a matter of [[revelation]] and [[faith]]. He states: "Only faith gives us access to theological truths. The ways of God are not open to reason, for God has freely chosen to create a world and establish a way of salvation within it apart from any necessary laws that human logic or rationality can uncover."<ref>Dale T Irvin & Scott W Sunquist. ''History of World Christian Movement Volume, I: Earliest Christianity to 1453'', p. 434. {{ISBN|9781570753961}}.</ref> [[Thomas Aquinas]], in the ''[[Summa Theologica]]'', uses a formulation of Occam's razor to construct an objection to the idea that God exists, which he refutes directly with a counterargument:<ref>{{Cite web |url=http://www.newadvent.org/summa/1002.htm |title=SUMMA THEOLOGICA: The existence of God (Prima Pars, Q. 
2) |publisher=Newadvent.org |url-status=live |archive-url=https://web.archive.org/web/20130428053715/http://www.newadvent.org/summa/1002.htm |archive-date=28 April 2013 |access-date=26 March 2013}}</ref> <blockquote>Further, it is superfluous to suppose that what can be accounted for by a few principles has been produced by many. But it seems that everything we see in the world can be accounted for by other principles, supposing God did not exist. For all natural things can be reduced to one principle which is nature; and all voluntary things can be reduced to one principle which is human reason, or will. Therefore there is no need to suppose God's existence.</blockquote> In turn, Aquinas answers this with the ''[[quinque viae]]'', and addresses the particular objection above with the following answer: <blockquote>Since nature works for a determinate end under the direction of a higher agent, whatever is done by nature must needs be traced back to God, as to its first cause. So also whatever is done voluntarily must also be traced back to some higher cause other than human reason or will, since these can change or fail; for all things that are changeable and capable of defect must be traced back to an immovable and self-necessary first principle, as was shown in the body of the Article.</blockquote> Rather than argue for the necessity of a god, some [[Theism|theists]] base their belief upon grounds independent of, or prior to, reason, making Occam's razor irrelevant. This was the stance of [[Søren Kierkegaard]], who viewed belief in God as a [[leap of faith]] that sometimes directly opposed reason.<ref>McDonald 2005.</ref> This is also the doctrine of [[Gordon Clark]]'s [[presuppositional apologetics]], with the exception that Clark never thought the leap of faith was contrary to reason (see also [[Fideism]]). Various [[Arguments for the existence of God|arguments in favor of God]] establish God as a useful or even necessary assumption. 
Contrastingly some anti-theists hold firmly to the belief that assuming the existence of God introduces unnecessary complexity (e.g., the [[Ultimate Boeing 747 gambit]] from Dawkins's ''[[The God Delusion]]''<ref>{{Cite book |last=Dawkins |first=Richard |title=The God delusion |date=January 1, 2007 |publisher=Black Swan |isbn=978-0-552-77331-7 |location=London |pages=157–158}}</ref>).<ref>{{Cite book |last1=Schmitt |first1=Carl |url=http://dx.doi.org/10.7208/chicago/9780226738901.001.0001 |title=Political Theology |last2=Schwab |first2=George |last3=Strong |first3=Tracy B. |date=2005 |publisher=University of Chicago Press |doi=10.7208/chicago/9780226738901.001.0001 |isbn=978-0-226-73889-5}}</ref> Another application of the principle is to be found in the work of [[George Berkeley]] (1685–1753). Berkeley was an idealist who believed that all of reality could be explained in terms of the mind alone. He invoked Occam's razor against [[materialism]], stating that matter was not required by his metaphysics and was thus eliminable. One potential problem with this belief{{For whom | date = February 2021 }} is that it's possible, given Berkeley's position, to find [[solipsism]] itself more in line with the razor than a God-mediated world beyond a single thinker. Occam's razor may also be recognized in the apocryphal story about an exchange between [[Pierre-Simon Laplace]] and [[Napoleon]]. It is said that in praising Laplace for one of his recent publications, the emperor asked how it was that the name of God, which featured so frequently in the writings of [[Lagrange]], appeared nowhere in Laplace's. At that, he is said to have replied, "It's because I had no need of that hypothesis."<ref>p. 282, [https://books.google.com/books?id=88xZAAAAcAAJ ''Mémoires du docteur F. Antommarchi, ou les derniers momens de Napoléon''] {{webarchive|url=https://web.archive.org/web/20160514072842/https://books.google.com/books?id=88xZAAAAcAAJ |date=14 May 2016 }}, vol. 
1, 1825, Paris: Barrois L'Ainé</ref> Though some points of this story illustrate Laplace's [[atheism]], more careful consideration suggests that he may instead have intended merely to illustrate the power of [[methodological naturalism]], or even simply that the fewer [[premise|logical premises]] one assumes, the [[List of mathematical jargon#strong|stronger]] is one's conclusion.

=== Philosophy of mind ===
In his article "Sensations and Brain Processes" (1959), [[J. J. C. Smart]] invoked Occam's razor to justify his preference for the [[mind-brain identity theory]] over [[mind-body dualism|spirit-body dualism]]. Dualists state that there are two kinds of substances in the universe: physical (including the body) and spiritual, which is non-physical. In contrast, identity theorists state that everything is physical, including consciousness, and that there is nothing nonphysical. Though it is impossible to appreciate the spiritual when limiting oneself to the physical,{{Citation needed|date=November 2020}} Smart maintained that identity theory explains all phenomena by assuming only a physical reality. Subsequently, Smart was severely criticized for his use (or misuse) of Occam's razor and ultimately retracted his advocacy of it in this context. [[Paul Churchland]] (1984) states that by itself Occam's razor is inconclusive regarding dualism. In a similar way, Dale Jacquette (1994) stated that Occam's razor has been used in attempts to justify eliminativism and reductionism in the philosophy of mind. Eliminativism is the thesis that the ontology of [[folk psychology]], including such entities as "pain", "joy", "desire", and "fear", is eliminable in favor of an ontology of a completed neuroscience.

=== Penal ethics ===
In penal theory and the philosophy of punishment, parsimony refers specifically to taking care in the distribution of [[punishment]] in order to avoid excessive punishment.
In the [[utilitarianism|utilitarian]] approach to the philosophy of punishment, [[Jeremy Bentham]]'s "parsimony principle" states that any punishment greater than is required to achieve its end is unjust. The concept is related but not identical to the legal concept of [[proportionality (law)|proportionality]]. Parsimony is a key consideration of modern [[restorative justice]], and is a component of utilitarian approaches to punishment, as well as the [[prison abolition movement]]. Bentham believed that true parsimony would require punishment to be individualised to take account of the [[sensibility]] of the individual: an individual more sensitive to punishment should be given a proportionately lesser one, since otherwise needless pain would be inflicted. Later utilitarian writers have tended to abandon this idea, in large part due to the impracticality of determining each alleged criminal's relative sensitivity to specific punishments.<ref>{{Cite journal |last=Tonry |first=Michael |year=2005 |title=Obsolescence and Immanence in Penal Theory and Policy |url=http://www.columbialawreview.org/pdf/Tonry-Web.pdf |journal=[[Columbia Law Review]] |volume=105 |pages=1233–1275|archive-url=https://web.archive.org/web/20060623074821/http://www.columbialawreview.org/pdf/Tonry-Web.pdf |archive-date=23 June 2006 }}</ref>

=== Probability theory and statistics ===
Marcus Hutter's universal artificial intelligence builds upon [[Solomonoff's theory of inductive inference|Solomonoff's mathematical formalization of the razor]] to calculate the expected value of an action.

Various papers in scholarly journals derive formal versions of Occam's razor from probability theory, apply it in [[statistical inference]], and use it to develop criteria for penalizing complexity in statistical inference. Papers<ref name="ReferenceC">{{Cite journal |last1=Wallace |first1=C. S. |last2=Boulton |first2=D. M.
|date=1968-08-01 |title=An Information Measure for Classification |url=https://academic.oup.com/comjnl/article-lookup/doi/10.1093/comjnl/11.2.185 |journal=The Computer Journal |language=en |volume=11 |issue=2 |pages=185–194 |doi=10.1093/comjnl/11.2.185 }}</ref><ref name="auto">{{Cite journal |last=Wallace |first=C. S. |date=1999-04-01 |title=Minimum Message Length and Kolmogorov Complexity |url=https://www.csse.monash.edu/~dld/Publications/1999/WallaceDowe1999aMinimumMessageLengthAndKolmogorovComplexity.pdf |journal=The Computer Journal |language=en |volume=42 |issue=4 |pages=270–283 |doi=10.1093/comjnl/42.4.270 }}</ref> have suggested a connection between Occam's razor and [[Kolmogorov complexity]].<ref name="Volker">{{Cite web |url=http://volker.nannen.com/pdf/short_introduction_to_model_selection.pdf |title=A short introduction to Model Selection, Kolmogorov Complexity and Minimum Description Length |last=Nannen |first=Volker |url-status=live |archive-url=https://web.archive.org/web/20100602044851/http://volker.nannen.com/pdf/short_introduction_to_model_selection.pdf |archive-date=2 June 2010 |access-date=3 July 2010}}</ref> One of the problems with the original formulation of the razor is that it only applies to models with the same explanatory power (i.e., it only tells us to prefer the simplest of equally good models). A more general form of the razor can be derived from Bayesian model comparison, which is based on [[Bayes factor]]s and can be used to compare models that do not fit the observations equally well. These methods can sometimes optimally balance the complexity and power of a model. Generally, the exact Occam factor is intractable, but approximations such as [[Akaike information criterion]], [[Bayesian information criterion]], [[Variational Bayesian methods]], [[false discovery rate]], and [[Laplace's method]] are used. 
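
As a rough illustration of how such approximations penalize complexity, the following sketch scores two models of a coin-flip experiment with the Akaike and Bayesian information criteria, using the standard definitions AIC = 2''k'' − 2 ln ''L'' and BIC = ''k'' ln ''n'' − 2 ln ''L''. The coin-flip setting, the counts, and the function names are assumptions made for this example and are not taken from the cited papers.

```python
import math


def bernoulli_log_likelihood(heads, n, p):
    """Log-likelihood of `heads` successes in `n` independent trials with success probability p."""
    tails = n - heads
    log_l = 0.0
    if heads:
        log_l += heads * math.log(p)
    if tails:
        log_l += tails * math.log(1.0 - p)
    return log_l


def aic(log_l, k):
    """Akaike information criterion for k free parameters (lower is better)."""
    return 2 * k - 2 * log_l


def bic(log_l, k, n):
    """Bayesian information criterion for k free parameters and n observations."""
    return k * math.log(n) - 2 * log_l


def score_models(heads, n):
    """Return {model name: (AIC, BIC)} for a fixed fair coin and a fitted coin."""
    fair = bernoulli_log_likelihood(heads, n, 0.5)          # no free parameter
    fitted = bernoulli_log_likelihood(heads, n, heads / n)  # one fitted parameter
    return {
        "fair": (aic(fair, 0), bic(fair, 0, n)),
        "fitted": (aic(fitted, 1), bic(fitted, 1, n)),
    }


mild = score_models(54, 100)    # hypothetical counts: slight excess of heads
strong = score_models(70, 100)  # hypothetical counts: large excess of heads
print(mild)
print(strong)
```

With 54 heads in 100 flips the fitted model fits slightly better, but not by enough to pay for its extra parameter, so both criteria favor the fair coin. With 70 heads the improvement in fit outweighs the penalty and both criteria favor the fitted model.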
Many [[artificial intelligence]] researchers now employ such techniques, for instance through work on [[Occam Learning]] or more generally on the [[Free energy principle]].

Statistical versions of Occam's razor have a more rigorous formulation than those produced by philosophical discussions. In particular, they must have a specific definition of the term ''simplicity'', and that definition can vary. For example, in the [[Andrey Kolmogorov|Kolmogorov]]–[[Gregory Chaitin|Chaitin]] [[minimum description length]] approach, the subject must pick a [[Turing machine]] whose operations describe the basic operations that the subject ''believes'' represent "simplicity". However, one could always choose a Turing machine with a simple operation that happened to construct one's entire theory and would hence score highly under the razor. This has led to two opposing camps: one that believes Occam's razor is objective, and one that believes it is subjective.

==== Objective razor ====

The minimum instruction set of a [[universal Turing machine]] requires approximately the same length description across different formulations, and is small compared to the [[Kolmogorov complexity]] of most practical theories.
[[Marcus Hutter]] has used this consistency to define a "natural" Turing machine of small size as the proper basis for excluding arbitrarily complex instruction sets in the formulation of razors.<ref>{{Cite web |url=http://www.hutter1.net/ait.htm |title=Algorithmic Information Theory |url-status=live |archive-url=https://web.archive.org/web/20071224043538/http://www.hutter1.net/ait.htm |archive-date=24 December 2007}}</ref> With the program for the universal machine taken as the "hypothesis" and the representation of the evidence as program data, it has been formally proven under [[Zermelo–Fraenkel set theory]] that "the sum of the log universal probability of the model plus the log of the probability of the data given the model should be minimized."<ref>{{Cite journal |last1=Vitanyi |first1=P.M.B. |last2=Ming Li |date=March 2000 |title=Minimum description length induction, Bayesianism, and Kolmogorov complexity |url=https://ieeexplore.ieee.org/document/825807 |journal=IEEE Transactions on Information Theory |volume=46 |issue=2 |pages=446–464 |doi=10.1109/18.825807|arxiv=cs/9901014 }}</ref> Interpreting this as minimising the total length of a two-part message that encodes the model followed by the data given the model yields the [[minimum message length]] (MML) principle.<ref name="ReferenceC" /><ref name="auto" />

One possible conclusion from combining the concepts of Kolmogorov complexity and Occam's razor is that an ideal data compressor would also be a generator of scientific explanations or formulations.
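
The two-part message idea can be sketched with a toy example. The code below is an illustrative simplification, not the MML construction of Wallace and Boulton: it assumes a binary sequence, a Bernoulli model whose parameter is stated to a chosen number of bits, and it ignores the small cost of telling the receiver which precision was used. All function names are hypothetical.

```python
import math


def data_bits(bits, p):
    """Second part of the message: -log2 P(data | model) for a Bernoulli(p) model."""
    ones = sum(bits)
    zeros = len(bits) - ones
    return -(ones * math.log2(p) + zeros * math.log2(1.0 - p))


def two_part_length(bits, precision):
    """Total length in bits: model statement plus data given the model.

    precision == 0 is the parameter-free model p = 0.5. Otherwise p is
    rounded to the midpoint of one of 2**precision equal cells of [0, 1],
    so stating it costs `precision` bits and p is never exactly 0 or 1.
    """
    if precision == 0:
        return data_bits(bits, 0.5)
    cells = 2 ** precision
    p_hat = sum(bits) / len(bits)
    index = min(int(p_hat * cells), cells - 1)
    p_quantized = (index + 0.5) / cells
    return precision + data_bits(bits, p_quantized)


def best_precision(bits, max_precision=8):
    """Return (precision, total_bits) minimising the two-part message length."""
    lengths = {b: two_part_length(bits, b) for b in range(max_precision + 1)}
    chosen = min(lengths, key=lengths.get)
    return chosen, lengths[chosen]


if __name__ == "__main__":
    skewed = [1] * 90 + [0] * 10
    balanced = [1, 0] * 50
    print(best_precision(skewed))
    print(best_precision(balanced))
```

Under these assumptions, a strongly skewed sequence is shortest when a coarse parameter is stated first, because the bits spent on the model are repaid by a shorter encoding of the data. A balanced sequence is shortest with no parameter at all, which is the compression reading of preferring the simpler hypothesis.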
Some attempts have been made to re-derive known laws from considerations of simplicity or compressibility.<ref name="ReferenceB" /><ref>{{Cite journal |last=Standish |first=Russell K |year=2004 |title=Why Occam's Razor |journal=Foundations of Physics Letters |volume=17 |issue=3 |pages=255–266 |arxiv=physics/0001020 |bibcode=2004FoPhL..17..255S |doi=10.1023/B:FOPL.0000032475.18334.0e|s2cid=17143230 }}</ref>

According to [[Jürgen Schmidhuber]], the appropriate mathematical theory of Occam's razor already exists, namely, [[Ray Solomonoff|Solomonoff's]] [[Solomonoff's theory of inductive inference|theory of optimal inductive inference]]<ref>{{Cite journal |last=Solomonoff |first=Ray |author-link=Ray Solomonoff |year=1964 |title=A formal theory of inductive inference. Part I. |journal=Information and Control |volume=7 |issue=1 |pages=1–22 |doi=10.1016/s0019-9958(64)90223-2|doi-access=free }}</ref> and its extensions.<ref>{{Cite book |title=Artificial General Intelligence |last=Schmidhuber |first=J. |year=2006 |editor-last=Goertzel |editor-first=B. |pages=177–200 |chapter=The New AI: General & Sound & Relevant for Physics |arxiv=cs.AI/0302012 |author-link=Jürgen Schmidhuber |editor-last2=Pennachin |editor-first2=C.}}</ref> See discussions in David L. Dowe's "Foreword re C. S. Wallace"<ref>{{Cite journal |last=Dowe |first=David L. |year=2008 |title=Foreword re C. S. Wallace |journal=Computer Journal |volume=51 |issue=5 |pages=523–560 |doi=10.1093/comjnl/bxm117|s2cid=5387092 }}</ref> for the subtle distinctions between the [[algorithmic probability]] work of Solomonoff and the MML work of [[Chris Wallace (computer scientist)|Chris Wallace]], and see Dowe's "MML, hybrid Bayesian network graphical models, statistical consistency, invariance and uniqueness"<ref>David L. Dowe (2010): "MML, hybrid Bayesian network graphical models, statistical consistency, invariance and uniqueness. A formal theory of inductive inference."
''Handbook of the Philosophy of Science''{{spaced ndash}}(HPS Volume 7) Philosophy of Statistics, Elsevier 2010 Page(s):901–982. https://web.archive.org/web/20140204001435/http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.185.709&rep=rep1&type=pdf</ref> both for such discussions and for (in section 4) discussions of MML and Occam's razor. For a specific example of MML as Occam's razor in the problem of decision tree induction, see Dowe and Needham's "Message Length as an Effective Ockham's Razor in Decision Tree Induction".<ref>Scott Needham and David L. Dowe (2001): "Message Length as an Effective Ockham's Razor in Decision Tree Induction." Proc. 8th International Workshop on Artificial Intelligence and Statistics (AI+STATS 2001), Key West, Florida, U.S.A., January 2001 Page(s): 253–260 {{Cite web |url=http://www.csse.monash.edu.au/~dld/Publications/2001/Needham+Dowe2001_Ockham.pdf |title=2001 Ockham.pdf |url-status=live |archive-url=https://web.archive.org/web/20150923211645/http://www.csse.monash.edu.au/~dld/Publications/2001/Needham+Dowe2001_Ockham.pdf |archive-date=23 September 2015 |access-date=2 September 2015}}</ref>

==== Mathematical arguments against Occam's razor ====
{{Technical|date=February 2024|section}}

The [[No free lunch theorem|no free lunch]] (NFL) theorems for inductive inference prove that Occam's razor must rely on ultimately arbitrary assumptions concerning the prior probability distribution found in our world.<ref name="Adam2019">Adam, S., and Pardalos, P.
(2019), [https://www.researchgate.net/profile/Stamatios-Aggelos-Alexandropoulos-2/publication/333007007_No_Free_Lunch_Theorem_A_Review/links/5e84f65792851c2f52742c85/No-Free-Lunch-Theorem-A-Review.pdf No-free lunch Theorem: A review], in "Approximation and Optimization", Springer, 57–82</ref> Specifically, suppose one is given two inductive inference algorithms, A and B, where A is a [[Bayesian inference|Bayesian]] procedure based on the choice of some prior distribution motivated by Occam's razor (e.g., the prior might favor hypotheses with smaller [[Kolmogorov complexity]]). Suppose that B is the anti-Bayes procedure, which calculates what the Bayesian algorithm A will predict and then predicts the exact opposite. Then there are just as many actual priors (including those different from the Occam's razor prior assumed by A) under which algorithm B outperforms A as there are priors under which A comes out on top. In particular, the NFL theorems show that the "Occam factors" Bayesian argument for Occam's razor must make ultimately arbitrary modeling assumptions.<ref name="WOLP95">Wolpert, D. H. (1995), On the Bayesian "Occam Factors" Argument for Occam's Razor, in "Computational Learning Theory and Natural Learning Systems: Selecting Good Models", MIT Press</ref>

=== Software development ===

In software development, the [[rule of least power]] holds that the correct [[programming language]] to use is the simplest one that still solves the targeted software problem.
In that form the rule is often credited to [[Tim Berners-Lee]] since it appeared in his design guidelines for the original [[Hypertext Transfer Protocol]].<ref>{{Cite web |url=https://www.w3.org/DesignIssues/Principles.html |first=Tim |last=Berners-Lee |author-link=Tim Berners-Lee |date=4 March 2013 |title=Principles of Design |website=[[World Wide Web Consortium]] |access-date=5 June 2022 |archive-date=15 June 2022 |archive-url=https://web.archive.org/web/20220615065514/https://www.w3.org/DesignIssues/Principles.html |url-status=live }}</ref>

Complexity in this context is measured either by placing a language into the [[Chomsky hierarchy]] or by listing idiomatic features of the language and comparing them according to some agreed-upon scale of difficulty between idioms. Many languages once thought to be of lower complexity have evolved, or have later been discovered, to be more complex than originally intended; in practice, therefore, the rule is applied to the relative ease with which a programmer can harness the power of the language, rather than to the precise theoretical limits of the language.

=== Artificial intelligence ===

Researchers have found that [[deep neural network]]s (DNNs) are biased toward simpler mathematical functions during learning. This simplicity bias helps DNNs avoid [[overfitting]], a scenario in which a model with too many parameters ends up fitting noise in the data.<ref>{{Cite journal |last1=Mingard |first1=Chris |last2=Rees |first2=Henry |last3=Valle-Pérez |first3=Guillermo |last4=Louis |first4=Ard A. |date=2025-01-14 |title=Deep neural networks have an inbuilt Occam's razor |journal=Nature Communications |language=en |volume=16 |issue=1 |pages=220 |doi=10.1038/s41467-024-54813-x |pmid=39809746 |issn=2041-1723|pmc=11733143 |arxiv=2304.06670 |bibcode=2025NatCo..16..220M }}</ref>