== Risks of unfriendly AI ==
{{Main|Existential risk from artificial general intelligence}}

The roots of concern about artificial intelligence are very old. Kevin LaGrandeur showed that the dangers specific to AI can be seen in ancient literature concerning artificial humanoid servants such as the [[golem]], or the proto-robots of [[Gerbert of Aurillac]] and [[Roger Bacon]]. In those stories, the extreme intelligence and power of these humanoid creations clash with their status as slaves (who are by nature seen as sub-human), causing disastrous conflict.<ref>{{cite journal|url=https://www.academia.edu/704751|author=Kevin LaGrandeur|title=The Persistent Peril of the Artificial Slave|journal=Science Fiction Studies|year=2011|volume=38|issue=2|page=232|doi=10.5621/sciefictstud.38.2.0232|access-date=2013-05-06|author-link=Kevin LaGrandeur|archive-date=2023-01-13|archive-url=https://web.archive.org/web/20230113152138/https://www.academia.edu/704751|url-status=live}}</ref> By 1942 these themes had prompted [[Isaac Asimov]] to create the "[[Three Laws of Robotics]]"—principles hard-wired into all the robots in his fiction, intended to prevent them from turning on their creators or from allowing humans to come to harm.<ref>{{cite book| title=The Rest of the Robots| chapter-url=https://archive.org/details/restofrobots00asim| chapter-url-access=registration| publisher=Doubleday| year=1964| isbn=0-385-09041-2| chapter=Introduction| author=Isaac Asimov}}</ref>

In modern times, as the prospect of [[Superintelligence|superintelligent AI]] looms nearer, philosopher [[Nick Bostrom]] has said that superintelligent AI systems with goals that are not aligned with human ethics are intrinsically dangerous unless extreme measures are taken to ensure the safety of humanity. He put it this way:

<blockquote>Basically we should assume that a 'superintelligence' would be able to achieve whatever goals it has. Therefore, it is extremely important that the goals we endow it with, and its entire motivation system, is 'human friendly.'</blockquote>

In 2008, [[Eliezer Yudkowsky]] called for the creation of "friendly AI" to mitigate [[existential risk from advanced artificial intelligence]]. He explains: "The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else."<ref>{{cite book |author=[[Eliezer Yudkowsky]] |year=2008 |chapter-url=http://intelligence.org/files/AIPosNegFactor.pdf |chapter=Artificial Intelligence as a Positive and Negative Factor in Global Risk |title=Global Catastrophic Risks |pages=308–345 |editor1=Nick Bostrom |editor2=Milan M. Ćirković |access-date=2013-10-19 |archive-date=2013-10-19 |archive-url=https://web.archive.org/web/20131019182403/http://intelligence.org/files/AIPosNegFactor.pdf |url-status=live }}</ref>

[[Steve Omohundro]] says that, because of the intrinsic nature of goal-driven systems, a sufficiently advanced AI system will, unless explicitly counteracted, exhibit a number of [[Instrumental convergence#Basic AI drives|basic "drives"]], such as resource acquisition, [[self-preservation]], and continuous self-improvement, and that these drives will, "without special precautions", cause the AI to exhibit undesired behavior.<ref>{{cite journal |last=Omohundro |first=S. M. |date=February 2008 |title=The basic AI drives |journal=Artificial General Intelligence |volume=171 |pages=483–492 |citeseerx=10.1.1.393.8356}}</ref><ref>{{cite book|last1=Bostrom|first1=Nick|title=Superintelligence: Paths, Dangers, Strategies|date=2014|publisher=Oxford University Press|location=Oxford|isbn=9780199678112|title-link=Superintelligence: Paths, Dangers, Strategies |chapter=Chapter 7: The Superintelligent Will}}</ref>

[[Alexander Wissner-Gross]] says that AIs driven to maximize their future freedom of action (or causal path entropy) might be considered friendly if their planning horizon is longer than a certain threshold, and unfriendly if it is shorter.<ref>{{cite web | last=Dvorsky | first=George | title=How Skynet Might Emerge From Simple Physics | website=Gizmodo | date=2013-04-26 | url=https://gizmodo.com/how-skynet-might-emerge-from-simple-physics-482402911 | access-date=2021-12-23 | archive-date=2021-10-08 | archive-url=https://web.archive.org/web/20211008105300/https://gizmodo.com/how-skynet-might-emerge-from-simple-physics-482402911 | url-status=live }}</ref><ref>{{cite journal | last1 = Wissner-Gross | first1 = A. D. | author-link1 = Alexander Wissner-Gross | last2 = Freer | first2 = C. E. | author-link2 = Cameron Freer | year = 2013 | title = Causal entropic forces | journal = Physical Review Letters | volume = 110 | issue = 16 | page = 168702 | doi = 10.1103/PhysRevLett.110.168702 | pmid = 23679649 | bibcode = 2013PhRvL.110p8702W | doi-access = free | hdl = 1721.1/79750 | hdl-access = free }}</ref>

Luke Muehlhauser, writing for the [[Machine Intelligence Research Institute]], recommends that [[machine ethics]] researchers adopt what [[Bruce Schneier]] has called the "security mindset": rather than thinking about how a system will work, imagine how it could fail. For instance, he suggests that even an AI that only makes accurate predictions and communicates via a text interface might cause unintended harm.<ref name=MuehlhauserSecurity2013>{{cite web|last1=Muehlhauser|first1=Luke|title=AI Risk and the Security Mindset|url=http://intelligence.org/2013/07/31/ai-risk-and-the-security-mindset/|website=Machine Intelligence Research Institute|access-date=15 July 2014|date=31 Jul 2013|archive-date=19 July 2014|archive-url=https://web.archive.org/web/20140719205835/http://intelligence.org/2013/07/31/ai-risk-and-the-security-mindset/|url-status=live}}</ref>

In 2014, Luke Muehlhauser and Nick Bostrom underlined the need for "friendly AI";<ref name=think13>{{Cite journal|last1=Muehlhauser|first1=Luke|last2=Bostrom|first2=Nick|title=Why We Need Friendly AI|date=2013-12-17|journal=Think|volume=13|issue=36|pages=41–47|doi=10.1017/s1477175613000316|s2cid=143657841|issn=1477-1756}}</ref> nonetheless, the difficulties in designing a "friendly" superintelligence, for instance by programming it with counterfactual moral reasoning, are considerable.<ref name=boyles2019>{{Cite journal|last1=Boyles|first1=Robert James M.|last2=Joaquin|first2=Jeremiah Joven|date=2019-07-23|title=Why friendly AIs won't be that friendly: a friendly reply to Muehlhauser and Bostrom|journal=AI & Society|volume=35|issue=2|pages=505–507|doi=10.1007/s00146-019-00903-0|s2cid=198190745|issn=0951-5666}}</ref><ref>{{Cite journal|last=Chan|first=Berman|date=2020-03-04|title=The rise of artificial intelligence and the crisis of moral passivity|journal=AI & Society|volume=35|issue=4|pages=991–993|language=en|doi=10.1007/s00146-020-00953-9|s2cid=212407078|issn=1435-5655|url=https://philpapers.org/rec/CHATRO-56|access-date=2023-01-21|archive-date=2023-02-10|archive-url=https://web.archive.org/web/20230210114013/https://philpapers.org/rec/CHATRO-56|url-status=live}}</ref>