=== Deep learning ===
[[File:AI hierarchy.svg|thumb|upright]]
[[Deep learning]]<ref name="Deep learning">[[Deep learning]]: {{Harvtxt|Russell|Norvig|2021|loc=chpt. 21}}, {{Harvtxt|Goodfellow|Bengio|Courville|2016}}, {{Harvtxt|Hinton ''et al.''|2016}}, {{Harvtxt|Schmidhuber|2015}}</ref> uses several layers of neurons between the network's inputs and outputs. The multiple layers can progressively extract higher-level features from the raw input. For example, in [[image processing]], lower layers may identify edges, while higher layers may identify the concepts relevant to a human, such as digits, letters, or faces.{{Sfnp|Deng|Yu|2014|pp=199–200}} Deep learning has profoundly improved the performance of programs in many important subfields of artificial intelligence, including [[computer vision]], [[speech recognition]], [[natural language processing]], [[image classification]],{{Sfnp|Ciresan|Meier|Schmidhuber|2012}} and others. As of 2021, the reason that deep learning performs so well in so many applications is not known.{{Sfnp|Russell|Norvig|2021|p=750}} The sudden success of deep learning in 2012–2015 did not occur because of some new discovery or theoretical breakthrough (deep neural networks and backpropagation had been described by many people, as far back as the 1950s){{Efn|Some forms of deep neural networks (without a specific learning algorithm) were described by: [[Warren S. McCulloch]] and [[Walter Pitts]] (1943);{{Sfnp|Russell|Norvig|2021|p=17}} [[Alan Turing]] (1948);{{Sfnp|Russell|Norvig|2021|p=785}} [[Karl Steinbuch]] and [[Roger David Joseph]] (1961).{{Sfnp|Schmidhuber|2022|loc=sect. 5}} Deep or recurrent networks that learned (or used gradient descent) were developed by: [[Frank Rosenblatt]] (1957);{{Sfnp|Russell|Norvig|2021|p=785}} [[Oliver Selfridge]] (1959);{{Sfnp|Schmidhuber|2022|loc=sect. 5}} [[Alexey Ivakhnenko]] and [[Valentin Lapa]] (1965);{{Sfnp|Schmidhuber|2022|loc=sect. 6}} [[Kaoru Nakano]] (1971);{{Sfnp|Schmidhuber|2022|loc=sect. 7}} [[Shun-Ichi Amari]] (1972);{{Sfnp|Schmidhuber|2022|loc=sect. 7}} [[John Joseph Hopfield]] (1982).{{Sfnp|Schmidhuber|2022|loc=sect. 7}} Precursors to backpropagation were developed by: [[Henry J. Kelley]] (1960);{{Sfnp|Russell|Norvig|2021|p=785}} [[Arthur E. Bryson]] (1962);{{Sfnp|Russell|Norvig|2021|p=785}} [[Stuart Dreyfus]] (1962);{{Sfnp|Russell|Norvig|2021|p=785}} [[Arthur E. Bryson]] and [[Yu-Chi Ho]] (1969).{{Sfnp|Russell|Norvig|2021|p=785}} Backpropagation was independently developed by: [[Seppo Linnainmaa]] (1970);{{Sfnp|Schmidhuber|2022|loc=sect. 8}} [[Paul Werbos]] (1974).{{Sfnp|Russell|Norvig|2021|p=785}}}} but because of two factors: the enormous increase in computer power (including the hundred-fold increase in speed from switching to [[GPU]]s) and the availability of vast amounts of training data, especially the giant [[List of datasets for machine-learning research|curated datasets]] used for benchmark testing, such as [[ImageNet]].{{Efn|[[Geoffrey Hinton]] said, of his work on neural networks in the 1990s, "our labeled datasets were thousands of times too small. [And] our computers were millions of times too slow."<ref>Quoted in {{Harvtxt|Christian|2020|p=22}}</ref>}}
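The following is a minimal sketch of this layered structure, written in Python with NumPy. It is illustrative only: the layer widths, random weights, and the <code>forward</code> helper are assumptions made for exposition and do not come from the cited sources. Each hidden layer transforms the previous layer's activations, which is the mechanism by which a trained network comes to extract progressively higher-level features.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Elementwise rectified linear unit, a common hidden-layer nonlinearity.
    return np.maximum(0.0, x)

# Illustrative layer widths (assumed, not from the sources): a
# 784-dimensional input (e.g. a flattened 28x28 image) mapped through
# two hidden layers to 10 output scores.
layer_sizes = [784, 128, 64, 10]

# Random parameters stand in for the weights a real network would learn,
# e.g. by backpropagation; they are placeholders, not trained values.
weights = [rng.normal(0.0, 0.01, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    # Each hidden layer transforms the previous layer's activations; in a
    # trained network, earlier layers tend to respond to simple features
    # (such as edges) and deeper layers to more abstract ones (such as
    # digits, letters, or faces).
    activation = x
    for w, b in zip(weights[:-1], biases[:-1]):
        activation = relu(activation @ w + b)
    # The final layer is left linear, yielding one score per output class.
    return activation @ weights[-1] + biases[-1]

x = rng.normal(size=784)   # stand-in for one flattened input image
print(forward(x).shape)    # (10,)
</syntaxhighlight>

With randomly initialised weights the output scores are meaningless; the point of the sketch is only the shape of the computation, in which the stack of weight matrices is the "depth" that gives deep learning its name.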