
==Criticism==

===Training===
A common criticism of neural networks, particularly in robotics, is that they require too many training samples for real-world operation.<ref>{{cite journal |last1=Parisi |first1=German I. |last2=Kemker |first2=Ronald |last3=Part |first3=Jose L. |last4=Kanan |first4=Christopher |last5=Wermter |first5=Stefan |date=1 May 2019 |title=Continual lifelong learning with neural networks: A review |journal=Neural Networks |volume=113 |pages=54–71 |doi=10.1016/j.neunet.2019.01.012 |pmid=30780045 |issn=0893-6080|doi-access=free |arxiv=1802.07569 }}</ref>
Any learning machine needs sufficient representative examples in order to capture the underlying structure that allows it to generalize to new cases. Potential solutions include randomly shuffling training examples, using a numerical optimization algorithm that does not take overly large steps when changing the network connections following an example, grouping examples in so-called mini-batches, or introducing a recursive least squares algorithm for [[cerebellar model articulation controller|CMAC]].<ref name="Qin1"/>
Dean Pomerleau uses a neural network to train a robotic vehicle to drive on multiple types of roads (single lane, multi-lane, dirt, etc.). A large amount of his research is devoted to extrapolating multiple training scenarios from a single training experience and to preserving past training diversity so that the system does not become overtrained (if, for example, it is presented with a series of right turns, it should not learn to always turn right).<ref>Dean Pomerleau, "Knowledge-based Training of Artificial Neural Networks for Autonomous Robot Driving"</ref>
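
Three of the remedies listed above (random shuffling, modest update steps and mini-batches) can be illustrated with a minimal sketch. It is not taken from any of the cited works: the linear model, the function name <code>minibatch_sgd</code>, the learning rate and the toy data are all assumptions chosen to keep the example short.

```python
import random

def minibatch_sgd(xs, ys, lr=0.05, batch_size=4, epochs=200, seed=0):
    """Fit y ~ w*x + b by mini-batch stochastic gradient descent (illustrative only)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        # Reshuffle every epoch so the model never sees a fixed ordering,
        # such as a long run of similar examples.
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Average the squared-error gradient over the mini-batch.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            # A small learning rate keeps each change to the parameters modest.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Hypothetical noiseless toy data generated from y = 2x + 1.
xs = [i / 10 for i in range(-10, 11)]
ys = [2.0 * x + 1.0 for x in xs]
w, b = minibatch_sgd(xs, ys)
print(w, b)
```

On this toy data the fitted <code>w</code> and <code>b</code> should end up close to 2 and 1. Real networks replace the hand-written gradient with backpropagation, but the shuffling and batching logic has the same shape.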

===Theory===

A central claim{{citation needed|date=January 2023}} of ANNs is that they embody new and powerful general principles for processing information. These principles are ill-defined. It is often claimed{{by whom|date=January 2023}} that they are [[Emergent properties|emergent]] from the network itself. This allows simple statistical association (the basic function of artificial neural networks) to be described as learning or recognition. In 1997, [[Alexander Dewdney]], a former ''[[Scientific American]]'' columnist, commented that as a result, artificial neural networks have a "something-for-nothing quality, one that imparts a peculiar aura of laziness and a distinct lack of curiosity about just how good these computing systems are. No human hand (or mind) intervenes; solutions are found as if by magic; and no one, it seems, has learned anything".<ref>{{cite book|url={{google books |plainurl=y |id=KcHaAAAAMAAJ|page=82}}|title=Yes, we have no neutrons: an eye-opening tour through the twists and turns of bad science|last=Dewdney|first=A. K.|date=1 April 1997|publisher=Wiley|isbn=978-0-471-10806-1|page=82}}</ref> One response to Dewdney is that neural networks have been successfully used to handle many complex and diverse tasks, ranging from autonomously flying aircraft<ref>[http://www.nasa.gov/centers/dryden/news/NewsReleases/2003/03-49.html NASA – Dryden Flight Research Center – News Room: News Releases: NASA NEURAL NETWORK PROJECT PASSES MILESTONE] {{Webarchive|url=https://web.archive.org/web/20100402065100/http://www.nasa.gov/centers/dryden/news/NewsReleases/2003/03-49.html |date=2 April 2010 }}. Nasa.gov. Retrieved on 20 November 2013.</ref> to detecting credit card fraud to mastering the game of [[Go (game)|Go]].

Technology writer Roger Bridgman commented:

{{blockquote|Neural networks, for instance, are in the dock not only because they have been hyped to high heaven, (what hasn't?) but also because you could create a successful net without understanding how it worked: the bunch of numbers that captures its behaviour would in all probability be "an opaque, unreadable table...valueless as a scientific resource".
In spite of his emphatic declaration that science is not technology, Dewdney seems here to pillory neural nets as bad science when most of those devising them are just trying to be good engineers. An unreadable table that a useful machine could read would still be well worth having.<ref>{{Cite web |url=http://members.fortunecity.com/templarseries/popper.html |title=Roger Bridgman's defence of neural networks |access-date=12 July 2010 |archive-url=https://web.archive.org/web/20120319163352/http://members.fortunecity.com/templarseries/popper.html |archive-date=19 March 2012 }}</ref>
}}

Although it is true that analyzing what has been learned by an artificial neural network is difficult, it is much easier to do so than to analyze what has been learned by a biological neural network. Moreover, recent emphasis on the [[Explainable artificial intelligence|explainability]] of AI has contributed towards the development of methods, notably those based on [[Attention (machine learning)|attention]] mechanisms, for visualizing and explaining learned neural networks. Furthermore, researchers involved in exploring learning algorithms for neural networks are gradually uncovering generic principles that allow a learning machine to be successful. For example, Bengio and LeCun (2007) wrote an article regarding local vs non-local learning, as well as shallow vs deep architecture.<ref>{{cite web|url=http://www.iro.umontreal.ca/~lisa/publications2/index.php/publications/show/4|title=Scaling Learning Algorithms towards {AI} – LISA – Publications – Aigaion 2.0|website=iro.umontreal.ca |url-status=dead }}</ref>

Biological brains use both shallow and deep circuits, as reported in studies of brain anatomy,<ref name="VanEssen1991">D. J. Felleman and D. C. Van Essen, "[https://archive.today/20150120022056/http://cercor.oxfordjournals.org/content/1/1/1.1.full.pdf+html Distributed hierarchical processing in the primate cerebral cortex]," ''Cerebral Cortex'', 1, pp. 1–47, 1991.</ref> displaying a wide variety of invariance. Weng<ref name="Weng2012">J. Weng, "[https://www.amazon.com/Natural-Artificial-Intelligence-Introduction-Computational/dp/0985875720 Natural and Artificial Intelligence: Introduction to Computational Brain-Mind] {{Webarchive|url=https://web.archive.org/web/20240519082645/https://www.amazon.com/Natural-Artificial-Intelligence-Introduction-Computational/dp/0985875720 |date=19 May 2024 }}," BMI Press, {{ISBN|978-0-9858757-2-5}}, 2012.</ref> argued that the brain self-wires largely according to signal statistics and that a serial cascade therefore cannot capture all major statistical dependencies.

===Hardware===
Large and effective neural networks require considerable computing resources.<ref name=":0">{{cite journal|last1=Edwards|first1=Chris|s2cid=11026540|title=Growing pains for deep learning|journal=Communications of the ACM|date=25 June 2015|volume=58|issue=7|pages=14–16|doi=10.1145/2771283}}</ref> While the brain has hardware tailored to the task of processing signals through a [[Graph (discrete mathematics)|graph]] of neurons, simulating even a simplified neuron on a [[von Neumann architecture]] may consume vast amounts of [[Random-access memory|memory]] and storage. Furthermore, the designer often needs to transmit signals through many of these connections and their associated neurons, which requires enormous [[Central processing unit|CPU]] power and time.{{citation needed|date=October 2024}}
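
As a rough illustration of the memory cost, the parameter count of a fully connected network can be computed directly from its layer widths. The following sketch makes several assumptions that do not come from the cited sources: the layer widths are hypothetical, each parameter is stored as a 32-bit float, and only the parameters are counted (activations, gradients and optimizer state add further memory).

```python
def dense_parameter_count(layer_sizes):
    """Count weights and biases in a fully connected feed-forward network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def memory_megabytes(num_parameters, bytes_per_parameter=4):
    """Memory for the parameters alone, in units of 10**6 bytes (assumes 32-bit floats)."""
    return num_parameters * bytes_per_parameter / 1e6

layers = [784, 4096, 4096, 10]  # hypothetical layer widths
params = dense_parameter_count(layers)
print(params, memory_megabytes(params))
```

Because each pair of adjacent layers contributes the product of their widths, doubling the width of the hidden layers roughly quadruples the number of connections between them, which is one reason memory and compute demands grow quickly with network size.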

Some argue that the resurgence of neural networks in the twenty-first century is largely attributable to advances in hardware: from 1991 to 2015, computing power, especially as delivered by [[General-purpose computing on graphics processing units|GPGPUs]] (on [[Graphics processing unit|GPUs]]), has increased around a million-fold, making the standard backpropagation algorithm feasible for training networks that are several layers deeper than before.<ref name="SCHIDHUB4"/> The use of accelerators such as [[Field-programmable gate array|FPGA]]s and GPUs can reduce training times from months to days.{{r|:0}}<ref>{{Cite web |title=The Bitter Lesson |url=http://www.incompleteideas.net/IncIdeas/BitterLesson.html |access-date=7 August 2024 |website=incompleteideas.net}}</ref>

[[Neuromorphic engineering]] or a [[physical neural network]] addresses the hardware difficulty directly, by constructing non-von-Neumann chips to directly implement neural networks in circuitry. Another type of chip optimized for neural network processing is called a [[Tensor Processing Unit]], or TPU.<ref>{{cite news |url=https://www.wired.com/2016/05/google-tpu-custom-chips/ |author=Cade Metz |newspaper=Wired |date=18 May 2016 |title=Google Built Its Very Own Chips to Power Its AI Bots |access-date=5 March 2017 |archive-date=13 January 2018 |archive-url=https://web.archive.org/web/20180113150305/https://www.wired.com/2016/05/google-tpu-custom-chips/ |url-status=live }}</ref>

===Practical counterexamples===
Analyzing what has been learned by an ANN is much easier than analyzing what has been learned by a biological neural network. Furthermore, researchers exploring learning algorithms for neural networks are gradually uncovering general principles that allow a learning machine to be successful. Examples include the contrasts between local and non-local learning and between shallow and deep architectures.<ref>{{Cite web|title=Scaling Learning Algorithms towards AI|url=http://yann.lecun.com/exdb/publis/pdf/bengio-lecun-07.pdf|access-date=6 July 2022|archive-date=12 August 2022|archive-url=https://web.archive.org/web/20220812081157/http://yann.lecun.com/exdb/publis/pdf/bengio-lecun-07.pdf|url-status=live}}</ref>

===Hybrid approaches===
Advocates of [[Hybrid neural network|hybrid]] models (combining neural networks and symbolic approaches) say that such a mixture can better capture the mechanisms of the human mind.<ref>{{Cite journal| last1=Tahmasebi| last2=Hezarkhani| title=A hybrid neural networks-fuzzy logic-genetic algorithm for grade estimation| year=2012| journal=Computers & Geosciences| pages=18–27 |volume=42| doi=10.1016/j.cageo.2012.02.004| pmid=25540468| pmc=4268588| bibcode=2012CG.....42...18T}}</ref><ref>Sun and Bookman, 1990</ref>

=== Dataset bias ===
Neural networks depend on the quality of the data they are trained on; low-quality data with imbalanced representation can lead the model to learn and perpetuate societal biases.<ref name=":010">{{Cite journal |last1=Norori |first1=Natalia |last2=Hu |first2=Qiyang |last3=Aellen |first3=Florence Marcelle |last4=Faraci |first4=Francesca Dalia |last5=Tzovara |first5=Athina |date=October 2021 |title=Addressing bias in big data and AI for health care: A call for open science |journal=Patterns |language=en |volume=2 |issue=10 |page=100347 |doi=10.1016/j.patter.2021.100347|doi-access=free |pmid=34693373 |pmc=8515002 }}</ref><ref name=":17">{{Cite journal |last=Carina |first=Wang |date=27 October 2022 |title=Failing at Face Value: The Effect of Biased Facial Recognition Technology on Racial Discrimination in Criminal Justice |journal=Scientific and Social Research |volume=4 |issue=10 |pages=29–40 |doi=10.26689/ssr.v4i10.4402 |issn=2661-4332|doi-access=free }}</ref> These inherited biases become especially critical when ANNs are integrated into real-world scenarios where the training data may be imbalanced due to the scarcity of data for a specific race, gender or other attribute.<ref name=":010" /> This imbalance can leave the model with an inadequate representation and understanding of underrepresented groups, leading to discriminatory outcomes that exacerbate societal inequalities, especially in applications like [[Facial recognition system|facial recognition]], hiring processes, and [[law enforcement]].<ref name=":17" /><ref name=":22">{{Cite journal |last=Chang |first=Xinyu |date=13 September 2023 |title=Gender Bias in Hiring: An Analysis of the Impact of Amazon's Recruiting Algorithm |url=https://aemps.ewapublishing.org/article.html?pk=e5b93601b03d453c855d54d3153875ba |journal=Advances in Economics, Management and Political Sciences |volume=23 |issue=1 |pages=134–140 |doi=10.54254/2754-1169/23/20230367 |issn=2754-1169 |doi-access=free |access-date=9 December 2023 |archive-date=9 December 2023 |archive-url=https://web.archive.org/web/20231209135207/https://aemps.ewapublishing.org/article.html?pk=e5b93601b03d453c855d54d3153875ba |url-status=live }}</ref> For example, in 2018, [[Amazon (company)|Amazon]] had to scrap a recruiting tool because the model favored men over women for jobs in software engineering due to the higher number of male workers in the field.<ref name=":22" /> The program would penalize any resume with the word "woman" or the name of any women's college. However, the use of [[synthetic data]] can help reduce dataset bias and increase representation in datasets.<ref>{{Cite book |last1=Kortylewski |first1=Adam |last2=Egger |first2=Bernhard |last3=Schneider |first3=Andreas |last4=Gerig |first4=Thomas |last5=Morel-Forster |first5=Andreas |last6=Vetter |first6=Thomas |chapter=Analyzing and Reducing the Damage of Dataset Bias to Face Recognition with Synthetic Data |date=June 2019 |title=2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) |pages=2261–2268 |publisher=IEEE |doi=10.1109/cvprw.2019.00279 |isbn=978-1-7281-2506-0 |s2cid=198183828 |url=https://edoc.unibas.ch/75257/1/20200128164027_5e3055eb775f1.pdf |access-date=30 December 2023 |archive-date=19 May 2024 |archive-url=https://web.archive.org/web/20240519082642/https://edoc.unibas.ch/75257/1/20200128164027_5e3055eb775f1.pdf |url-status=live }}</ref>
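
One simple way to surface this kind of imbalance is to report, for each group, both its share of the dataset and the model's accuracy on it. The sketch below is only illustrative: the function name <code>group_report</code>, the group labels and the toy predictions are hypothetical and are not drawn from any of the cited studies.

```python
from collections import Counter, defaultdict

def group_report(groups, y_true, y_pred):
    """Return each group's share of the data and the accuracy achieved on it."""
    counts = Counter(groups)
    correct = defaultdict(int)
    for g, t, p in zip(groups, y_true, y_pred):
        correct[g] += int(t == p)
    total = len(groups)
    return {g: {"share": counts[g] / total, "accuracy": correct[g] / counts[g]}
            for g in counts}

# Hypothetical toy data: group "B" is underrepresented (2 of 10 examples).
groups = ["A"] * 8 + ["B"] * 2
y_true = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 1, 0, 1, 0, 0, 0, 0]
report = group_report(groups, y_true, y_pred)
print(report)
```

A large gap between groups in either column is a signal that the training data or the model may warrant closer inspection; it does not by itself establish the cause of the disparity.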