{{short description|Mathematical function conceived as a crude model}}
[[File:Artificial neuron structure.svg|alt=Artificial neuron structure|thumb|306x306px|Artificial neuron structure]]
An '''artificial neuron''' is a [[Function (mathematics)|mathematical function]] conceived as a [[Mathematical model|model]] of a [[neuron|biological neuron]] in a [[neural network]]. The artificial neuron is the elementary unit of an ''[[Neural network (machine learning)|artificial neural network]]''.<ref>{{Cite conference |author=Rami A. Alzahrani |author2=Alice C. Parker |title=Neuromorphic Circuits With Neural Modulation Enhancing the Information Content of Neural Signaling |book-title=Proceedings of International Conference on Neuromorphic Systems 2020 |location=New York |publisher=Association for Computing Machinery |language=en |article-number=19 |isbn=978-1-4503-8851-1 |doi=10.1145/3407197.3407204|s2cid=220794387|doi-access=free}}</ref>

The design of the artificial neuron was inspired by biological [[neural circuit]]ry. Its inputs are analogous to [[excitatory postsynaptic potential]]s and [[inhibitory postsynaptic potential]]s at neural [[dendrite]]s, or {{vanchor|activation}}. Its weights are analogous to [[synaptic weight]]s, and its output is analogous to a neuron's [[action potential]], which is transmitted along its [[axon]]. Usually, each input is separately [[weighting|weighted]], and the sum is often added to a term known as a ''bias'' (loosely corresponding to the [[threshold potential]]), before being passed through a [[Nonlinear system|nonlinear function]] known as an [[activation function]].

Depending on the task, these functions could have a [[Sigmoid function|sigmoid]] shape (e.g. for [[binary classification]]), but they may also take the form of other nonlinear functions, [[piecewise]] linear functions, or [[#Step function|step functions]]. They are also often [[Monotonic function|monotonically increasing]], [[Continuous function|continuous]], [[Differentiable function|differentiable]], and [[Bounded function|bounded]]. Non-monotonic, unbounded, and oscillating activation functions with multiple zeros that outperform sigmoidal and [[Rectifier (neural networks)|ReLU-like]] activation functions on many tasks have also recently been explored.

The threshold function has inspired the construction of [[logic gate]]s referred to as threshold logic, which can be used to build [[logic circuit]]s resembling brain processing. For example, new devices such as [[memristor]]s have been extensively used to develop such logic.<ref>{{Cite journal|last1=Maan|first1=A. K.|last2=Jayadevi|first2=D. A.|last3=James|first3=A. P.|date=1 January 2016|title=A Survey of Memristive Threshold Logic Circuits|journal=IEEE Transactions on Neural Networks and Learning Systems|volume=PP|issue=99|pages=1734–1746|doi=10.1109/TNNLS.2016.2547842|pmid=27164608|issn=2162-237X|arxiv=1604.07121|bibcode=2016arXiv160407121M|s2cid=1798273}}</ref> The artificial neuron activation function should not be confused with a linear system's [[transfer function]].

An artificial neuron may be referred to as a '''semi-linear unit''', '''Nv neuron''', '''binary neuron''', '''linear threshold function''', or '''McCulloch–Pitts''' ('''MCP''') '''neuron''', depending on the structure used. Simple artificial neurons, such as the McCulloch–Pitts model, are sometimes described as "caricature models", since they are intended to reflect one or more neurophysiological observations, but without regard to realism.<ref>{{cite book |author=F. C. Hoppensteadt and E. M. Izhikevich |title=Weakly connected neural networks |publisher=Springer |year=1997 |isbn=978-0-387-94948-2 |page=4}}</ref> Artificial neurons can also refer to [[artificial cell]]s in [[#Physical artificial cells|neuromorphic engineering]] that are similar to natural physical neurons.

== Basic structure ==
For a given artificial neuron <math>k</math>, let there be <math>m+1</math> inputs with signals <math>x_0</math> through <math>x_m</math> and weights <math>w_{k0}</math> through <math>w_{km}</math>. Usually, the input <math>x_0</math> is assigned the value +1, which makes it a bias input with <math>w_{k0} = b_k</math>. This leaves only <math>m</math> actual inputs to the neuron: <math>x_1</math> to <math>x_m</math>.

The output of the <math>k</math>-th neuron is:
: <math>y_k = \varphi \left(\sum_{j=0}^m w_{kj} x_j \right)</math>,
where <math>\varphi</math> (phi) is the activation function.

[[File:artificial neuron.png]]

The output is analogous to the [[axon]] of a biological neuron, and its value propagates to the input of the next layer, through a synapse. It may also exit the system, possibly as part of an output [[vector (mathematics and physics)|vector]]. It has no learning process as such: its activation function weights are calculated, and its threshold value is predetermined.
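The computation above can be illustrated with a short Python sketch. The function name, the example weights, and the choice of the logistic function for <math>\varphi</math> are illustrative assumptions only, not part of any particular model.

<syntaxhighlight lang="python">
import math

def neuron_output(weights, inputs):
    """Compute y_k = phi(sum_j w_kj * x_j) for a single artificial neuron.

    weights[0] is the bias weight b_k; the corresponding input x_0 is
    fixed to +1, as described above.  The logistic function stands in
    for the activation function phi.
    """
    x = [1.0] + list(inputs)                       # prepend the bias input x_0 = +1
    u = sum(w * xi for w, xi in zip(weights, x))   # weighted sum over j = 0..m
    return 1.0 / (1.0 + math.exp(-u))              # phi(u): logistic sigmoid

# Example: bias weight -1.5 and two inputs weighted 1.0 each; the output
# is largest when both inputs are 1 (an AND-like response).
print(neuron_output([-1.5, 1.0, 1.0], [1, 1]))
</syntaxhighlight>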
== McCulloch–Pitts (MCP) neuron ==
{{Main|Perceptron}}
An MCP neuron is a kind of restricted artificial neuron which operates in discrete time-steps. Each neuron has zero or more inputs, written as <math>x_1, ..., x_n</math>, and one output, written as <math>y</math>. Each input can be either ''excitatory'' or ''inhibitory''. The output can be either ''quiet'' or ''firing''. An MCP neuron also has a threshold <math>b \in \{0, 1, 2, ...\}</math>.

In an MCP neural network, all the neurons operate in synchronous discrete time-steps of <math>t = 0, 1, 2, 3, ...</math>. At time <math>t+1</math>, the output of the neuron is <math>y(t+1) = 1</math> if the number of firing excitatory inputs is at least equal to the threshold, and no inhibitory inputs are firing; <math>y(t+1)=0</math> otherwise.

Each output can be the input to an arbitrary number of neurons, including itself (i.e., self-loops are possible). However, an output cannot connect more than once with a single neuron. Self-loops do not cause contradictions, since the network operates in synchronous discrete time-steps. As a simple example, consider a single neuron with threshold 0 and a single inhibitory self-loop. Its output would oscillate between 0 and 1 at every step, acting as a "clock".

Any [[Finite-state machine|finite state machine]] can be simulated by an MCP neural network.<ref name=":0">{{Cite book |last=Minsky |first=Marvin Lee |title=Computation: Finite and Infinite Machines |date=1967-01-01 |publisher=Prentice Hall |isbn=978-0-13-165563-8 |language=English}}</ref> Furnished with an infinite tape, MCP neural networks can simulate any [[Turing machine]].<ref>{{Cite journal |last1=McCulloch |first1=Warren S. |last2=Pitts |first2=Walter |date=1943-12-01 |title=A logical calculus of the ideas immanent in nervous activity |url=https://doi.org/10.1007/BF02478259 |journal=The Bulletin of Mathematical Biophysics |language=en |volume=5 |issue=4 |pages=115–133 |doi=10.1007/BF02478259 |issn=1522-9602|url-access=subscription }}</ref>
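The update rule can be illustrated with a minimal Python sketch of a synchronous MCP network; the data layout and function name below are illustrative assumptions. Running it on a single neuron with threshold 0 and an inhibitory self-loop reproduces the "clock" behaviour described above.

<syntaxhighlight lang="python">
def mcp_step(neurons, state):
    """One synchronous time-step of an MCP network.

    neurons: list of (threshold, excitatory_inputs, inhibitory_inputs),
             where the two input lists hold indices of source neurons.
    state:   list of 0/1 outputs at time t.
    Returns the list of outputs at time t + 1.
    """
    new_state = []
    for threshold, excitatory, inhibitory in neurons:
        inhibited = any(state[i] for i in inhibitory)   # one firing inhibitory input vetoes the neuron
        excitation = sum(state[i] for i in excitatory)  # number of firing excitatory inputs
        new_state.append(0 if inhibited else int(excitation >= threshold))
    return new_state

# The "clock" example: one neuron (index 0) with threshold 0 and an inhibitory self-loop.
network = [(0, [], [0])]
state = [0]
for t in range(6):
    print(t, state)   # the output alternates 0, 1, 0, 1, ...
    state = mcp_step(network, state)
</syntaxhighlight>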
==Biological models==
{{main|Biological neuron model}}
[[File:Neuron3.svg|thumb|right|400px|Neuron and myelinated axon, with signal flow from inputs at dendrites to outputs at axon terminals]]
Artificial neurons are designed to mimic aspects of their biological counterparts. However, a significant performance gap exists between biological and artificial neural networks. In particular, single biological neurons in the human brain with an oscillating activation function, capable of learning the [[Exclusive or|XOR function]], have been discovered.<ref>{{Cite journal|last1=Gidon|first1=Albert|last2=Zolnik|first2=Timothy Adam|last3=Fidzinski|first3=Pawel|last4=Bolduan|first4=Felix|last5=Papoutsi|first5=Athanasia|last6=Poirazi|first6=Panayiota|author-link6=Panayiota Poirazi|last7=Holtkamp|first7=Martin|last8=Vida|first8=Imre|last9=Larkum|first9=Matthew Evan|date=2020-01-03|title=Dendritic action potentials and computation in human layer 2/3 cortical neurons|journal=Science|volume=367|issue=6473|pages=83–87|doi=10.1126/science.aax6239|pmid=31896716|bibcode=2020Sci...367...83G|s2cid=209676937|doi-access=free}}</ref>

* [[Dendrites]] – in biological neurons, dendrites act as the input vector. These dendrites allow the cell to receive signals from a large (>1000) number of neighboring neurons. As in the above mathematical treatment, each dendrite is able to perform "multiplication" by that dendrite's "weight value." The multiplication is accomplished by increasing or decreasing the ratio of synaptic neurotransmitters to signal chemicals introduced into the dendrite in response to the synaptic neurotransmitter. A negative multiplication effect can be achieved by transmitting signal inhibitors (i.e. oppositely charged ions) along the dendrite in response to the reception of synaptic neurotransmitters.
* [[Soma (biology)|Soma]] – in biological neurons, the soma acts as the summation function, seen in the above mathematical description. As positive and negative signals (exciting and inhibiting, respectively) arrive in the soma from the dendrites, the positive and negative ions are effectively added in summation, by simple virtue of being mixed together in the solution inside the cell's body.
* [[Axon]] – the axon gets its signal from the summation behavior which occurs inside the soma. The opening to the axon essentially samples the electrical potential of the solution inside the soma. Once the soma reaches a certain potential, the axon transmits an all-or-nothing signal pulse down its length. In this regard, the axon plays the role of the output connection that links an artificial neuron to other artificial neurons.

Unlike most artificial neurons, however, biological neurons fire in discrete pulses. Each time the electrical potential inside the soma reaches a certain threshold, a pulse is transmitted down the axon. This pulsing can be translated into continuous values. The rate (activations per second, etc.) at which an axon fires converts directly into the rate at which neighboring cells get signal ions introduced into them.
The faster a biological neuron fires, the faster nearby neurons accumulate electrical potential (or lose electrical potential, depending on the "weighting" of the dendrite that connects to the neuron that fired). It is this conversion that allows computer scientists and mathematicians to simulate biological neural networks using artificial neurons which can output distinct values (often from −1 to 1).

===Encoding===
Research has shown that [[unary coding]] is used in the neural circuits responsible for [[birdsong]] production.<ref>{{cite book|editor1-last=Squire|editor1-first=L.|editor2-last=Albright|editor2-first=T.|editor3-last=Bloom|editor3-first=F.|editor4-last=Gage|editor4-first=F.|editor5-last=Spitzer|editor5-first=N.|title=Neural network models of birdsong production, learning, and coding|date=October 2007|publisher=Elsevier|location=New Encyclopedia of Neuroscience|url=https://clm.utexas.edu/fietelab/Papers/birdsong_review_topost.pdf|access-date=12 April 2015|archive-url=https://web.archive.org/web/20150412190625/https://clm.utexas.edu/fietelab/Papers/birdsong_review_topost.pdf|archive-date=2015-04-12}}</ref><ref>{{cite journal | last1 = Moore | first1 = J.M. | display-authors = etal | year = 2011| title = Motor pathway convergence predicts syllable repertoire size in oscine birds | journal = Proc. Natl. Acad. Sci. USA | volume = 108 | issue = 39| pages = 16440–16445 | doi = 10.1073/pnas.1102077108 | pmid = 21918109 | pmc = 3182746 | bibcode = 2011PNAS..10816440M | doi-access = free }}</ref> The use of unary coding in biological networks is presumably due to the inherent simplicity of the coding. Another contributing factor could be that unary coding provides a certain degree of error correction.<ref>{{cite arXiv|eprint=1411.7406|title=Error Correction Capacity of Unary Coding|first=Pushpa Sree|last=Potluri|date=26 November 2014|class=cs.IT}}</ref>

==Physical artificial cells==
There is research and development into physical artificial neurons – organic and inorganic. For example, some artificial neurons can receive<ref name="knowablemagazineorganic">{{cite news |last1=Kleiner |first1=Kurt |title=Making computer chips act more like brain cells |url=https://knowablemagazine.org/article/technology/2022/making-computer-chips-act-more-like-brain-cells |access-date=23 September 2022 |journal=Knowable Magazine |date=25 August 2022 |language=en |doi=10.1146/knowable-082422-1}}</ref><ref>{{cite journal |last1=Keene |first1=Scott T.
|last2=Lubrano |first2=Claudia |last3=Kazemzadeh |first3=Setareh |last4=Melianas |first4=Armantas |last5=Tuchman |first5=Yaakov |last6=Polino |first6=Giuseppina |last7=Scognamiglio |first7=Paola |last8=Cinà |first8=Lucio |last9=Salleo |first9=Alberto |last10=van de Burgt |first10=Yoeri |last11=Santoro |first11=Francesca |title=A biohybrid synapse with neurotransmitter-mediated plasticity |journal=Nature Materials |date=September 2020 |volume=19 |issue=9 |pages=969–973 |doi=10.1038/s41563-020-0703-y |pmid=32541935 |bibcode=2020NatMa..19..969K |s2cid=219691307 |language=en |issn=1476-4660|url=https://research.tue.nl/nl/publications/ad3d2f99-23e6-4072-934d-2b058d800e42 }} * University press release: {{cite news |title=Researchers develop artificial synapse that works with living cells |url=https://medicalxpress.com/news/2020-06-artificial-synapse-cells.html |access-date=23 September 2022 |work=Stanford University via medicalxpress.com |language=en}}</ref> and release [[dopamine]] ([[neurotransmitter|chemical signals]] rather than electrical signals) and communicate with natural rat [[soft robot|muscle]] and [[brain cell]]s, with potential for use in [[brain–computer interface|BCIs]]/[[Wetware computer#Future applications|prosthetics]].<ref>{{cite news |title=Artificial neuron swaps dopamine with rat brain cells like a real one |url=https://www.newscientist.com/article/2332554-artificial-neuron-swaps-dopamine-with-rat-brain-cells-like-a-real-one/ |access-date=16 September 2022 |work=New Scientist}}</ref><ref>{{cite journal |last1=Wang |first1=Ting |last2=Wang |first2=Ming |last3=Wang |first3=Jianwu |last4=Yang |first4=Le |last5=Ren |first5=Xueyang |last6=Song |first6=Gang |last7=Chen |first7=Shisheng |last8=Yuan |first8=Yuehui |last9=Liu |first9=Ruiqing |last10=Pan |first10=Liang |last11=Li |first11=Zheng |last12=Leow |first12=Wan Ru |last13=Luo |first13=Yifei |last14=Ji |first14=Shaobo |last15=Cui |first15=Zequn |last16=He |first16=Ke |last17=Zhang |first17=Feilong |last18=Lv |first18=Fengting |last19=Tian |first19=Yuanyuan |last20=Cai |first20=Kaiyu |last21=Yang |first21=Bowen |last22=Niu |first22=Jingyi |last23=Zou |first23=Haochen |last24=Liu |first24=Songrui |last25=Xu |first25=Guoliang |last26=Fan |first26=Xing |last27=Hu |first27=Benhui |last28=Loh |first28=Xian Jun |last29=Wang |first29=Lianhui |last30=Chen |first30=Xiaodong |title=A chemically mediated artificial neuron |journal=Nature Electronics |date=8 August 2022 |volume=5 |issue=9 |pages=586–595 |doi=10.1038/s41928-022-00803-0 |hdl=10356/163240 |s2cid=251464760 |url=https://www.researchgate.net/publication/362561968 |language=en |issn=2520-1131|url-access=subscription|hdl-access=free }}</ref> Low-power biocompatible [[memristor]]s may enable construction of artificial neurons which function at voltages of biological [[action potential]]s and could be used to directly process [[Biosensor|biosensing signals]], for [[neuromorphic computing]] and/or [[brain–computer interface|direct communication with biological neurons]].<ref>{{cite news |title=Scientists create tiny devices that work like the human brain |url=https://www.independent.co.uk/life-style/gadgets-and-tech/news/brain-computing-memory-artificial-synapse-memristor-a9473671.html |access-date=May 17, 2020 |work=The Independent |date=April 20, 2020 |language=en |archive-date=April 24, 2020 |archive-url=https://web.archive.org/web/20200424110621/https://www.independent.co.uk/life-style/gadgets-and-tech/news/brain-computing-memory-artificial-synapse-memristor-a9473671.html 
|url-status=live }}</ref><ref>{{cite news |title=Researchers unveil electronics that mimic the human brain in efficient learning |url=https://phys.org/news/2020-04-unveil-electronics-mimic-human-brain.html |access-date=May 17, 2020 |work=phys.org |language=en |archive-date=May 28, 2020 |archive-url=https://web.archive.org/web/20200528112833/https://phys.org/news/2020-04-unveil-electronics-mimic-human-brain.html |url-status=live }}</ref><ref>{{cite journal |last1=Fu |first1=Tianda |last2=Liu |first2=Xiaomeng |last3=Gao |first3=Hongyan |last4=Ward |first4=Joy E. |last5=Liu |first5=Xiaorong |last6=Yin |first6=Bing |last7=Wang |first7=Zhongrui |last8=Zhuo |first8=Ye |last9=Walker |first9=David J. F. |last10=Joshua Yang |first10=J. |last11=Chen |first11=Jianhan |last12=Lovley |first12=Derek R. |last13=Yao |first13=Jun |title=Bioinspired bio-voltage memristors |journal=Nature Communications |date=April 20, 2020 |volume=11 |issue=1 |page=1861 |doi=10.1038/s41467-020-15759-y |pmid=32313096 |pmc=7171104 |bibcode=2020NatCo..11.1861F |doi-access=free }}</ref> Organic neuromorphic circuits made out of [[polymer]]s, coated with an ion-rich gel to enable a material to carry an electric charge like [[neuron|real neurons]], have been built into a robot, enabling it to learn sensorimotorically within the real world, rather than via simulations or virtually.<ref name="sciame">{{cite news |last1=Bolakhe |first1=Saugat |title=Lego Robot with an Organic 'Brain' Learns to Navigate a Maze |url=https://www.scientificamerican.com/article/lego-robot-with-an-organic-brain-learns-to-navigate-a-maze/ |access-date=1 February 2022 |work=Scientific American |language=en}}</ref><ref>{{cite journal |last1=Krauhausen |first1=Imke |last2=Koutsouras |first2=Dimitrios A. |last3=Melianas |first3=Armantas |last4=Keene |first4=Scott T. |last5=Lieberth |first5=Katharina |last6=Ledanseur |first6=Hadrien |last7=Sheelamanthula |first7=Rajendar |last8=Giovannitti |first8=Alexander |last9=Torricelli |first9=Fabrizio |last10=Mcculloch |first10=Iain |last11=Blom |first11=Paul W. M. |last12=Salleo |first12=Alberto |last13=Burgt |first13=Yoeri van de |last14=Gkoupidenis |first14=Paschalis |title=Organic neuromorphic electronics for sensorimotor integration and learning in robotics |journal=Science Advances |date=December 2021 |volume=7 |issue=50 |pages=eabl5068 |doi=10.1126/sciadv.abl5068 |pmid=34890232 |pmc=8664264 |bibcode=2021SciA....7.5068K |hdl=10754/673986 |s2cid=245046482 |language=EN}}</ref> Moreover, artificial spiking neurons made of soft matter (polymers) can operate in biologically relevant environments and enable the synergetic communication between the artificial and biological domains.<ref>{{cite journal |last1=Sarkar |first1=Tanmoy |last2=Lieberth |first2=Katharina |last3=Pavlou |first3=Aristea |last4=Frank |first4=Thomas |last5=Mailaender |first5=Volker |last6=McCulloch |first6=Iain |last7=Blom |first7=Paul W. M. 
|last8=Torriccelli |first8=Fabrizio |last9=Gkoupidenis |first9=Paschalis |title=An organic artificial spiking neuron for in situ neuromorphic sensing and biointerfacing |journal=Nature Electronics |date=7 November 2022 |volume=5 |issue=11 |pages=774–783 |doi=10.1038/s41928-022-00859-y |s2cid=253413801 |language=en |issn=2520-1131|doi-access=free |hdl=10754/686016 |hdl-access=free }}</ref><ref>{{cite journal |title=Artificial neurons emulate biological counterparts to enable synergetic operation |journal=Nature Electronics |date=10 November 2022 |volume=5 |issue=11 |pages=721–722 |doi=10.1038/s41928-022-00862-3 |s2cid=253469402 |url=https://www.nature.com/articles/s41928-022-00862-3 |language=en |issn=2520-1131|url-access=subscription }}</ref>

==History==
The first artificial neuron was the Threshold Logic Unit (TLU), or Linear Threshold Unit,<ref name="Anthony2001">{{cite book|author=Martin Anthony|title=Discrete Mathematics of Neural Networks: Selected Topics|url=https://books.google.com/books?id=qOy4yLBqhFcC&pg=PA3|date=January 2001|publisher=SIAM|isbn=978-0-89871-480-7|pages=3–}}</ref> first proposed by [[Warren McCulloch]] and [[Walter Pitts]] in 1943 in ''[[A Logical Calculus of the Ideas Immanent in Nervous Activity|A logical calculus of the ideas immanent in nervous activity]]''. The model was specifically targeted as a computational model of the "nerve net" in the brain.<ref name="Aggarwal2014">{{cite book|author=Charu C. Aggarwal|title=Data Classification: Algorithms and Applications|url=https://books.google.com/books?id=gJhBBAAAQBAJ&pg=PA209|date=25 July 2014|publisher=CRC Press|isbn=978-1-4665-8674-1|pages=209–}}</ref> As an activation function, it employed a threshold, equivalent to using the [[Heaviside step function]]. Initially, only a simple model was considered, with binary inputs and outputs, some restrictions on the possible weights, and a more flexible threshold value. From the beginning it was noticed that any [[Boolean function]] could be implemented by networks of such devices, which is easily seen from the fact that one can implement the AND and OR functions and combine them in the [[disjunctive normal form|disjunctive]] or the [[conjunctive normal form]]. Researchers also soon realized that cyclic networks, with [[feedback]] through neurons, could define dynamical systems with memory, but most of the research concentrated (and still does) on strictly [[feed-forward network]]s because they are easier to analyze.

One important and pioneering artificial neural network that used the linear threshold function was the [[perceptron]], developed by [[Frank Rosenblatt]]. This model already considered more flexible weight values in the neurons, and was used in machines with adaptive capabilities. The representation of the threshold values as a bias term was introduced by [[Bernard Widrow]] in 1960 – see [[ADALINE]].

In the late 1980s, when research on neural networks regained strength, neurons with more continuous shapes started to be considered. The possibility of differentiating the activation function allows the direct use of [[gradient descent]] and other optimization algorithms for the adjustment of the weights. Neural networks also started to be used as a general [[function approximation]] model. The best-known training algorithm, called [[backpropagation]], has been rediscovered several times, but its first development goes back to the work of [[Paul Werbos]].<ref>[[Paul Werbos]], Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. PhD thesis, Harvard University, 1974</ref><ref>{{cite journal | last=Werbos | first=P.J. |author-link=Paul Werbos| title=Backpropagation through time: what it does and how to do it | journal=Proceedings of the IEEE | volume=78 | issue=10 | year=1990 | issn=0018-9219 | doi=10.1109/5.58337 | pages=1550–1560| s2cid=18470994 | url=https://zenodo.org/record/1262035 }}</ref>

==Types of activation function==
{{Main|Activation function}}
The activation function of a neuron is chosen to have a number of properties which either enhance or simplify the network containing the neuron. Crucially, for instance, any [[multilayer perceptron]] using a linear activation function has an equivalent single-layer network; a ''non''-linear function is therefore necessary to gain the advantages of a multi-layer network.{{Citation needed|date=May 2018}}

Below, <math>u</math> refers in all cases to the weighted sum of all the inputs to the neuron, i.e. for <math>n</math> inputs,
: <math>u = \sum_{i=1}^n w_i x_i</math>
where <math>w</math> is a vector of synaptic weights and <math>x</math> is a vector of inputs.

===Step function===
{{Main|Step function}}
The output <math>y</math> of this activation function is binary, depending on whether the input meets a specified threshold, <math>\theta</math> (theta). The "signal" is sent, i.e. the output is set to 1, if the activation meets or exceeds the threshold.
: <math>y = \begin{cases} 1 & \text{if }u \ge \theta \\ 0 & \text{if }u < \theta \end{cases}</math>
This function is used in [[perceptron]]s, and appears in many other models. It performs a division of the [[Vector space|space]] of inputs by a [[hyperplane]]. It is especially useful in the last layer of a network, intended for example to perform binary classification of the inputs.
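As an illustration, a unit with this step activation can be written in a few lines of Python; the weights and threshold below are arbitrary example values, chosen so that the unit computes a logical AND of two binary inputs.

<syntaxhighlight lang="python">
def step_neuron(weights, inputs, theta):
    """Binary threshold unit: output 1 if the weighted sum u reaches theta, else 0."""
    u = sum(w * x for w, x in zip(weights, inputs))  # u = sum_i w_i x_i
    return 1 if u >= theta else 0

# Example: with unit weights and a threshold of 1.5, the unit fires only
# when both binary inputs are 1, i.e. it computes AND.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, step_neuron([1.0, 1.0], [a, b], theta=1.5))
</syntaxhighlight>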
===Linear combination===
{{Main|Linear combination}}
In this case, the output unit is simply the weighted sum of its inputs, plus a bias term. A number of such linear neurons perform a linear transformation of the input vector. This is usually more useful in the early layers of a network. A number of analysis tools exist based on linear models, such as [[harmonic analysis]], and they can all be used in neural networks with this linear neuron. The bias term allows us to make [[homogeneous coordinates|affine transformations]] to the data.

===Sigmoid===
{{Main|Sigmoid function}}
A fairly simple nonlinear function, a [[sigmoid function]] such as the logistic function has an easily calculated derivative, which can be important when calculating the weight updates in the network. It thus makes the network more easily manipulable mathematically, and was attractive to early computer scientists who needed to minimize the computational load of their simulations. It was previously commonly seen in [[multilayer perceptron]]s. However, recent work has shown sigmoid neurons to be less effective than [[Rectifier (neural networks)|rectified linear]] neurons. The reason is that the gradients computed by the [[backpropagation]] algorithm tend to diminish towards zero as activations propagate through layers of sigmoidal neurons, making it difficult to optimize neural networks using multiple layers of sigmoidal neurons.<!-- This part of the article needs to be expanded -->
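The following Python sketch (with illustrative values only) shows the logistic sigmoid and its derivative. Because the derivative never exceeds 0.25, the gradient is attenuated by roughly one such factor per sigmoid layer during backpropagation, which is the diminishing effect described above.

<syntaxhighlight lang="python">
import math

def sigmoid(u):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-u))

def sigmoid_derivative(u):
    """Derivative of the logistic sigmoid: sigma(u) * (1 - sigma(u)), at most 0.25."""
    s = sigmoid(u)
    return s * (1.0 - s)

# Even in the best case (u = 0, where the derivative peaks at 0.25),
# chaining several sigmoid layers shrinks the gradient geometrically.
for layers in (1, 5, 10):
    print(layers, sigmoid_derivative(0.0) ** layers)
</syntaxhighlight>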
===Rectifier===
{{Main|Rectifier (neural networks)}}
In the context of [[artificial neural network]]s, the '''rectifier''' or '''ReLU (Rectified Linear Unit)''' is an [[activation function]] defined as the positive part of its argument:
: <math>f(x) = x^+ = \max(0, x),</math>
where <math>x</math> is the input to a neuron. This is also known as a [[ramp function]] and is analogous to [[half-wave rectification]] in electrical engineering. This [[activation function]] was first introduced to a dynamical network by Hahnloser et al. in a 2000 paper in ''[[Nature (journal)|Nature]]''<ref name="Hahnloser2000">{{cite journal | last1=Hahnloser | first1=Richard H. R. | last2=Sarpeshkar | first2=Rahul | last3=Mahowald | first3=Misha A. | last4=Douglas | first4=Rodney J. | last5=Seung | first5=H. Sebastian | title=Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit | journal=Nature | volume=405 | issue=6789 | year=2000 | issn=0028-0836 | doi=10.1038/35016072 | pmid=10879535 | pages=947–951| bibcode=2000Natur.405..947H | s2cid=4399014 }}</ref> with strong [[biological]] motivations and mathematical justifications.<ref name="Hahnloser2001">{{cite conference |author=R Hahnloser |author2=H.S. Seung |year=2001 |title=Permitted and Forbidden Sets in Symmetric Threshold-Linear Networks|conference=NIPS 2001}}</ref> In 2011 it was demonstrated for the first time to enable better training of deeper networks,<ref name="glorot2011">{{cite conference |author1=Xavier Glorot |author2=Antoine Bordes |author3=[[Yoshua Bengio]] |year=2011 |title=Deep sparse rectifier neural networks |conference=AISTATS |url=http://jmlr.org/proceedings/papers/v15/glorot11a/glorot11a.pdf}}</ref> compared to the activation functions widely used prior to 2011, i.e., the [[Logistic function|logistic sigmoid]] (which is inspired by [[probability theory]]; see [[logistic regression]]) and its more practical<ref>{{cite encyclopedia |author=[[Yann LeCun]] |author2=[[Leon Bottou]] |author3=Genevieve B. Orr |author4=[[Klaus-Robert Müller]] |year=1998 |url=http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf |title=Efficient BackProp |editor=G. Orr |editor2=K. Müller |encyclopedia=Neural Networks: Tricks of the Trade |publisher=Springer}}</ref> counterpart, the [[hyperbolic tangent]].

A commonly used variant of the ReLU activation function is the Leaky ReLU, which allows a small, positive gradient when the unit is not active:
: <math>f(x) = \begin{cases} x & \text{if } x > 0, \\ ax & \text{otherwise}, \end{cases}</math>
where <math>x</math> is the input to the neuron and <math>a</math> is a small positive constant (set to 0.01 in the original paper).<ref name="maas2014">Andrew L. Maas, Awni Y. Hannun, Andrew Y. Ng (2014). [https://ai.stanford.edu/~amaas/papers/relu_hybrid_icml2013_final.pdf Rectifier Nonlinearities Improve Neural Network Acoustic Models].</ref>

==Pseudocode algorithm==
The following is a simple [[pseudocode]] implementation{{Citation needed|date=September 2024}} of a single Threshold Logic Unit (TLU) which takes [[Boolean data type|Boolean]] inputs (true or false) and returns a single Boolean output when activated. An [[object oriented|object-oriented]] model is used. No method of training is defined, since several exist. If a purely functional model were used, the class TLU below would be replaced with a function TLU with input parameters threshold, weights, and inputs that returned a Boolean value.

 '''class''' TLU '''defined as:'''
  '''data member''' threshold ''':''' number
  '''data member''' weights ''': list of''' numbers '''of size''' X
  '''function member''' fire(inputs ''': list of''' booleans '''of size''' X) ''':''' boolean '''defined as:'''
   '''variable''' T ''':''' number
   T '''←''' 0
   '''for each''' i '''in''' 1 '''to''' X '''do'''
    '''if''' inputs(i) '''is''' true '''then'''
     T '''←''' T + weights(i)
    '''end if'''
   '''end for each'''
   '''if''' T > threshold '''then'''
    '''return''' true
   '''else'''
    '''return''' false
   '''end if'''
  '''end function'''
 '''end class'''
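The purely functional variant mentioned above might look as follows in Python; this is an illustrative sketch only, and the function name and example values are not part of the pseudocode.

<syntaxhighlight lang="python">
def tlu_fire(threshold, weights, inputs):
    """Functional Threshold Logic Unit: Boolean inputs, Boolean output.

    Mirrors the pseudocode above: sum the weights of the inputs that are
    true, and fire only if the total exceeds the threshold.
    """
    total = sum(w for w, active in zip(weights, inputs) if active)
    return total > threshold

# Example: two inputs with unit weights and threshold 1 act as a logical AND.
print(tlu_fire(1, [1, 1], [True, True]))   # True
print(tlu_fire(1, [1, 1], [True, False]))  # False
</syntaxhighlight>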
== See also ==
* [[Binding neuron]]
* [[Connectionism]]

== References ==
{{reflist}}

== Further reading ==
{{refbegin}}
* {{cite journal | last1=McCulloch | first1=Warren S. |author-link=Warren McCulloch| last2=Pitts | first2=Walter |author-link2=Walter Pitts| title=A logical calculus of the ideas immanent in nervous activity | journal=Bulletin of Mathematical Biophysics | volume=5 | issue=4 | year=1943 | doi=10.1007/bf02478259 | pages=115–133}}
* {{cite journal | last1=Samardak | first1=A. | last2=Nogaret | first2=A. | last3=Janson | first3=N. B. | last4=Balanov | first4=A. G. | last5=Farrer | first5=I. | last6=Ritchie | first6=D. A. | title=Noise-Controlled Signal Transmission in a Multithread Semiconductor Neuron | journal=Physical Review Letters | volume=102 | issue=22 | date=2009-06-05 | doi=10.1103/physrevlett.102.226802 | page=226802| pmid=19658886 | bibcode=2009PhRvL.102v6802S | s2cid=11211062 | url=https://dspace.lboro.ac.uk/2134/12736 }}
{{refend}}

== External links ==
* [https://www.youtube.com/watch?v=NhTZnnJJP64 {{sic|nolink=y|Artifical}} neuron mimicks function of human cells]
* [http://www.mind.ilstu.edu/curriculum/modOverview.php?modGUI=212 McCulloch-Pitts Neurons (Overview)]

[[Category:Artificial neural networks]]
[[Category:American inventions]]
[[Category:Bioinspiration]]