==History==
[[File:Mark I perceptron.jpeg|thumb|Mark I Perceptron machine, the first implementation of the perceptron algorithm. It was connected to a camera with 20×20 [[cadmium sulfide]] [[photocell]]s to make a 400-pixel image. The main visible feature is the sensory-to-association plugboard, which sets different combinations of input features. To the right are arrays of [[potentiometer]]s that implemented the adaptive weights.{{r|bishop}}{{rp|213}}|alt=]]
{{see also|History of artificial intelligence#Perceptrons}}
[[File:330-PSA-80-60 (USN 710739) (20897323365).jpg|thumb|The Mark I Perceptron, being adjusted by Charles Wightman (Mark I Perceptron project engineer).<ref>{{Cite book |last=Hecht-Nielsen |first=Robert |title=Neurocomputing |date=1991 |publisher=Addison-Wesley |isbn=978-0-201-09355-1 |edition=Reprint. with corrections |location=Reading (Mass.) Menlo Park (Calif.) New York [etc.] |at=p. 6, Figure 1.3 caption.}}</ref> Sensory units at left, association units in center, and control panel and response units at far right. The sensory-to-association plugboard is behind the closed panel to the right of the operator. The letter "C" on the front panel is a display of the current state of the sensory input.<ref>{{Cite journal |last=Block |first=H. D. |date=1962-01-01 |title=The Perceptron: A Model for Brain Functioning. I |url=https://link.aps.org/doi/10.1103/RevModPhys.34.123 |journal=Reviews of Modern Physics |language=en |volume=34 |issue=1 |pages=123–135 |doi=10.1103/RevModPhys.34.123 |bibcode=1962RvMP...34..123B |issn=0034-6861 |url-access=subscription}}</ref>]]

The artificial neuron was introduced in 1943 by [[Warren McCulloch]] and [[Walter Pitts]] in ''[[A Logical Calculus of the Ideas Immanent in Nervous Activity|A logical calculus of the ideas immanent in nervous activity]]''.<ref>{{cite journal |last1=McCulloch |first1=W |last2=Pitts |first2=W |title=A Logical Calculus of Ideas Immanent in Nervous Activity |journal=Bulletin of Mathematical Biophysics |date=1943 |volume=5 |issue=4 |pages=115–133 |doi=10.1007/BF02478259 |url=https://www.bibsonomy.org/bibtex/13e8e0d06f376f3eb95af89d5a2f15957/schaul |url-access=subscription}}</ref>

In 1957, [[Frank Rosenblatt]], then at the [[Cornell Aeronautical Laboratory]], simulated the perceptron on an [[IBM 704]].<ref name=":5">{{cite journal |last=Rosenblatt |first=Frank |year=1957 |title=The Perceptron – a perceiving and recognizing automaton |url=https://bpb-us-e2.wpmucdn.com/websites.umass.edu/dist/a/27637/files/2016/03/rosenblatt-1957.pdf |journal=Report 85-460-1 |publisher=Cornell Aeronautical Laboratory}}</ref><ref>{{Cite journal |last=Rosenblatt |first=Frank |date=March 1960 |title=Perceptron Simulation Experiments |url=https://ieeexplore.ieee.org/document/4066017 |journal=Proceedings of the IRE |volume=48 |issue=3 |pages=301–309 |doi=10.1109/JRPROC.1960.287598 |issn=0096-8390 |url-access=subscription}}</ref> Later, he obtained funding from the Information Systems Branch of the United States [[Office of Naval Research]] and the [[Rome Air Development Center]] to build a custom-made computer, the [[Mark I Perceptron]].
It was first publicly demonstrated on 23 June 1960.<ref name=":0" /> The machine was "part of a previously secret four-year NPIC [the US' [[National Photographic Interpretation Center]]] effort from 1963 through 1966 to develop this algorithm into a useful tool for photo-interpreters".<ref name=":1">{{Cite journal |last=O'Connor |first=Jack |date=2022-06-21 |title=Undercover Algorithm: A Secret Chapter in the Early History of Artificial Intelligence and Satellite Imagery |url=https://www.tandfonline.com/doi/full/10.1080/08850607.2022.2073542 |journal=International Journal of Intelligence and CounterIntelligence |language=en |pages=1–15 |doi=10.1080/08850607.2022.2073542 |issn=0885-0607 |s2cid=249946000 |url-access=subscription}}</ref>

Rosenblatt described the details of the perceptron in a 1958 paper.<ref>{{Cite journal |last=Rosenblatt |first=F. |date=1958 |title=The perceptron: A probabilistic model for information storage and organization in the brain. |url=http://dx.doi.org/10.1037/h0042519 |journal=Psychological Review |volume=65 |issue=6 |pages=386–408 |doi=10.1037/h0042519 |pmid=13602029 |issn=1939-1471 |url-access=subscription}}</ref> He organized the perceptron into three kinds of cells ("units"): AI, AII, and R, which stand for "[[Projection areas|projection]]", "association" and "response". He presented it at the first international symposium on AI, ''Mechanisation of Thought Processes'', which took place in November 1958.<ref>Frank Rosenblatt, "''Two Theorems of Statistical Separability in the Perceptron''", Symposium on the Mechanization of Thought, National Physical Laboratory, Teddington, UK, November 1958, vol. 1, H. M. Stationery Office, London, 1959.</ref>

Rosenblatt's project was funded under Contract Nonr-401(40) "Cognitive Systems Research Program", which lasted from 1959 to 1970,<ref>Rosenblatt, Frank, and CORNELL UNIV ITHACA NY. [https://apps.dtic.mil/sti/citations/trecms/AD0720416 ''Cognitive Systems Research Program'']. Technical report, Cornell University, 72, 1971.</ref> and Contract Nonr-2381(00) "Project PARA" ("PARA" means "Perceiving and Recognition Automata"), which lasted from 1957<ref name=":5" /> to 1963.<ref>Muerle, John Ludwig, and CORNELL AERONAUTICAL LAB INC BUFFALO NY. ''[https://apps.dtic.mil/sti/citations/tr/AD0633137 Project Para, Perceiving and Recognition Automata]''. Cornell Aeronautical Laboratory, Incorporated, 1963.</ref> In 1959, the Institute for Defense Analyses awarded his group a $10,000 contract. By September 1961, the ONR had awarded a further $153,000 worth of contracts, with $108,000 committed for 1962.<ref>{{Cite thesis |last=Penn |first=Jonathan |title=Inventing Intelligence: On the History of Complex Information Processing and Artificial Intelligence in the United States in the Mid-Twentieth Century |date=2021-01-11 |publisher=University of Cambridge |url=https://www.repository.cam.ac.uk/handle/1810/315976 |doi=10.17863/cam.63087 |language=en}}</ref>

The ONR research manager, Marvin Denicoff, stated that ONR, rather than [[DARPA|ARPA]], funded the Perceptron project, because the project was unlikely to produce technological results in the near or medium term. Funding from ARPA went up to the order of millions of dollars, while funding from ONR was on the order of 10,000 dollars. Meanwhile, the head of [[Information Processing Techniques Office|IPTO]] at ARPA, [[J.C.R. Licklider]],
was interested in "self-organizing", "adaptive", and other biologically inspired methods in the 1950s, but by the mid-1960s he was openly critical of these, including the perceptron. Instead, he strongly favored the logical AI approach of [[Herbert A. Simon|Simon]] and [[Allen Newell|Newell]].<ref>{{Cite journal |last=Guice |first=Jon |date=1998 |title=Controversy and the State: Lord ARPA and Intelligent Computing |url=https://www.jstor.org/stable/285752 |journal=Social Studies of Science |volume=28 |issue=1 |pages=103–138 |doi=10.1177/030631298028001004 |jstor=285752 |pmid=11619937 |issn=0306-3127 |url-access=subscription}}</ref>

=== Mark I Perceptron machine ===
{{Main article|Mark I Perceptron}}
[[File:Organization_of_a_biological_brain_and_a_perceptron.png|thumb|281x281px|Organization of a biological brain and a perceptron.]]
The perceptron was intended to be a machine, rather than a program, and while its first implementation was in software for the [[IBM 704]], it was subsequently implemented in custom-built hardware as the [[Mark I Perceptron]] under the project name "Project PARA",<ref name=":6" /> designed for [[image recognition]]. The machine is currently held by the [[National Museum of American History|Smithsonian National Museum of American History]].<ref>{{Cite web |title=Perceptron, Mark I |url=https://americanhistory.si.edu/collections/search/object/nmah_334414 |access-date=2023-10-30 |website=National Museum of American History |language=en}}</ref>

The Mark I Perceptron had three layers. One version was implemented as follows:
* An array of 400 [[photocell]]s arranged in a 20×20 grid, named "sensory units" (S-units), or "input retina". Each S-unit can connect to up to 40 A-units.
* A hidden layer of 512 perceptrons, named "association units" (A-units).
* An output layer of eight perceptrons, named "response units" (R-units).
Rosenblatt called this three-layered perceptron network the ''alpha-perceptron'', to distinguish it from other perceptron models he experimented with.<ref name=":0">{{Cite book |last=Nilsson |first=Nils J. |url=https://www.cambridge.org/core/books/quest-for-artificial-intelligence/32C727961B24223BBB1B3511F44F343E |title=The Quest for Artificial Intelligence |date=2009 |publisher=Cambridge University Press |isbn=978-0-521-11639-8 |location=Cambridge |chapter=4.2.1. Perceptrons}}</ref>

The S-units are connected to the A-units randomly (according to a table of random numbers) via a plugboard (see photo), to "eliminate any particular intentional bias in the perceptron". The connection weights are fixed, not learned. Rosenblatt was adamant about the random connections, as he believed the retina was randomly connected to the visual cortex, and he wanted his perceptron machine to resemble human visual perception.<ref>{{Cite book |url=https://direct.mit.edu/books/book/4886/Talking-NetsAn-Oral-History-of-Neural-Networks |title=Talking Nets: An Oral History of Neural Networks |date=2000 |publisher=The MIT Press |isbn=978-0-262-26715-1 |editor-last=Anderson |editor-first=James A. |language=en |doi=10.7551/mitpress/6626.003.0004 |editor-last2=Rosenfeld |editor-first2=Edward}}</ref> The A-units are connected to the R-units, with adjustable weights encoded in [[potentiometer]]s, and weight updates during learning were performed by electric motors.<ref name="bishop">{{cite book |last=Bishop |first=Christopher M. |title=Pattern Recognition and Machine Learning |publisher=Springer |year=2006 |isbn=0-387-31073-8}}</ref>{{rp|193}} The hardware details are given in an operators' manual.<ref name=":6">{{Cite book |last=Hay |first=John Cameron |url=https://apps.dtic.mil/sti/tr/pdf/AD0236965.pdf |title=Mark I perceptron operators' manual (Project PARA) |date=1960 |publisher=Cornell Aeronautical Laboratory |location=Buffalo |archive-url=https://web.archive.org/web/20231027213510/https://apps.dtic.mil/sti/tr/pdf/AD0236965.pdf |archive-date=2023-10-27}}</ref>
[[File:Mark I Perceptron, Figure 2 of operator's manual.png|thumb|Components of the Mark I Perceptron. From the operator's manual.<ref name=":6" />]]
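The division of labor described above (a fixed, random projection from the S-units to the A-units, followed by a single trainable layer of R-unit weights) can be sketched as a modern simulation. The following Python fragment is only an illustration, not the original hardware or Rosenblatt's exact procedure; the fan-in of 10 S-units per A-unit, the ±1 wiring, and the plain perceptron update rule are assumptions made for the example.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
N_S, N_A = 400, 512          # 20x20 "retina" of S-units, 512 A-units (as above)

# Fixed, random S-to-A wiring (the plugboard): each A-unit samples a few
# S-units with excitatory (+1) or inhibitory (-1) connections. These weights
# are never learned. The fan-in of 10 is an illustrative assumption.
projection = np.zeros((N_A, N_S))
for a in range(N_A):
    sources = rng.choice(N_S, size=10, replace=False)
    projection[a, sources] = rng.choice([-1.0, 1.0], size=10)

def a_units(image):
    """Binary A-unit activations for a flattened 20x20 binary image."""
    return (projection @ image.ravel() > 0).astype(float)

# A single R-unit with adjustable weights (the potentiometers), trained with
# the perceptron rule on the fixed A-unit features. Labels are +1 or -1.
w, b = np.zeros(N_A), 0.0

def train(images, labels, epochs=10):
    global w, b
    for _ in range(epochs):
        for x, y in zip(images, labels):
            phi = a_units(x)
            if y * (w @ phi + b) <= 0:   # misclassified: move the boundary
                w, b = w + y * phi, b + y

def predict(image):
    return 1 if w @ a_units(image) + b > 0 else -1
</syntaxhighlight>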
In a 1958 press conference organized by the US Navy, Rosenblatt made statements about the perceptron that caused a heated controversy among the fledgling [[Artificial intelligence|AI]] community; based on Rosenblatt's statements, ''[[The New York Times]]'' reported the perceptron to be "the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence."<ref name="Olazaran">{{cite journal |last=Olazaran |first=Mikel |year=1996 |title=A Sociological Study of the Official History of the Perceptrons Controversy |journal=Social Studies of Science |volume=26 |issue=3 |pages=611–659 |doi=10.1177/030631296026003005 |jstor=285702 |s2cid=16786738}}</ref>

From 1960 to 1964, the Photo Division of the [[Central Intelligence Agency]] studied the use of the Mark I Perceptron machine for recognizing militarily interesting silhouetted targets (such as planes and ships) in [[Aerial photography|aerial photos]].<ref>{{Cite web |title=Perception Concepts to Photo-Interpretation |url=https://www.cia.gov/readingroom/document/cia-rdp78b04770a002300030027-6 |access-date=2024-11-14 |website=www.cia.gov}}</ref><ref>{{Cite journal |last=Irwin |first=Julia A. |date=2024-09-11 |title=Artificial Worlds and Perceptronic Objects: The CIA's Mid-century Automatic Target Recognition |url=https://direct.mit.edu/grey/article/doi/10.1162/grey_a_00415/124337/Artificial-Worlds-and-Perceptronic-Objects-The-CIA |journal=Grey Room |language=en |issue=97 |pages=6–35 |doi=10.1162/grey_a_00415 |issn=1526-3819 |url-access=subscription}}</ref>

=== ''Principles of Neurodynamics'' (1962) ===
Rosenblatt described his experiments with many variants of the perceptron machine in the book ''Principles of Neurodynamics'' (1962), a published version of a 1961 report.<ref>''[[iarchive:DTIC AD0256582/|Principles of neurodynamics: Perceptrons and the theory of brain mechanisms]]'', by Frank Rosenblatt, Report Number VG-1196-G-8, Cornell Aeronautical Laboratory, published on 15 March 1961. The work reported in this volume has been carried out under Contract Nonr-2381(00) (Project PARA) at C.A.L. and Contract Nonr-401(40) at Cornell University.</ref> Among the variants are:
* "cross-coupling" (connections between units within the same layer), with possibly closed loops,
* "back-coupling" (connections from units in a later layer to units in a previous layer),
* four-layer perceptrons where the last two layers have adjustable weights (and thus a proper multilayer perceptron),
* incorporating time delays into perceptron units, to allow for processing sequential data,
* analyzing audio (instead of images).
The machine was shipped from Cornell to the Smithsonian in 1967, under a government transfer administered by the Office of Naval Research.<ref name=":1" />

=== ''Perceptrons'' (1969) ===
{{Main|Perceptrons (book)}}
Although the perceptron initially seemed promising, it was quickly proved that perceptrons could not be trained to recognise many classes of patterns. This caused the field of [[neural network (machine learning)|neural network]] research to stagnate for many years, before it was recognised that a [[feedforward neural network]] with two or more layers (also called a [[multilayer perceptron]]) had greater processing power than perceptrons with one layer (also called a [[Feedforward neural network#A threshold (e.g. activation function) added|single-layer perceptron]]).

Single-layer perceptrons are only capable of learning [[linearly separable]] patterns.<ref name="Sejnowski">{{Cite book |last=Sejnowski |first=Terrence J. |author-link=Terry Sejnowski |url=https://books.google.com/books?id=9xZxDwAAQBAJ |title=The Deep Learning Revolution |date=2018 |publisher=MIT Press |isbn=978-0-262-03803-4 |language=en |page=47}}</ref> For a classification task with some step activation function, a single node will have a single line dividing the data points forming the patterns. More nodes can create more dividing lines, but those lines must somehow be combined to form more complex classifications. A second layer of perceptrons, or even linear nodes, is sufficient to solve many otherwise non-separable problems.

In 1969, a famous book entitled ''[[Perceptrons (book)|Perceptrons]]'' by [[Marvin Minsky]] and [[Seymour Papert]] showed that it was impossible for these classes of network to learn an [[XOR]] function. It is often incorrectly believed that they also conjectured that a similar result would hold for a multi-layer perceptron network. However, this is not true, as both Minsky and Papert already knew that multi-layer perceptrons were capable of producing an XOR function. (See the page on ''[[Perceptrons (book)]]'' for more information.) Nevertheless, the often-miscited Minsky and Papert text caused a significant decline in interest and funding of neural network research. It took ten more years until neural network research experienced a resurgence in the 1980s.<ref name="Sejnowski"/>{{Verify source|date=October 2024|reason=Does the source support all of the preceding text and is "often incorrectly believed" true today or was it only true in the past?}} The text was reprinted in 1987 as ''Perceptrons: Expanded Edition'', in which some errors in the original text are shown and corrected.
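The limitation can be made concrete with a small numerical sketch (not taken from Minsky and Papert's book; the hand-chosen two-layer weights below are an assumption for illustration): no single threshold unit reproduces the [[XOR]] truth table, whereas two layers of threshold units do.

<syntaxhighlight lang="python">
import numpy as np

# XOR truth table: not linearly separable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

def step(z):
    return (z > 0).astype(int)

# Single threshold unit trained with the perceptron rule: the updates keep
# cycling, because no line in the plane separates {01, 10} from {00, 11}.
w, b = np.zeros(2), 0.0
for _ in range(100):
    for xi, yi in zip(X, y):
        err = yi - step(w @ xi + b)
        w, b = w + err * xi, b + err
print("single layer:", step(X @ w + b), "target:", y)   # never matches the target

# Two layers of threshold units suffice: the hidden units compute OR and AND,
# and the output unit fires for OR-but-not-AND, i.e. XOR.
W1 = np.array([[1.0, 1.0],    # OR unit
               [1.0, 1.0]])   # AND unit
b1 = np.array([-0.5, -1.5])
w2 = np.array([1.0, -1.0])
b2 = -0.5
hidden = step(X @ W1.T + b1)
print("two layers: ", step(hidden @ w2 + b2))            # matches the target
</syntaxhighlight>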
=== Subsequent work ===
Rosenblatt continued working on perceptrons despite diminishing funding. The last attempt was Tobermory, built between 1961 and 1967 for speech recognition.<ref>Rosenblatt, Frank (1962). "''[https://web.archive.org/web/20231230210135/https://apps.dtic.mil/sti/tr/pdf/AD0420696.pdf#page=163 A Description of the Tobermory Perceptron]''." Cognitive Research Program. Report No. 4. Collected Technical Papers, Vol. 2. Edited by Frank Rosenblatt. Ithaca, NY: Cornell University.</ref> It occupied an entire room.<ref name=":7">Nagy, George. 1963. ''[https://web.archive.org/web/20231230204827/https://apps.dtic.mil/sti/trecms/pdf/AD0607459.pdf System and circuit designs for the Tobermory perceptron]''. Technical report number 5, Cognitive Systems Research Program, Cornell University, Ithaca, New York.</ref> It had four layers with 12,000 weights implemented by toroidal [[magnetic core]]s. By the time of its completion, simulation on digital computers had become faster than purpose-built perceptron machines.<ref>Nagy, George. "Neural networks-then and now." ''IEEE Transactions on Neural Networks'' 2.2 (1991): 316–318.</ref> Rosenblatt died in a boating accident in 1971.
[[File:Isometric view of Tobermory Phase I.png|thumb|Isometric view of Tobermory Phase I.<ref name=":7" />]]

The [[kernel perceptron]] algorithm had already been introduced in 1964 by Aizerman et al.<ref>{{cite journal |last1=Aizerman |first1=M. A. |last2=Braverman |first2=E. M. |last3=Rozonoer |first3=L. I. |year=1964 |title=Theoretical foundations of the potential function method in pattern recognition learning |journal=Automation and Remote Control |volume=25 |pages=821–837}}</ref> Margin bound guarantees for the perceptron algorithm in the general non-separable case were first given by [[Yoav Freund|Freund]] and [[Robert Schapire|Schapire]] (1998),<ref name="largemargin">{{Cite journal |doi=10.1023/A:1007662407062 |year=1999 |title=Large margin classification using the perceptron algorithm |last1=Freund |first1=Y. |author-link1=Yoav Freund |journal=[[Machine Learning (journal)|Machine Learning]] |volume=37 |issue=3 |pages=277–296 |last2=Schapire |first2=R. E. |s2cid=5885617 |author-link2=Robert Schapire |url=http://cseweb.ucsd.edu/~yfreund/papers/LargeMarginsUsingPerceptron.pdf |doi-access=free}}</ref> and more recently by [[Mehryar Mohri|Mohri]] and Rostamizadeh (2013), who extended previous results and gave new, more favorable L1 bounds.<ref>{{cite arXiv |last1=Mohri |first1=Mehryar |last2=Rostamizadeh |first2=Afshin |title=Perceptron Mistake Bounds |eprint=1305.0208 |year=2013 |class=cs.LG}}</ref><ref>[https://mitpress.mit.edu/books/foundations-machine-learning-second-edition ''Foundations of Machine Learning''], MIT Press (Chapter 8).</ref>

The perceptron is a simplified model of a biological [[neuron]]. While the complexity of [[biological neuron model]]s is often required to fully understand neural behavior, research suggests a perceptron-like linear model can produce some behavior seen in real neurons.<ref>{{cite journal |last1=Cash |first1=Sydney |first2=Rafael |last2=Yuste |title=Linear Summation of Excitatory Inputs by CA1 Pyramidal Neurons |journal=[[Neuron (journal)|Neuron]] |volume=22 |issue=2 |year=1999 |pages=383–394 |doi=10.1016/S0896-6273(00)81098-3 |pmid=10069343 |doi-access=free}}</ref> The solution spaces of decision boundaries for all binary functions and the learning behaviors of the perceptron have been studied by Liou et al.<ref>{{cite book |last1=Liou |first1=D.-R. |last2=Liou |first2=J.-W. |last3=Liou |first3=C.-Y. |title=Learning Behaviors of Perceptron |publisher=iConcept Press |year=2013 |isbn=978-1-477554-73-9}}</ref>