
==Theories==

===Data sources===

Experiments on theoretical issues in conditioning have mostly been done on [[Vertebrate|vertebrates]], especially rats and pigeons. However, conditioning has also been studied in [[Invertebrate|invertebrates]], and very important data on the neural basis of conditioning has come from experiments on the sea slug, ''[[Aplysia]]''.<ref name="Shettleworth_2010" /> Most relevant experiments have used the classical conditioning procedure, although [[Operant conditioning|instrumental (operant) conditioning]] experiments have also been used, and the strength of classical conditioning is often measured through its operant effects, as in ''conditioned suppression'' (see Phenomena section above) and [[Shaping (psychology)|autoshaping]].

===Stimulus-substitution theory===
{{Further|Counterconditioning}}
According to Pavlov, conditioning does not involve the acquisition of any new behavior, but rather the tendency to respond in old ways to new stimuli. Thus, he theorized that the CS merely substitutes for the US in evoking the [[reflex]] response. This explanation is called the stimulus-substitution theory of conditioning.<ref name="Chance_2008" />{{rp|84}} A critical problem with the stimulus-substitution theory is that the CR and UR are not always the same. Pavlov himself observed that a dog's saliva produced as a CR differed in composition from that produced as a UR.<ref name="Pavlov"/> The CR is sometimes even the opposite of the UR. For example: the unconditional response to an electric shock is an increase in heart rate, whereas a CS that has been paired with the electric shock elicits a decrease in heart rate. (However, it has been proposed{{by whom|date=January 2019}} that only when the UR does not involve the [[central nervous system]] are the CR and the UR opposites.)

===Rescorla–Wagner model===
{{main|Rescorla–Wagner model}}

The Rescorla–Wagner (R–W) model<ref name="Bouton_2016"/><ref>{{cite book |vauthors=Rescorla RA, Wagner AR |date=1972 |chapter=A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. |title=Classical Conditioning II: Current Theory and Research |chapter-url=https://archive.org/details/classicalconditi0000unse |chapter-url-access=registration |veditors=Black AH, Prokasy WF |pages=[https://archive.org/details/classicalconditi0000unse/page/64 64–99] |location=New York |publisher=Appleton-Century}}</ref> is a relatively simple yet powerful model of conditioning. The model predicts a number of important phenomena, but it also fails in important ways, thus leading to a number of modifications and alternative models. However, because much of the theoretical research on conditioning in the past 40 years has been instigated by this model or reactions to it, the R–W model deserves a brief description here.<ref name="M&E">{{cite book |vauthors=Miller R, Escobar M |chapter=Learning: Laws and Models of Basic Conditioning |title=Stevens' Handbook of Experimental Psychology |edition=3rd |volume=3: Learning, Motivation & Emotion |veditors=Pashler H, Gallistel R |pages=47–102 |location=New York |publisher=Wiley |isbn=978-0-471-65016-4 |date=2004-02-05}}</ref><ref name="Chance_2008" />{{rp|85}}

The Rescorla–Wagner model holds that there is a limit to the amount of conditioning that can occur in the pairing of two stimuli. One determinant of this limit is the nature of the US. For example, pairing a bell with a juicy steak is more likely to produce salivation than pairing the bell with a piece of dry bread, and dry bread is likely to work better than a piece of cardboard. A key idea behind the R–W model is that a CS signals or predicts the US. One might say that before conditioning, the subject is surprised by the US. However, after conditioning, the subject is no longer surprised, because the CS predicts the coming of the US. (The model can be described mathematically; words like predict, surprise, and expect are used only to help explain it.) Here the workings of the model are illustrated with brief accounts of acquisition, extinction, and blocking. The model also predicts a number of other phenomena; see the main article on the model.

====Equation====
<math display="block">\Delta V=\alpha\beta (\lambda - \Sigma V)</math>

This is the Rescorla–Wagner equation. It specifies the amount of learning that will occur on a single pairing of a conditioned stimulus (CS) with an unconditioned stimulus (US). The equation is applied repeatedly, trial by trial, to predict the course of learning over many such pairings.

In this model, the degree of learning is measured by how well the CS predicts the US, which is given by the "associative strength" of the CS. In the equation, V represents the current associative strength of the CS, and ∆V is the change in this strength that happens on a given trial. ΣV is the sum of the strengths of all stimuli present in the situation. λ is the maximum associative strength that a given US will support; its value is usually set to 1 on trials when the US is present, and 0 when the US is absent. α and β are constants: α is related to the salience of the CS, and β to the speed of learning for a given US. How the equation predicts various experimental results is explained in the following sections. For further details, see the main article on the model.<ref name="Chance_2008" />{{rp|85–89}}
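
The single-trial computation can be sketched in a few lines of Python. This is only an illustration of the equation above; the function name <code>rw_delta</code> and the parameter values are assumptions chosen for the example, not values from the literature.

```python
def rw_delta(v_present, alpha, beta, lam):
    """Return ΔV = αβ(λ − ΣV) for a single trial.

    v_present: associative strengths of all stimuli present on the trial.
    lam: λ, usually 1 when the US occurs and 0 when it is absent.
    """
    return alpha * beta * (lam - sum(v_present))


# Illustrative (assumed) parameter values.
first_trial = rw_delta([0.0], alpha=0.5, beta=1.0, lam=1.0)   # untrained CS
later_trial = rw_delta([0.75], alpha=0.5, beta=1.0, lam=1.0)  # mostly trained CS
print(first_trial, later_trial)
```

With these values the change is larger on the first trial than on the later one, because the gap between λ and ΣV has narrowed.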

====R–W model: acquisition====
The R–W model measures conditioning by assigning an "associative strength" to the CS and other local stimuli. Before a CS is conditioned it has an associative strength of zero. Pairing the CS and the US causes a gradual increase in the associative strength of the CS. This increase is determined by the nature of the US (e.g. its intensity).<ref name="Chance_2008" />{{rp|85–89}} The amount of learning that happens during any single CS-US pairing depends on the difference between the total associative strengths of CS and other stimuli present in the situation (ΣV in the equation), and a maximum set by the US (λ in the equation). On the first pairing of the CS and US, this difference is large and the associative strength of the CS takes a big step up. As CS-US pairings accumulate, the US becomes more predictable, and the increase in associative strength on each trial becomes smaller and smaller. Finally, the difference between the associative strength of the CS (plus any that may accrue to other stimuli) and the maximum strength reaches zero. That is, the US is fully predicted, the associative strength of the CS stops growing, and conditioning is complete.
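
The acquisition process described above can be simulated by applying the equation once per trial. The following is a minimal sketch that assumes a single CS, no other stimuli with associative strength, and arbitrary illustrative values for α and β.

```python
def acquisition(trials, alpha=0.3, beta=1.0, lam=1.0):
    """Simulate repeated CS-US pairings for a single CS under the R-W rule."""
    v = 0.0                      # an unconditioned CS starts at zero
    strengths, steps = [], []
    for _ in range(trials):
        delta = alpha * beta * (lam - v)   # ΔV for this trial
        v += delta
        strengths.append(v)
        steps.append(delta)
    return strengths, steps


strengths, steps = acquisition(20)
for trial in (1, 2, 5, 10, 20):
    print(trial, round(strengths[trial - 1], 4), round(steps[trial - 1], 4))
```

In this sketch the step size shrinks on every trial while the associative strength approaches λ, which corresponds to the negatively accelerated learning curve described in the text.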

====R–W model: extinction====
[[File:Rescorla–Wagner model in Learning.svg|thumb|Comparison of associative strengths predicted by the R–W model during learning]]
The associative process described by the R–W model also accounts for extinction (see "procedures" above). The extinction procedure starts with a positive associative strength of the CS, which means that the CS predicts that the US will occur. On an extinction trial the US fails to occur after the CS. As a result of this "surprising" outcome, the associative strength of the CS takes a step down. Extinction is complete when the strength of the CS reaches zero; no US is predicted, and no US occurs. However, if that same CS is presented without the US but accompanied by a well-established conditioned inhibitor (CI), that is, a stimulus that predicts the absence of a US (in R–W terms, a stimulus with a negative associative strength), then the R–W model predicts that the CS will not undergo extinction (its V will not decrease in size).
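
Both predictions can be checked with a short simulation. The sketch below assumes that the CS starts at V = 1, that the conditioned inhibitor starts at V = −1, and that both stimuli share the same α; these values are illustrative assumptions, not part of the model.

```python
def extinction_trial(v, present, alpha=0.3, beta=1.0):
    """One trial without the US (λ = 0); only stimuli in `present` are updated."""
    error = 0.0 - sum(v[s] for s in present)
    for s in present:
        v[s] += alpha * beta * error
    return v


# CS presented alone: its strength steps down towards zero.
alone = {"CS": 1.0}
for _ in range(30):
    extinction_trial(alone, ["CS"])

# CS presented with a conditioned inhibitor (CI) whose strength is -1:
# ΣV is already 0, so nothing is "surprising" and nothing changes.
protected = {"CS": 1.0, "CI": -1.0}
for _ in range(30):
    extinction_trial(protected, ["CS", "CI"])

print(alone["CS"], protected["CS"])
```

In the second case the summed prediction already matches the absence of the US, so the error term is zero on every trial and the CS keeps its strength.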

====R–W model: blocking====
{{main|Blocking effect}}
The most important and novel contribution of the R–W model is its assumption that the conditioning of a CS depends not just on that CS alone, and its relationship to the US, but also on all other stimuli present in the conditioning situation. In particular, the model states that the US is predicted by the sum of the associative strengths of all stimuli present in the conditioning situation. Learning is controlled by the difference between this total associative strength and the strength supported by the US. When this sum of strengths reaches a maximum set by the US, conditioning ends as just described.<ref name="Chance_2008" />{{rp|85–89}}

The R–W explanation of the blocking phenomenon illustrates one consequence of the assumption just stated. In blocking (see "phenomena" above), CS1 is paired with a US until conditioning is complete. Then on additional conditioning trials a second stimulus (CS2) appears together with CS1, and both are followed by the US. Finally CS2 is tested and shown to produce no response because learning about CS2 was "blocked" by the initial learning about CS1. The R–W model explains this by saying that after the initial conditioning, CS1 fully predicts the US. Since there is no difference between what is predicted and what happens, no new learning happens on the additional trials with CS1+CS2, hence CS2 later yields no response.
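
The blocking account can be sketched as a two-phase simulation. The example assumes equal α for both stimuli and uses a control group with no pretraining for comparison; the function names, trial counts, and parameter values are illustrative assumptions.

```python
def rw_trial(v, present, lam, alpha=0.3, beta=1.0):
    """Apply one R-W update to every stimulus present on the trial."""
    error = lam - sum(v[s] for s in present)
    for s in present:
        v[s] += alpha * beta * error


def run_blocking(pretrain_trials, compound_trials=20):
    v = {"CS1": 0.0, "CS2": 0.0}
    for _ in range(pretrain_trials):        # phase 1: CS1 -> US
        rw_trial(v, ["CS1"], lam=1.0)
    for _ in range(compound_trials):        # phase 2: CS1 + CS2 -> US
        rw_trial(v, ["CS1", "CS2"], lam=1.0)
    return v


blocked = run_blocking(pretrain_trials=50)
control = run_blocking(pretrain_trials=0)
print(blocked["CS2"], control["CS2"])
```

After pretraining, CS1 already accounts for nearly all of λ, so CS2 gains almost nothing; in the control run, CS1 and CS2 share the available associative strength.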

===Theoretical issues and alternatives to the Rescorla–Wagner model===
One of the main reasons for the importance of the R–W model is that it is relatively simple and makes clear predictions. Tests of these predictions have led to a number of important new findings and a considerably increased understanding of conditioning. Some new information has supported the theory, but much has not, and it is generally agreed that the theory is, at best, too simple. However, no single model seems to account for all the phenomena that experiments have produced.<ref name="Bouton_2016"/><ref>{{cite journal |vauthors=Miller RR, Barnet RC, Grahame NJ |title=Assessment of the Rescorla-Wagner model |journal=Psychological Bulletin |volume=117 |issue=3 |pages=363–86 |date=May 1995 |pmid=7777644 |doi=10.1037/0033-2909.117.3.363}}</ref> Following are brief summaries of some related theoretical issues.<ref name="M&E"/>

====Content of learning====
The R–W model reduces conditioning to the association of a CS and US, and measures this with a single number, the associative strength of the CS. A number of experimental findings indicate that more is learned than this. Among these are two phenomena described earlier in this article:
* Latent inhibition: If a subject is repeatedly exposed to the CS before conditioning starts, then conditioning takes longer. The R–W model cannot explain this because preexposure leaves the strength of the CS unchanged at zero.
* Recovery of responding after extinction: It appears that something remains after extinction has reduced associative strength to zero because several procedures cause responding to reappear without further conditioning.<ref name="Bouton_2016"/>
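
The latent inhibition point can be made concrete with the R–W equation. On a preexposure trial the CS has V = 0 and the US is absent, so λ = 0. Assuming that no other stimulus present carries associative strength, the predicted change is
<math display="block">\Delta V=\alpha\beta(\lambda-\Sigma V)=\alpha\beta(0-0)=0</math>
Under these assumptions any number of preexposure trials leaves the strength of the CS at zero, so the model predicts that later conditioning proceeds exactly as it would without preexposure.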

====Role of attention in learning====
Latent inhibition might happen because a subject stops focusing on a CS that is seen frequently before it is paired with a US. In fact, changes in attention to the CS are at the heart of two prominent theories that try to cope with experimental results that give the R–W model difficulty. In one of these, proposed by [[Nicholas Mackintosh]],<ref>{{cite journal |vauthors=Mackintosh NJ |date=1975 |title=A theory of attention: Variations in the associability of stimuli with reinforcement |journal=Psychological Review |volume=82 |issue=4 |pages=276–298 |doi=10.1037/h0076778 |citeseerx=10.1.1.556.1688}}</ref> the speed of conditioning depends on the amount of attention devoted to the CS, and this amount of attention depends in turn on how well the CS predicts the US. Pearce and Hall proposed a related model based on a different attentional principle.<ref>{{cite journal |vauthors=Pearce JM, Hall G |title=A model for Pavlovian learning: variations in the effectiveness of conditioned but not of unconditioned stimuli |journal=Psychological Review |volume=87 |issue=6 |pages=532–52 |date=November 1980 |pmid=7443916 |doi=10.1037/0033-295X.87.6.532}}</ref> Both models have been extensively tested, and neither explains all the experimental results. Consequently, various authors have attempted hybrid models that combine the two attentional processes. Pearce and Hall in 2010 integrated their attentional ideas and even suggested the possibility of incorporating the Rescorla–Wagner equation into an integrated model.<ref name="Bouton_2016"/>

====Context====
As stated earlier, a key idea in conditioning is that the CS signals or predicts the US (see "zero contingency procedure" above). However, other stimuli do so as well; for example, the room in which conditioning takes place also "predicts" that the US may occur. Still, the room predicts with much less certainty than does the experimental CS itself, because the room is also there between experimental trials, when the US is absent. The role of such context is illustrated by the fact that the dogs in Pavlov's experiment would sometimes start salivating as they approached the experimental apparatus, before they saw or heard any CS.<ref name="Schacter_2009"/> Such so-called "context" stimuli are always present, and their influence helps to account for some otherwise puzzling experimental findings. The associative strength of context stimuli can be entered into the Rescorla–Wagner equation, and they play an important role in the ''comparator'' and ''computational'' theories outlined below.<ref name="Bouton_2016"/>
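
One way to enter the context into the Rescorla–Wagner equation is to treat it as an extra stimulus that is present on every trial. The sketch below makes several simplifying assumptions: CS trials and context-alone periods strictly alternate, the context has the same α as the CS, and the US either always or never occurs during the context-alone periods.

```python
def rw_trial(v, present, lam, alpha=0.2, beta=1.0):
    """Apply one R-W update to every stimulus present on the trial."""
    error = lam - sum(v[s] for s in present)
    for s in present:
        v[s] += alpha * beta * error


def train(us_during_context_alone, cycles=200):
    v = {"CS": 0.0, "context": 0.0}
    for _ in range(cycles):
        # CS presented in the context, always followed by the US.
        rw_trial(v, ["CS", "context"], lam=1.0)
        # Context alone; the US occurs here only in the zero-contingency case.
        rw_trial(v, ["context"], lam=1.0 if us_during_context_alone else 0.0)
    return v


zero_contingency = train(us_during_context_alone=True)
positive_contingency = train(us_during_context_alone=False)
print(zero_contingency, positive_contingency)
```

When the US is equally likely with and without the CS, the context ends up with the associative strength and the CS with almost none; when the US occurs only with the CS, the pattern reverses.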

====Comparator theory====
To find out what has been learned, we must somehow measure behavior ("performance") in a test situation. However, as students know all too well, performance in a test situation is not always a good measure of what has been learned. As for conditioning, there is evidence that subjects in a blocking experiment do learn something about the "blocked" CS, but fail to show this learning because of the way that they are usually tested.

"Comparator" theories of conditioning are "performance based", that is, they stress what is going on at the time of the test. In particular, they look at all the stimuli that are present during testing and at how the associations acquired by these stimuli may interact.<ref>{{cite book |vauthors=Gibbon J, Balsam P |date=1981 |chapter=Spreading association in time. |veditors=Locurto CM, Terrace HS, Gibbon J |title=Autoshaping and conditioning theory |pages=219–235 |location=New York |publisher=Academic Press}}</ref><ref>{{cite journal |vauthors=Miller RR, Escobar M |title=Contrasting acquisition-focused and performance-focused models of acquired behavior. |journal=Current Directions in Psychological Science |date=August 2001 |volume=10 |issue=4 |pages=141–5 |doi=10.1111/1467-8721.00135 |s2cid=7159340}}</ref> To oversimplify somewhat, comparator theories assume that during conditioning the subject acquires both CS-US and context-US associations. At the time of the test, these associations are compared, and a response to the CS occurs only if the CS-US association is stronger than the context-US association. After a CS and US are repeatedly paired in simple acquisition, the CS-US association is strong and the context-US association is relatively weak. This means that the CS elicits a strong CR. In "zero contingency" (see above), the conditioned response is weak or absent because the context-US association is about as strong as the CS-US association. Blocking and other more subtle phenomena can also be explained by comparator theories, though, again, they cannot explain everything.<ref name="Bouton_2016"/><ref name="M&E"/>

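The comparison at the time of the test can be illustrated with a deliberately simplified sketch. The subtraction rule and the numerical strengths below are assumptions made for the example; actual comparator models are considerably more elaborate.

```python
def comparator_response(cs_us, context_us):
    """Hypothetical response rule: respond only to the extent that the
    CS-US association exceeds the context-US association."""
    return max(0.0, cs_us - context_us)


# Simple acquisition: strong CS-US, weak context-US association.
acquisition = comparator_response(cs_us=0.9, context_us=0.2)

# Zero contingency: the two associations are about equally strong.
zero_contingency = comparator_response(cs_us=0.5, context_us=0.5)

print(acquisition, zero_contingency)
```

In this sketch the same CS–US association can produce a strong response or none, depending on what it is compared with at test.
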
====Computational theory====
An organism's need to predict future events is central to modern theories of conditioning. Most theories use associations between stimuli to take care of these predictions. For example, in the R–W model, the associative strength of a CS tells us how strongly that CS predicts a US. A different approach to prediction is suggested by models such as that proposed by Gallistel & Gibbon (2000, 2002).<ref>{{cite journal |vauthors=Gallistel CR, Gibbon J |title=Time, rate, and conditioning |journal=Psychological Review |volume=107 |issue=2 |pages=289–344 |date=April 2000 |pmid=10789198 |doi=10.1037/0033-295X.107.2.289 |url=http://ruccs.rutgers.edu/faculty/GnG/Gal&Gib_Preprint.pdf |citeseerx=10.1.1.407.1802 |access-date=2021-08-30 |archive-date=2015-05-05 |archive-url=https://web.archive.org/web/20150505162755/http://ruccs.rutgers.edu/faculty/GnG/Gal%26Gib_Preprint.pdf |url-status=live }}</ref><ref>{{cite book |vauthors=Gallistel R, Gibbon J |date=2002 |title=The Symbolic Foundations of Conditioned Behavior |location=Mahwah, NJ |publisher=Erlbaum}}</ref> Here the response is not determined by associative strengths. Instead, the organism records the times of onset and offset of CSs and USs and uses these to calculate the probability that the US will follow the CS. A number of experiments have shown that humans and animals can learn to time events (see [[Animal cognition]]), and the Gallistel & Gibbon model yields very good quantitative fits to a variety of experimental data.<ref name="Shettleworth_2010"/><ref name="M&E"/> However, recent studies have suggested that duration-based models cannot account for some empirical findings as well as associative models.<ref>{{cite journal |vauthors=Golkar A, Bellander M, Öhman A |title=Temporal properties of fear extinction--does time matter? |journal=Behavioral Neuroscience |volume=127 |issue=1 |pages=59–69 |date=February 2013 |pmid=23231494 |doi=10.1037/a0030892}}</ref>
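
The general idea of computing a prediction from recorded event times, rather than from associative strengths, can be sketched as follows. This is not the Gallistel & Gibbon model itself; the function, the counting rule, and the timestamps are hypothetical and chosen only to illustrate the approach.

```python
def p_us_given_cs(cs_intervals, us_times):
    """Fraction of CS presentations during which at least one US occurred.

    cs_intervals: list of (onset, offset) times for each CS presentation.
    us_times: list of times at which a US occurred.
    """
    if not cs_intervals:
        return 0.0
    hits = sum(
        any(onset <= t <= offset for t in us_times)
        for onset, offset in cs_intervals
    )
    return hits / len(cs_intervals)


# Hypothetical session record, in seconds.
cs_intervals = [(0, 10), (60, 70), (120, 130), (180, 190)]
us_times = [9, 69, 100, 189]   # the US at t=100 falls outside every CS

estimate = p_us_given_cs(cs_intervals, us_times)
print(estimate)
```

Here three of the four CS presentations contain a US, so the estimate is 0.75; no associative strength is stored or updated at any point.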

====Element-based models====
The Rescorla–Wagner model treats a stimulus as a single entity, and it represents the associative strength of a stimulus with one number, with no record of how that number was reached. As noted above, this makes it hard for the model to account for a number of experimental results. More flexibility is provided by assuming that a stimulus is internally represented by a collection of elements, each of which may change from one associative state to another. For example, the similarity of one stimulus to another may be represented by saying that the two stimuli share elements. These shared elements help to account for stimulus generalization and other phenomena that may depend upon generalization. Also, different elements within the same set may have different associations, and their activations and associations may change at different times and at different rates. This allows element-based models to handle some otherwise inexplicable results.
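
Generalization through shared elements can be illustrated with a small sketch. It assumes, purely for illustration, that each stimulus consists of four elements, that elements are trained with an R–W style error-correction rule, and that the response to a stimulus is the sum of the strengths of its elements; the element names are hypothetical.

```python
def train_elements(elements, trials=50, rate=0.1, lam=1.0):
    """Condition every element of a stimulus with a shared error term."""
    v = {e: 0.0 for e in elements}
    for _ in range(trials):
        error = lam - sum(v.values())
        for e in v:
            v[e] += rate * error
    return v


def response(v, stimulus):
    """Summed strength of the elements of `stimulus`; untrained elements count as 0."""
    return sum(v.get(e, 0.0) for e in stimulus)


trained = {"t1", "t2", "s1", "s2"}
similar = {"n1", "n2", "s1", "s2"}     # shares two elements with `trained`
unrelated = {"u1", "u2", "u3", "u4"}   # shares none

v = train_elements(trained)
print(response(v, trained), response(v, similar), response(v, unrelated))
```

The similar stimulus evokes about half the response of the trained one because it shares half of its elements, while the unrelated stimulus evokes none.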

=====The SOP model=====
A prominent example of the element approach is the "SOP" model of Wagner.<ref>{{cite book |vauthors=Wagner AR |date=1981 |chapter=SOP: A model of automatic memory processing in animal behavior. |veditors=Spear NE, Miller RR |title=Information processing in animals: Memory mechanisms |pages=5–47 |location=Hillsdale, NJ |publisher=Erlbaum |isbn=978-1-317-75770-2}}</ref> The model has been elaborated in various ways since its introduction, and it can now account in principle for a very wide variety of experimental findings.<ref name="Bouton_2016"/> The model represents any given stimulus with a large collection of elements. The time of presentation of various stimuli, the state of their elements, and the interactions between the elements, all determine the course of associative processes and the behaviors observed during conditioning experiments.

The SOP account of simple conditioning exemplifies some essentials of the SOP model. To begin with, the model assumes that the CS and US are each represented by a large group of elements. Each of these stimulus elements can be in one of three states: 
* primary activity (A1) – Roughly speaking, the stimulus is "attended to." (References to "attention" are intended only to aid understanding and are not part of the model.)
* secondary activity (A2) – The stimulus is "peripherally attended to."
* inactive (I) – The stimulus is "not attended to."
Of the elements that represent a single stimulus at a given moment, some may be in state A1, some in state A2, and some in state I.

When a stimulus first appears, some of its elements jump from inactivity I to primary activity A1. From the A1 state they gradually decay to A2, and finally back to I. Element activity can only change in this way; in particular, elements in A2 cannot go directly back to A1. If the elements of both the CS and the US are in the A1 state at the same time, an association is learned between the two stimuli. This means that if, at a later time, the CS is presented ahead of the US, and some CS elements enter A1, these elements will activate some US elements. However, US elements activated indirectly in this way only get boosted to the A2 state. (This can be thought of as the CS arousing a memory of the US, which will not be as strong as the real thing.) With repeated CS-US trials, more and more elements are associated, and more and more US elements go to A2 when the CS comes on. This gradually leaves fewer and fewer US elements that can enter A1 when the US itself appears. In consequence, learning slows down and approaches a limit. One might say that the US is "fully predicted" or "not surprising" because almost all of its elements can only enter A2 when the CS comes on, leaving few to form new associations.
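
The cycle of element states (I → A1 → A2 → I) for a single stimulus can be sketched by tracking the proportion of elements in each state. The transition probabilities and the five-step stimulus duration below are arbitrary assumptions, and the sketch deliberately omits the associative activation of US elements by the CS.

```python
def sop_step(state, stimulus_on, p1=0.6, pd1=0.2, pd2=0.05):
    """Advance the proportions (I, A1, A2) of one stimulus by one time step.

    p1:  share of inactive elements entering A1 while the stimulus is on.
    pd1: share of A1 elements decaying to A2 per step.
    pd2: share of A2 elements decaying back to I per step.
    """
    i, a1, a2 = state
    to_a1 = p1 * i if stimulus_on else 0.0   # I -> A1 only while stimulus is on
    to_a2 = pd1 * a1                         # A1 -> A2
    to_i = pd2 * a2                          # A2 -> I (never A2 -> A1)
    return (i - to_a1 + to_i, a1 + to_a1 - to_a2, a2 + to_a2 - to_i)


state = (1.0, 0.0, 0.0)        # all elements start inactive
trace = []
for t in range(200):
    state = sop_step(state, stimulus_on=(t < 5))   # stimulus lasts 5 steps
    trace.append(state)

for t in (0, 4, 20, 199):
    print(t, tuple(round(x, 3) for x in trace[t]))
```

In this sketch A1 activity peaks while the stimulus is present, A2 activity lingers after it ends, and eventually almost all elements return to the inactive state.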

The model can explain the findings that are accounted for by the Rescorla-Wagner model and a number of additional findings as well. For example, unlike most other models, SOP takes time into account. The rise and decay of element activation enables the model to explain time-dependent effects such as the fact that conditioning is strongest when the CS comes just before the US, and that when the CS comes after the US ("backward conditioning") the result is often an inhibitory CS. Many other more subtle phenomena are explained as well.<ref name="Bouton_2016"/>

A number of other powerful models have appeared in recent years which incorporate element representations. These often include the assumption that associations involve a network of connections between "nodes" that represent stimuli, responses, and perhaps one or more "hidden" layers of intermediate interconnections. Such models make contact with a current explosion of research on [[Artificial neural network|neural networks]], [[artificial intelligence]] and [[machine learning]].{{Citation needed|date=July 2021}}