== Intermittent reinforcement schedules ==
Behavior is not always reinforced every time it is emitted, and the pattern of reinforcement strongly affects how fast an operant response is learned, what its rate is at any given time, and how long it continues when reinforcement ceases. The simplest rules controlling reinforcement are continuous reinforcement, where every response is reinforced, and extinction, where no response is reinforced. Between these extremes, more complex ''schedules of reinforcement'' specify the rules that determine how and when a response will be followed by a reinforcer. Specific schedules of reinforcement reliably induce specific patterns of response, and these rules apply across many different species. The varying consistency and predictability of reinforcement is an important influence on how the different schedules operate. Many simple and complex schedules were investigated at great length by [[B.F. Skinner]] using [[Columbidae|pigeons]].

===Simple schedules===
[[File:Schedule of reinforcement.png|thumb|right|A chart demonstrating the different response rates of the four simple schedules of reinforcement; each hatch mark designates a reinforcer being given]]
Simple schedules have a single rule to determine when a single type of reinforcer is delivered for a specific response; a short sketch after this list illustrates the four basic rules in code.
* '''Ratio schedule''' – the reinforcement depends only on the number of responses the organism has performed.
* '''Continuous reinforcement (CRF)''' – a schedule of reinforcement in which every occurrence of the instrumental response (desired response) is followed by the reinforcer.<ref name=Miltenberger/>{{rp|86}}
* ''Fixed ratio'' (FR) – reinforcement is delivered after every ''n''th response.<ref name=Miltenberger/>{{rp|88}} An FR 1 schedule is synonymous with a CRF schedule. (Example: every three times a rat presses a button, it receives a slice of cheese.)
* ''Variable ratio'' (VR) – reinforcement is delivered on average every ''n''th response, but not always exactly on the ''n''th response.<ref name=Miltenberger/>{{rp|88}} (Example: gamblers win on average 1 out of every 10 turns on a slot machine, but because this is only an average, they could win on any given turn.)
* ''Fixed interval'' (FI) – the first response after ''n'' amount of time has elapsed is reinforced. (Example: a rat receives a slice of cheese for pressing a button, but only once every 10 minutes; eventually, the rat learns to ignore the button until each 10-minute interval has elapsed.)
* ''Variable interval'' (VI) – reinforced after an average of ''n'' amount of time, but not always exactly ''n'' amount of time.<ref name=Miltenberger/>{{rp|89}} (Example: a radio host gives away concert tickets approximately every hour, but the exact minute varies.)
* ''Fixed time'' (FT) – provides a reinforcing stimulus at a fixed time since the last reinforcement delivery, regardless of whether the subject has responded. In other words, it is a non-contingent schedule.
* ''Variable time'' (VT) – provides reinforcement at an average variable time since the last reinforcement, regardless of whether the subject has responded.
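The four basic contingent rules are simple enough to state as code. The following Python sketch is purely illustrative (the class and method names are assumptions, not drawn from the cited sources); it approximates VR as a random-ratio rule in which each response is reinforced with probability 1/''n'', and VI with exponentially distributed intervals averaging ''n'' seconds.

<syntaxhighlight lang="python">
import random

class FixedRatio:
    """FR n: reinforce every nth response; FR 1 is continuous reinforcement (CRF)."""
    def __init__(self, n):
        self.n, self.count = n, 0

    def respond(self):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True   # deliver the reinforcer
        return False

class VariableRatio:
    """VR n (random-ratio approximation): each response is reinforced with
    probability 1/n, so reinforcement occurs on average every nth response."""
    def __init__(self, n):
        self.p = 1.0 / n

    def respond(self):
        return random.random() < self.p

class FixedInterval:
    """FI n: reinforce the first response made after n seconds have elapsed
    since the last reinforcer."""
    def __init__(self, n):
        self.n, self.last = n, 0.0

    def respond(self, now):
        if now - self.last >= self.n:
            self.last = now
            return True
        return False

class VariableInterval:
    """VI n: like FI, but the required interval varies around a mean of n
    seconds (drawn here from an exponential distribution)."""
    def __init__(self, n):
        self.n, self.last = n, 0.0
        self.wait = random.expovariate(1.0 / n)

    def respond(self, now):
        if now - self.last >= self.wait:
            self.last = now
            self.wait = random.expovariate(1.0 / self.n)
            return True
        return False
</syntaxhighlight>

Under these definitions, <code>FixedRatio(1)</code> reinforces every response, matching the CRF schedule described above.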
Simple schedules are utilized in many differential reinforcement<ref>{{cite journal | vauthors = Vollmer TR, Iwata BA | title = Differential reinforcement as treatment for behavior disorders: procedural and functional variations | journal = Research in Developmental Disabilities | volume = 13 | issue = 4 | pages = 393–417 | date = 1992 | pmid = 1509180 | doi = 10.1016/0891-4222(92)90013-v }}</ref> procedures:
* ''Differential reinforcement of alternative behavior'' (DRA) – a conditioning procedure in which an undesired response is decreased by placing it on [[Extinction (psychology)|extinction]] or, less commonly, providing contingent punishment, while simultaneously providing reinforcement contingent on a desirable response. An example would be a teacher attending to a student only when they raise their hand, while ignoring the student when he or she calls out.
* ''Differential reinforcement of other behavior'' (DRO) – also known as an omission training procedure, an instrumental conditioning procedure in which a positive reinforcer is periodically delivered only if the participant does something other than the target response. An example would be reinforcing any hand action other than nose picking.<ref name="Miltenberger" />{{rp|338}}
* ''Differential reinforcement of incompatible behavior'' (DRI) – used to reduce a frequent behavior without [[punishment (psychology)|punishing]] it by reinforcing an incompatible response. An example would be reinforcing clapping to reduce nose picking.
* ''Differential reinforcement of low response rate'' (DRL) – used to encourage low rates of responding. It is like an interval schedule, except that premature responses reset the time required between responses (see the sketch after this list).
* ''Differential reinforcement of high rate'' (DRH) – used to increase high rates of responding. It is like an interval schedule, except that a minimum number of responses are required within the interval in order to receive reinforcement.
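Of these procedures, DRL has the most explicitly timed rule: a response is reinforced only if enough time has passed since the previous response, and an early response restarts the clock. A minimal sketch in the same illustrative style as above (the names are assumptions, not taken from the cited sources):

<syntaxhighlight lang="python">
class DRL:
    """Differential reinforcement of low response rate: reinforce a response
    only if at least `interval` seconds have passed since the previous
    response; a premature response resets the timer."""
    def __init__(self, interval):
        self.interval = interval
        self.last_response = None   # time of the most recent response, if any

    def respond(self, now):
        reinforced = (self.last_response is not None and
                      now - self.last_response >= self.interval)
        self.last_response = now    # every response, early or not, resets the clock
        return reinforced

drl = DRL(interval=10.0)
drl.respond(now=0.0)    # False - starts the clock
drl.respond(now=4.0)    # False - premature, so the clock resets
drl.respond(now=15.0)   # True - 11 seconds since the previous response
</syntaxhighlight>

A DRH rule would invert the logic of the comparison, requiring a minimum number of responses within the interval rather than a minimum spacing between them.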
====Effects of different types of simple schedules====
* Fixed ratio: activity slows after the reinforcer is delivered, then response rates increase until the next reinforcer delivery (post-reinforcement pause).
* Variable ratio: rapid, steady rate of responding; most resistant to [[Extinction (psychology)|extinction]].
* Fixed interval: responding increases towards the end of the interval; poor resistance to extinction.
* Variable interval: steady activity results; good resistance to extinction.
* Ratio schedules produce higher rates of responding than interval schedules, when the rates of reinforcement are otherwise similar.
* Variable schedules produce higher rates and greater resistance to [[extinction (psychology)|extinction]] than most fixed schedules. This is also known as the partial reinforcement extinction effect (PREE).
* The variable ratio schedule produces both the highest rate of responding and the greatest resistance to extinction (for example, the behavior of [[gambler]]s at [[slot machine]]s).
* Fixed schedules produce "post-reinforcement pauses" (PRP), in which responses briefly cease immediately following reinforcement, though the pause is a function of the upcoming response requirement rather than the prior reinforcement.<ref>{{cite journal | vauthors = Derenne A, Flannery KA | date = 2007 | title = Within Session FR Pausing | journal = The Behavior Analyst Today | volume = 8 | issue = 2 | pages = 175–86 | doi = 10.1037/h0100611 }}</ref>
** The PRP of a fixed interval schedule is frequently followed by a "scallop-shaped" accelerating rate of response, while fixed ratio schedules produce a more "angular" response.
*** Fixed interval scallop: the pattern of responding that develops with a fixed interval reinforcement schedule; performance on a fixed interval reflects the subject's accuracy in telling time.
* Organisms whose schedules of reinforcement are "thinned" (that is, requiring more responses or a greater wait before reinforcement) may experience "ratio strain" if thinned too quickly. This produces behavior similar to that seen during extinction.
** Ratio strain: the disruption of responding that occurs when a fixed ratio response requirement is increased too rapidly.
** Ratio run: the high and steady rate of responding that completes each ratio requirement. Higher ratio requirements usually cause longer post-reinforcement pauses.
* Partial reinforcement schedules are more resistant to extinction than continuous reinforcement schedules.
** Ratio schedules are more resistant than interval schedules, and variable schedules are more resistant than fixed ones.
** Momentary changes in reinforcement value lead to dynamic changes in behavior.<ref>{{cite journal | last1 = McSweeney | first1 = Frances K. | last2 = Murphy | first2 = Eric S. | last3 = Kowal | first3 = Benjamin P. | name-list-style = vanc | title = Dynamic changes in reinforcer value: Some misconceptions and why you should care | journal = The Behavior Analyst Today | date = 2001 | volume = 2 | issue = 4 | pages = 341–349 | doi = 10.1037/h0099952 }}</ref>
===Compound schedules===
Compound schedules combine two or more different simple schedules in some way, using the same reinforcer for the same behavior. There are many possibilities; among those most often used are:
* ''Alternative schedules'' – a type of compound schedule where two or more simple schedules are in effect and whichever schedule is completed first results in reinforcement.<ref>{{cite book | vauthors = Iversen IH, Lattal KA | url = https://books.google.com/books?id=uVYJAwAAQBAJ | title = Experimental Analysis of Behavior | date = 1991 | publisher = Elsevier | location = Amsterdam | isbn = 9781483291260 }}</ref>
* ''Conjunctive schedules'' – a complex schedule of reinforcement where two or more simple schedules are in effect independently of each other, and the requirements of all of the simple schedules must be met for reinforcement (see the sketch after this list).
* ''Multiple schedules'' – two or more schedules alternate over time, with a stimulus indicating which is in force. Reinforcement is delivered if the response requirement is met while a schedule is in effect.
* ''Mixed schedules'' – either of two, or more, schedules may occur with no stimulus indicating which is in force. Reinforcement is delivered if the response requirement is met while a schedule is in effect.
* ''Concurrent schedules'' – a complex reinforcement procedure in which the participant can choose any one of two or more simple reinforcement schedules that are available simultaneously. Organisms are free to change back and forth between the response alternatives at any time. [[File:Operant_Conditioning_Involves_Choice.png|thumb|Administering two reinforcement schedules at the same time]]
* ''Concurrent-chain schedule of reinforcement'' – a complex reinforcement procedure in which the participant is permitted to choose during the first link which of several simple reinforcement schedules will be in effect in the second link. Once a choice has been made, the rejected alternatives become unavailable until the start of the next trial.
* ''Interlocking schedules'' – a single schedule with two components where progress in one component affects progress in the other component. In an interlocking FR 60 FI 120-s schedule, for example, each response subtracts time from the interval component such that each response is "equal" to removing two seconds from the FI schedule.
* ''Chained schedules'' – reinforcement occurs after two or more successive schedules have been completed, with a stimulus indicating when one schedule has been completed and the next has started.
* ''Tandem schedules'' – reinforcement occurs when two or more successive schedule requirements have been completed, with no stimulus indicating when a schedule has been completed and the next has started.
* ''Higher-order schedules'' – completion of one schedule is reinforced according to a second schedule; e.g., in FR2 (FI10 s), two successive fixed interval schedules must be completed before a response is reinforced.
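To illustrate how a compound schedule combines simple ones, the sketch below implements a conjunctive FR/FI schedule: a response is reinforced only when both the ratio requirement and the interval requirement have been satisfied. As before, this is an illustrative sketch, not a standard implementation from the literature.

<syntaxhighlight lang="python">
class ConjunctiveFRFI:
    """Conjunctive FR n / FI t: a response produces reinforcement only when
    at least n responses have been made AND at least t seconds have elapsed
    since the last reinforcer; both simple requirements must be met."""
    def __init__(self, n, t):
        self.n, self.t = n, t
        self.count = 0
        self.last = 0.0

    def respond(self, now):
        self.count += 1
        if self.count >= self.n and now - self.last >= self.t:
            self.count = 0
            self.last = now
            return True
        return False
</syntaxhighlight>

An alternative schedule would replace the ''and'' in the condition with ''or'', delivering reinforcement as soon as either simple requirement is completed.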
===Superimposed schedules===
{{cleanup section|reason=convert Author (Year) citations to wiki style|date=January 2024}}
The [[psychology]] term ''superimposed schedules of reinforcement'' refers to a structure of rewards where two or more simple schedules of reinforcement operate simultaneously. Reinforcers can be positive, negative, or both. An example is a person who comes home after a long day at work. The behavior of opening the front door is rewarded by a big kiss on the lips by the person's spouse and a rip in the pants from the family dog jumping enthusiastically. Another example of superimposed schedules of reinforcement is a pigeon in an experimental cage pecking at a button. The pecks deliver a hopper of grain every 20th peck, and access to water after every 200 pecks.

Superimposed schedules of reinforcement are a type of compound schedule that evolved from the initial work on simple [[schedules of reinforcement]] by [[B.F. Skinner]] and his colleagues (Ferster and Skinner, 1957). They demonstrated that reinforcers could be delivered on schedules, and further that organisms behaved differently under different schedules. Rather than a reinforcer, such as food or water, being delivered every time as a consequence of some behavior, a reinforcer could be delivered after more than one instance of the behavior. For example, a [[pigeon]] may be required to peck a button switch ten times before food appears. This is a "ratio schedule". Also, a reinforcer could be delivered after an interval of time passed following a target behavior. An example is a [[rat]] that is given a food pellet immediately following the first response that occurs after two minutes have elapsed since the last lever press. This is called an "interval schedule". In addition, ratio schedules can deliver reinforcement following a fixed or variable number of behaviors by the individual organism. Likewise, interval schedules can deliver reinforcement following fixed or variable intervals of time following a single response by the organism. Individual behaviors tend to generate response rates that differ based upon how the reinforcement schedule is created.

Much subsequent research in many labs examined the effects on behaviors of scheduling reinforcers. If an organism is offered the opportunity to choose between or among two or more simple schedules of reinforcement at the same time, the reinforcement structure is called a "concurrent schedule of reinforcement". Brechner (1974, 1977) introduced the concept of superimposed [[schedules of reinforcement]] in an attempt to create a laboratory analogy of [[social trap]]s, such as when humans [[overharvest]] their fisheries or tear down their rainforests. Brechner created a situation where simple reinforcement schedules were superimposed upon each other. In other words, a single response or group of responses by an organism led to multiple consequences. Concurrent schedules of reinforcement can be thought of as "or" schedules, and superimposed schedules of reinforcement can be thought of as "and" schedules; a sketch of this distinction appears at the end of this section. Brechner and Linder (1981) and Brechner (1987) expanded the concept to describe how superimposed schedules and the [[social trap]] analogy could be used to analyze the way [[energy]] flows through [[system]]s.

Superimposed schedules of reinforcement have many real-world applications in addition to generating [[social trap]]s. Many different human individual and social situations can be created by superimposing simple reinforcement schedules. For example, a human being could have simultaneous tobacco and alcohol addictions. Even more complex situations can be created or simulated by superimposing two or more concurrent schedules. For example, a high school senior could have a choice between going to Stanford University or UCLA, and at the same time have the choice of going into the Army or the Air Force, and simultaneously the choice of taking a job with an internet company or a job with a software company. That is a reinforcement structure of three superimposed concurrent schedules of reinforcement.

Superimposed schedules of reinforcement can create the three classic conflict situations (approach–approach conflict, [[approach–avoidance conflict]], and avoidance–avoidance conflict) described by [[Kurt Lewin]] (1935) and can operationalize other Lewinian situations analyzed by his [[force field analysis]]. Other examples of the use of superimposed schedules of reinforcement as an analytical tool are its application to the contingencies of rent control (Brechner, 2003) and the problem of toxic waste dumping in the Los Angeles County storm drain system (Brechner, 2010).
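Returning to the pigeon example at the start of this section, a superimposed ("and") schedule can be sketched as a single response stream evaluated against several simple schedules at once, each delivering its own reinforcer. This reuses the illustrative <code>FixedRatio</code> class from the simple-schedules sketch above:

<syntaxhighlight lang="python">
# Superimposed ("and") schedules: every peck is checked against all
# component schedules, and each component can deliver its own reinforcer.
schedules = {"grain": FixedRatio(20), "water": FixedRatio(200)}

for peck in range(1, 201):
    delivered = [name for name, s in schedules.items() if s.respond()]
    if delivered:
        print(f"peck {peck}: {', '.join(delivered)}")
# Pecks 20, 40, ..., 180 deliver grain; peck 200 delivers both grain and water.
</syntaxhighlight>

A concurrent ("or") arrangement would instead present the component schedules on separate response alternatives, with the organism choosing which one to respond on, as described next.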
In such a "Findley concurrent" procedure, a stimulus (e.g., the color of the main key) signals which schedule is in effect. Concurrent schedules often induce rapid alternation between the keys. To prevent this, a "changeover delay" is commonly introduced: each schedule is inactivated for a brief period after the subject switches to it. When both the concurrent schedules are [[variable interval schedule of reinforcement|variable intervals]], a quantitative relationship known as the [[matching law]] is found between relative response rates in the two schedules and the relative reinforcement rates they deliver; this was first observed by [[R.J. Herrnstein]] in 1961. Matching law is a rule for instrumental behavior which states that the relative rate of responding on a particular response alternative equals the relative rate of reinforcement for that response (rate of behavior = rate of reinforcement). Animals and humans have a tendency to prefer choice in schedules.<ref>{{cite journal | vauthors = Martin TL, Yu CT, Martin GL, Fazzio D | year = 2006 | title = On Choice, Preference, and Preference For Choice. | journal = The Behavior Analyst Today | volume = 7 | issue = 2 | pages = 234–48 | doi=10.1037/h0100083| pmid = 23372459 | pmc = 3558524 }}</ref>