===Modifying operant behavior: reinforcement and punishment===
{{Main|Reinforcement|Punishment (psychology)}}
Reinforcement and punishment are the core tools through which operant behavior is modified. Both terms are defined by their effect on the future frequency of a behavior: reinforcement is a consequence that makes a behavior occur more often, whereas punishment is a consequence that makes a behavior occur less often. "Positive" and "negative" refer to whether a stimulus is added or removed, respectively.<ref>{{cite book |last1=Cooper |first1=JO |last2=Heron |first2=TE |last3=Heward |first3=WL |title=Applied Behavior Analysis |date=2019 |publisher=Pearson Education (US) |isbn=978-0134752556 |pages=33 |edition=3rd}}</ref>

Combining these two distinctions gives four possible consequences:

# '''[[Positive reinforcement]]''' occurs when a behavior (response) results in a desired stimulus being added and increases the frequency of that behavior in the future.<ref name="Schultz">{{cite journal|year=2015|title=Neuronal reward and decision signals: from theories to data|journal=Physiological Reviews|volume=95|issue=3|pages=853–951|doi=10.1152/physrev.00023.2014|pmc=4491543|pmid=26109341|quote=Rewards in operant conditioning are positive reinforcers.&nbsp;... Operant behavior gives a good definition for rewards. Anything that makes an individual come back for more is a positive reinforcer and therefore a reward. Although it provides a good definition, positive reinforcement is only one of several reward functions.&nbsp;... Rewards are attractive. They are motivating and make us exert an effort.&nbsp;... Rewards induce approach behavior, also called appetitive or preparatory behavior, and consummatory behavior.&nbsp;... Thus any stimulus, object, event, activity, or situation that has the potential to make us approach and consume it is by definition a reward.|vauthors=Schultz W}}</ref> '''Example''': if a rat in a [[Skinner box]] gets food when it presses a lever, its rate of pressing will go up. Pressing the lever was positively reinforced.
# '''[[Negative reinforcement]]''' (a.k.a. escape) occurs when a behavior (response) is followed by the removal of an [[aversive]] stimulus, thereby increasing the original behavior's frequency. '''Example''': A child is afraid of loud noises at a fireworks display. They put on a pair of headphones, and they can no longer hear the fireworks. The next time the child sees fireworks, they put on a pair of headphones. Putting on headphones was negatively reinforced.
# '''[[Positive punishment]]''' (also referred to as "punishment by contingent stimulation") occurs when a behavior (response) is followed by an aversive stimulus, which makes the behavior less likely to occur in the future. '''Example''': A child touches a hot stove and burns their hand. The next time the child sees a stove, they do not touch it. Touching the stove was positively punished.
# '''[[Negative punishment]]''' (also called "penalty" or "punishment by contingent withdrawal") occurs when a behavior (response) is followed by the removal of a stimulus, and the behavior is less likely to occur in the future. '''Example''': When an employee puts their lunch in a communal refrigerator, it gets stolen before break time. The next time the employee brings a lunch to work, they do not put it in the refrigerator. Putting the lunch in the refrigerator was negatively punished.
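
The two distinctions in the list above (stimulus added or removed, behavior more or less frequent) can be written as a small lookup. The following Python sketch is only an illustration; the function name <code>classify_consequence</code> and its string arguments are hypothetical and do not come from the cited sources.

```python
def classify_consequence(stimulus_change: str, behavior_change: str) -> str:
    """Name the operant consequence for one (stimulus, behavior) combination.

    stimulus_change: "added" or "removed" (what follows the response)
    behavior_change: "increases" or "decreases" (future frequency of the response)
    """
    sign = {"added": "positive", "removed": "negative"}
    effect = {"increases": "reinforcement", "decreases": "punishment"}
    if stimulus_change not in sign or behavior_change not in effect:
        raise ValueError("expected 'added'/'removed' and 'increases'/'decreases'")
    return f"{sign[stimulus_change]} {effect[behavior_change]}"


# The four examples from the list above
print(classify_consequence("added", "increases"))    # food follows lever pressing
print(classify_consequence("removed", "increases"))  # headphones remove the noise
print(classify_consequence("added", "decreases"))    # burn follows touching the stove
print(classify_consequence("removed", "decreases"))  # lunch disappears from the fridge
```

The classification depends only on what happened to the stimulus and to the behavior, not on whether the consequence seems pleasant or was intended by anyone.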

* '''Extinction''' occurs when a previously reinforced behavior is no longer followed by reinforcement, whether positive or negative. During extinction the behavior becomes less probable. Behavior that was reinforced only occasionally typically takes longer to extinguish than behavior that was reinforced at every opportunity, because the organism has learned that repeated responses may be needed before reinforcement is delivered.<ref>{{cite book |last1=Skinner |first1=B.F. |title=Science and Human Behavior |date=2014 |publisher=The B.F. Skinner Foundation |location=Cambridge, MA |page=70 |url=http://www.bfskinner.org/newtestsite/wp-content/uploads/2014/02/ScienceHumanBehavior.pdf |access-date=13 March 2019}}</ref>

A 2025 study suggests that tactile feedback, such as haptic vibrations from mobile devices, can function as a secondary reinforcer (i.e., a learned reward that acquires reinforcing value through association), strengthening consumer behaviors such as online purchasing.<ref>Hampton, W., & Morrin, M. (2025). "When Touch Drives Purchase: Haptic Rewards as Reinforcers of Online Buying." Journal of Consumer Research. https://doi.org/10.1093/jcr/ucaf025</ref>

====Schedules of reinforcement====
Schedules of reinforcement are rules that control the delivery of reinforcement. The rules specify either the time at which reinforcement is to be made available, or the number of responses to be made, or both. Many rules are possible, but the following are the most basic and commonly used:<ref>Schacter et al. (2011). ''Psychology'', 2nd ed., pp. 280–284. Reference for entire section.</ref><ref name="ReferenceA"/>

* Fixed interval schedule: Reinforcement occurs following the first response after a fixed time has elapsed after the previous reinforcement. This schedule yields a "break-run" pattern of response; that is, after training on this schedule, the organism typically pauses after reinforcement, and then begins to respond rapidly as the time for the next reinforcement approaches.
* Variable interval schedule: Reinforcement occurs following the first response after a variable time has elapsed from the previous reinforcement. This schedule typically yields a relatively steady rate of response that varies with the average time between reinforcements.
* Fixed ratio schedule: Reinforcement occurs after a fixed number of responses have been emitted since the previous reinforcement. An organism trained on this schedule typically pauses for a while after a reinforcement and then responds at a high rate. If the response requirement is low there may be no pause; if the response requirement is high the organism may quit responding altogether.
* Variable ratio schedule: Reinforcement occurs after a variable number of responses have been emitted since the previous reinforcement. This schedule typically yields a very high, persistent rate of response.
* Continuous reinforcement: Reinforcement occurs after each response. Organisms typically respond as rapidly as they can, given the time taken to obtain and consume reinforcement, until they are satiated.
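
Because each schedule above is a rule for deciding whether a given response is reinforced, the rules can be sketched as small Python classes. This is a minimal illustration, not a standard implementation: the class names, the <code>respond(t)</code> method (where <code>t</code> is the time of the response), and the uniform distributions used for the variable schedules are all assumptions made for the example.

```python
import random


class FixedRatio:
    """Reinforce every n-th response since the previous reinforcement."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a varying number of responses that averages mean_n."""

    def __init__(self, mean_n, rng=None):
        self.mean_n = mean_n
        self.rng = rng if rng is not None else random.Random()
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: requirement is uniform on 1 .. 2*mean_n - 1 (mean = mean_n)
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made once `interval` has elapsed."""

    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0  # time of the previous reinforcement

    def respond(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False


class VariableInterval:
    """Reinforce the first response after a varying wait averaging mean_interval."""

    def __init__(self, mean_interval, rng=None):
        self.mean_interval = mean_interval
        self.rng = rng if rng is not None else random.Random()
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: wait is uniform on 0 .. 2*mean_interval
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self._draw()
            return True
        return False


# One response per second. Continuous reinforcement is the special case FixedRatio(1).
fr3 = FixedRatio(3)
print([fr3.respond(t) for t in range(1, 7)])      # [False, False, True, False, False, True]
fi5 = FixedInterval(5)
print([t for t in range(1, 13) if fi5.respond(t)])  # [5, 10]
```

Note that on the interval schedules, responses made before the interval has elapsed have no effect, whereas on the ratio schedules every response counts toward the next reinforcement.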

====Factors that alter the effectiveness of reinforcement and punishment====
Several factors influence how effective a reinforcer or punisher is:
# '''Satiation/Deprivation''': The effectiveness of a positive or "appetitive" stimulus will be reduced if the individual has received enough of that stimulus to satisfy their appetite. The opposite effect will occur if the individual becomes deprived of that stimulus: the effectiveness of the consequence will then increase. A subject with a full stomach would not be as motivated by food as a hungry one.<ref name = Miltenberger84>Miltenberger, R. G. "Behavior Modification: Principles and Procedures". [[Thomson/Wadsworth]], 2008. p. 84.</ref>
# '''Immediacy''': An immediate consequence is more effective than a delayed one. If a dog is given a treat within five seconds of sitting, it will learn faster than if the treat is given after thirty seconds.<ref>Miltenberger, R. G. "Behavior Modification: Principles and Procedures". [[Thomson/Wadsworth]], 2008. p. 86.</ref>
# '''Contingency''': To be most effective, reinforcement should occur consistently after responses and not at other times. Learning may be slower if reinforcement is intermittent, that is, following only some instances of the same response. Responses reinforced intermittently are usually slower to extinguish than are responses that have always been reinforced.<ref name = Miltenberger84/>
# '''Size''': The size, or amount, of a stimulus often affects its potency as a reinforcer. Humans and animals engage in cost-benefit analysis. If a lever press brings ten food pellets, lever pressing may be learned more rapidly than if a press brings only one pellet. A pile of quarters from a slot machine may keep a gambler pulling the lever longer than a single quarter.

Most of these factors serve biological functions. For example, the process of satiation helps the organism maintain a stable internal environment ([[homeostasis]]). When an organism has been deprived of sugar, the taste of sugar is an effective reinforcer. When the organism's [[blood sugar]] reaches or exceeds an optimum level, the taste of sugar becomes less effective or even aversive.

====Shaping====
{{main|Shaping (psychology)}}
Shaping is a conditioning method often used in animal training and in teaching nonverbal humans. It depends on operant variability and reinforcement, as described above. The trainer starts by identifying the desired final (or "target") behavior. Next, the trainer chooses a behavior that the animal or person already emits with some probability. The form of this behavior is then gradually changed across successive trials by reinforcing behaviors that approximate the target behavior more and more closely. When the target behavior is finally emitted, it may be strengthened and maintained by the use of a schedule of reinforcement.
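
The procedure described above can be illustrated with a toy simulation. This Python sketch is not a model from the literature: the function name <code>shape</code>, the numeric encoding of behavior, the Gaussian variability, and the rule that a reinforced response pulls the learner's typical behavior halfway toward it are all assumptions chosen for simplicity.

```python
import random


def shape(target, start, spread=1.0, step=0.5, max_trials=10_000,
          seed=0, shaping=True):
    """Return the trial on which the target behavior is first emitted, or None.

    Behavior is encoded as a number; `start` is what the learner typically
    does at first and `target` is the desired final behavior.
    """
    rng = random.Random(seed)
    typical = start                              # behavior the learner tends to emit
    criterion = start if shaping else target     # what the trainer currently reinforces
    for trial in range(1, max_trials + 1):
        response = rng.gauss(typical, spread)    # operant variability
        if response >= target:
            return trial                         # target behavior finally emitted
        if response >= criterion:
            # Reinforcement (assumed rule): typical behavior moves halfway
            # toward the response that was just reinforced.
            typical += 0.5 * (response - typical)
            # Require a slightly closer approximation next time.
            criterion = min(target, criterion + step)
    return None


print(shape(target=10, start=0) is not None)         # True: target reached by shaping
print(shape(target=10, start=0, shaping=False))      # None: target never emitted
```

With <code>shaping=False</code> only the final behavior would be reinforced, but in this toy setup the learner never emits it spontaneously, so there is nothing to reinforce. Reinforcing successive approximations moves the learner's behavior close enough for the target to appear.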

====Noncontingent reinforcement====
Noncontingent reinforcement is the delivery of reinforcing stimuli regardless of the organism's behavior. Noncontingent reinforcement may be used in an attempt to reduce an undesired target behavior by reinforcing multiple alternative responses while extinguishing the target response.<ref>{{cite journal|last1=Tucker|first1=M.|last2=Sigafoos|first2=J.|last3=Bushell|first3=H.|year=1998|title=Use of noncontingent reinforcement in the treatment of challenging behavior|journal=Behavior Modification|volume=22|issue=4|pages=529–547|doi=10.1177/01454455980224005|pmid=9755650|s2cid=21542125}}</ref> As no measured behavior is identified as being strengthened, there is controversy surrounding the use of the term noncontingent "reinforcement".<ref>{{cite journal|last1=Poling|first1=A.|last2=Normand|first2=M.|year=1999|title=Noncontingent reinforcement: an inappropriate description of time-based schedules that reduce behavior|journal=Journal of Applied Behavior Analysis|volume=32|issue=2|pages=237–238|doi=10.1901/jaba.1999.32-237|pmc=1284187}}</ref>