Reinforcement



 
 
In operant conditioning, reinforcement occurs when an event following a response causes an increase in the probability of that response occurring in the future. Response strength can be assessed by measures such as the frequency with which the response is made (for example, a pigeon may peck a key more times in the session), or the speed with which it is made (for example, a rat may run a maze faster). The environment change contingent upon the response is called a reinforcer.









Types of reinforcement

B.F. Skinner, the researcher who articulated the major theoretical constructs of reinforcement and behaviorism, refused to specify causal origins of reinforcers. Skinner argued that reinforcers are defined by a change in response strength (that is, functionally rather than causally), and that what is a reinforcer to one person may not be to another. Accordingly, activities, foods or items which are generally considered pleasant or enjoyable are not necessarily reinforcing; they can be considered so only if the behavior that immediately precedes the potential reinforcer increases in similar future situations. If a child receives a cookie when he or she asks for one, and the frequency of 'cookie-requesting behavior' increases, the cookie can be seen as reinforcing 'cookie-requesting behavior'. If, however, cookie-requesting behavior does not increase, the cookie cannot be considered reinforcing. The sole criterion for determining whether an item, activity or food is reinforcing is the change in the probability of a behavior after the administration of a potential reinforcer. Other theories may focus on additional factors, such as whether the person expected the strategy to work at some point, but a behavioral theory of reinforcement focuses specifically on the probability of the behavior.
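The functional criterion described above can be illustrated with a minimal sketch. The function name `is_reinforcer`, the `min_increase` threshold, and the cookie-request counts are all hypothetical and chosen only for illustration; a real analysis would also need experimental controls to attribute the change to the consequence.

```python
def is_reinforcer(baseline_rate: float, post_rate: float,
                  min_increase: float = 0.0) -> bool:
    """Functional test: a consequence counts as a reinforcer only if the
    response rate rose after the consequence was introduced.

    `min_increase` is a hypothetical threshold, not part of the definition.
    """
    return (post_rate - baseline_rate) > min_increase


# Made-up numbers: cookie requests per day before and after requests
# started being followed by a cookie.
requests_before = 4 / 7
requests_after = 9 / 7

print(is_reinforcer(requests_before, requests_after))   # True
print(is_reinforcer(requests_before, requests_before))  # False
```

Note that the function never inspects what the consequence was, only what happened to the behavior, which mirrors the functional (rather than causal) definition.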

The study of reinforcement has produced an enormous body of reproducible experimental results. Reinforcement is the central concept and procedure in the experimental analysis of behavior and much of the quantitative analysis of behavior.
  • Positive reinforcement is an increase in the future frequency of a behavior due to the addition of a stimulus immediately following a response. Giving (or adding) food to a dog contingent on its sitting is an example of positive reinforcement (if this results in an increase in the future behavior of the dog sitting).
  • Negative reinforcement is an increase in the future frequency of a behavior when the consequence is the removal of an aversive stimulus. Turning off (or removing) an annoying song when a child asks their parent to do so is an example of negative reinforcement (if this results in an increase in the child's asking behavior in the future).
    • Avoidance conditioning is a form of negative reinforcement that occurs when a behavior prevents an aversive stimulus from starting or being applied.


Skinner argued that, although it may appear to be, punishment is not the opposite of reinforcement. Rather, it has other effects in addition to decreasing undesired behavior.

             decreases likelihood of behavior   increases likelihood of behavior
presented    positive punishment                positive reinforcement
taken away   negative punishment                negative reinforcement
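
The four combinations listed above can be expressed as a small lookup. This is only an illustrative sketch; the function name `classify_consequence` and its two boolean parameters are hypothetical.

```python
def classify_consequence(stimulus_presented: bool,
                         behavior_increases: bool) -> str:
    """Name the operant procedure from two observations.

    stimulus_presented: True if a stimulus was added after the response,
                        False if one was taken away.
    behavior_increases: True if the behavior became more likely afterwards.
    """
    polarity = "positive" if stimulus_presented else "negative"
    effect = "reinforcement" if behavior_increases else "punishment"
    return f"{polarity} {effect}"


print(classify_consequence(True, True))    # positive reinforcement
print(classify_consequence(False, True))   # negative reinforcement
print(classify_consequence(True, False))   # positive punishment
print(classify_consequence(False, False))  # negative punishment
```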


Distinguishing "positive" from "negative" can be difficult, and the necessity of the distinction is often debated. For example, in a very warm room, a current of external air can be described as positive reinforcement because it is pleasantly cool, or as negative reinforcement because it removes uncomfortably hot air. Some reinforcement can be simultaneously positive and negative, such as a drug addict taking drugs both for the added euphoria and to eliminate withdrawal symptoms. Many behavioral psychologists simply refer to reinforcement or punishment, without polarity, to cover all consequent environmental changes.

Primary reinforcers

A primary reinforcer, sometimes called an unconditioned reinforcer, is a stimulus that does not require pairing to function as a reinforcer and most likely has obtained this function through evolution and its role in the species' survival. Examples of primary reinforcers include sleep, food, air, water, and sex. Other primary reinforcers, such as certain drugs, may mimic the effects of other primary reinforcers. While these primary reinforcers are fairly stable through life and across individuals, the reinforcing value of different primary reinforcers varies due to multiple factors (e.g., genetics, experience). Thus, one person may prefer one type of food while another abhors it, or one person may eat lots of food while another eats very little. So even though food is a primary reinforcer for both individuals, the value of food as a reinforcer differs between them.

Primary reinforcers often shift their reinforcing value temporarily through satiation and deprivation. Food, for example, may cease to be effective as a reinforcer after a certain amount of it has been consumed (satiation). After a period during which the organism does not receive any of the primary reinforcer (deprivation), however, the primary reinforcer may regain its effectiveness in increasing response strength.

Secondary reinforcers

A secondary reinforcer, sometimes called a conditioned reinforcer, is a stimulus or situation that has acquired its function as a reinforcer after pairing with a stimulus which functions as a reinforcer. This stimulus may be a primary reinforcer or another conditioned reinforcer (such as money). An example of a secondary reinforcer would be the sound from a clicker, as used in clicker training. The sound of the clicker has been associated with praise or treats, and subsequently, the sound of the clicker may function as a reinforcer. As with primary reinforcers, an organism can experience satiation and deprivation with secondary reinforcers.

Other reinforcement terms

  • A generalized reinforcer is a conditioned reinforcer that has obtained the reinforcing function by pairing with many other reinforcers (such as money, a secondary generalized reinforcer).
  • In reinforcer sampling, a potentially reinforcing but unfamiliar stimulus is presented to an organism without regard to any prior behavior. The stimulus may then later be used more effectively in reinforcement.
  • Socially mediated reinforcement (direct reinforcement) involves the delivery of reinforcement which requires the behavior of another organism.
  • The Premack principle is a special case of reinforcement elaborated by David Premack, which states that a highly preferred activity can be used effectively as a reinforcer for a less preferred activity.
  • Reinforcement hierarchy is a list of actions, rank-ordering the most desirable to least desirable consequences that may serve as a reinforcer. A reinforcement hierarchy can be used to determine the relative frequency and desirability of different activities, and is often employed when applying the Premack principle.
  • Contingent outcomes are more likely to reinforce behavior than non-contingent outcomes. Contingent outcomes are those directly linked to a causal behavior, such as a light turning on being contingent on flipping a switch. Note that contingent outcomes are not necessary to demonstrate reinforcement, but perceived contingency may increase learning.
  • Contiguous stimuli are stimuli closely associated by time and space with specific behaviors. They reduce the amount of time needed to learn a behavior while increasing its resistance to extinction. Giving a dog a piece of food immediately after it sits is more contiguous with (and therefore more likely to reinforce) the behavior than a delay of several minutes in food delivery following the behavior.
  • Noncontingent reinforcement refers to response-independent delivery of stimuli identified as serving as reinforcers for some behaviors of that organism. However, this typically entails time-based delivery of stimuli identified as maintaining aberrant behavior, which serves to decrease the rate of the target behavior. As no measured behavior is identified as being strengthened, there is controversy surrounding the use of the term noncontingent "reinforcement".


Natural and artificial reinforcement

In his 1967 paper, Arbitrary and Natural Reinforcement, Charles Ferster proposed that reinforcement can be classified into events which increase the frequency of an operant as a natural consequence of the behavior itself, and those which are presumed to affect frequency by their requirement of human mediation, such as in a token economy where subjects are "rewarded" for certain behavior with an arbitrary token of a negotiable value. In 1970, Baer and Wolf coined the term behavior trap for the use of natural reinforcers. A behavior trap is one in which only a simple response is necessary to enter the trap, yet once entered, the trap cannot be resisted in creating general behavior change. The use of a behavior trap increases a person's repertoire by exposing that person to the naturally occurring reinforcement of that behavior. Behavior traps have four characteristics:
  • They are "baited" with virtually irresistible reinforcers that "lure" the student to the trap
  • Only a low-effort response already in the repertoire is necessary to enter the trap
  • Interrelated contingencies of reinforcement inside the trap motivate the person to acquire, extend, and maintain targeted academic/social skills
  • They can remain effective for a long time because the person shows few, if any, satiation effects.


As can be seen from the above, artificial reinforcement is created to build or develop skills. For a skill to generalize, it is important that a behavior trap is introduced to 'capture' the skill and use naturally occurring reinforcement to maintain or increase it. This behavior trap may simply be a social situation that will generally result from a specific behavior once it has met a certain criterion. For example, if edible reinforcers are used to train a person to say hello and smile at people when meeting them, then once that skill has been built up, the natural reinforcers of other people smiling and of friendlier interactions will maintain the skill, and the edibles can be faded.

Schedules of reinforcement

When an animal's surroundings are controlled, its behavior patterns after reinforcement become predictable, even for very complex behavior patterns. A schedule of reinforcement is the protocol for determining when responses or behaviors will be reinforced, ranging from continuous reinforcement, in which every response is reinforced, to extinction, in which no response is reinforced. Between these extremes is intermittent or partial reinforcement, where only some responses are reinforced.

Specific variations of intermittent reinforcement reliably induce specific patterns of response, irrespective of the species being investigated (including humans in some conditions). The orderliness and predictability of behavior under schedules of reinforcement was evidence for B. F. Skinner's claim that using operant conditioning he could obtain "control over behaviour", in a way that rendered the theoretical disputes of contemporary comparative psychology obsolete. The reliability of schedule control supported the idea that a radical behaviorist experimental analysis of behavior could be the foundation for a psychology that did not refer to mental or cognitive processes. The reliability of schedules also led to the development of Applied Behavior Analysis as a means of controlling or altering behavior.

Many of the simpler possibilities, and some of the more complex ones, were investigated at great length by Skinner using pigeons, but new schedules continue to be defined and investigated.

Simple schedules


Simple schedules have a single rule to determine when a single type of reinforcer is delivered for a specific response.
  • Fixed ratio (FR) schedules deliver reinforcement after every nth response
    • Example: FR2 = every second response is reinforced
    • Lab example: FR5 = rat reinforced with food after every 5 bar-presses in a Skinner box.
    • Real-world example: FR10 = Used car dealer gets a $1000 bonus for every 10 cars sold on the lot.
  • Continuous reinforcement (CRF) schedules are a special form of a fixed ratio schedule (FR1). In a continuous reinforcement schedule, reinforcement follows each and every response.
    • Lab example: each time a rat presses a bar it gets a pellet of food
    • Real world example: each time a dog defecates outside, its owner gives it a treat
  • Fixed interval (FI) schedules deliver reinforcement for the first response after a fixed length of time since the last reinforcement, while premature responses are not reinforced.
    • Example: FI1" = reinforcement provided for the first response after 1 second
    • Lab example: FI15" = rat is reinforced for the first bar press after 15 seconds passes since the last reinforcement
    • Real world example: FI24 hour = calling a radio station is reinforced with a chance to win a prize, but the person can only sign up once per day
  • Variable ratio (VR) schedules deliver reinforcement after a random number of responses (based upon a predetermined average)
    • Example: VR3 = on average, every third response is reinforced
    • Lab example: VR10 = on average, a rat is reinforced for each 10 bar presses
    • Real world example: VR37 = a roulette player betting on specific numbers will win on average once every 37 tries (on a U.S. roulette wheel, this would be VR38)
  • Variable interval (VI) schedules deliver reinforcement for the first response after a random length of time (varying around a predetermined average) has passed since the last reinforcement
    • Example: VI3" = reinforcement is provided for the first response after an average of 3 seconds since the last reinforcement.
    • Lab example: VI10" = a rat is reinforced for the first bar press after an average of 10 seconds passes since the last reinforcement
    • Real world example: a predator can expect to come across a prey on a variable interval schedule
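
The four basic schedules above can be sketched as small state machines that are told about each response and answer whether it is reinforced. This is a minimal illustration, not a standard implementation: the class names, the `respond(t)` interface, the session clock starting at 0, and the uniform distributions used for the variable schedules are all assumptions made here.

```python
import random


class FixedRatio:
    """FR-n: every n-th response is reinforced."""
    def __init__(self, n):
        self.n, self.count = n, 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR-n: reinforce after a random number of responses averaging n."""
    def __init__(self, mean, rng=None):
        self.mean, self.rng = mean, rng or random.Random()
        self.count, self.required = 0, self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2*mean-1, whose average is `mean`.
        return self.rng.randint(1, 2 * self.mean - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count, self.required = 0, self._draw()
            return True
        return False


class FixedInterval:
    """FI-s: reinforce the first response at least s seconds after the
    last reinforcement; earlier responses are not reinforced."""
    def __init__(self, seconds):
        self.seconds, self.last = seconds, 0.0

    def respond(self, t):
        if t - self.last >= self.seconds:
            self.last = t
            return True
        return False


class VariableInterval:
    """VI-s: like FI, but the required wait varies around an average of s."""
    def __init__(self, mean_seconds, rng=None):
        self.mean, self.rng = mean_seconds, rng or random.Random()
        self.last, self.wait = 0.0, self._draw()

    def _draw(self):
        # Assumption: uniform on 0 .. 2*mean, whose average is `mean`.
        return self.rng.uniform(0, 2 * self.mean)

    def respond(self, t):
        if t - self.last >= self.wait:
            self.last, self.wait = t, self._draw()
            return True
        return False


# FR5: ten bar presses, one per second.
fr5 = FixedRatio(5)
print([fr5.respond(t) for t in range(1, 11)].count(True))  # 2

# FI15": presses at these times (seconds since session start).
fi15 = FixedInterval(15)
print([fi15.respond(t) for t in (5, 10, 16, 20, 31, 32)])
# [False, False, True, False, True, False]
```

Ratio schedules only count responses and ignore `t`, while interval schedules only consult the clock at the moment of a response, which is the distinction the list above draws.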


Other simple schedules include:
  • Differential reinforcement of incompatible behavior (DRI) is used to reduce a frequent behavior without punishing it by reinforcing an incompatible response. An example would be reinforcing clapping to reduce nose picking.
  • Differential reinforcement of other behavior (DRO) is used to reduce a frequent behavior by reinforcing any behavior other than the undesired one. An example would be reinforcing any hand action other than nose picking.
  • Differential reinforcement of low response rate (DRL) is used to encourage low rates of responding. It is like an interval schedule, except that premature responses reset the time required between responses.
    • Lab example: DRL10" = a rat is reinforced for the first response after 10 seconds, but if the rat responds earlier than 10 seconds there is no reinforcement and the rat has to wait 10 seconds from that premature response without another response before bar pressing will lead to reinforcement.
    • Real world example: "If you ask me for a potato chip no more than once every 10 minutes, I will give it to you. If you ask more often, I will give you none."
  • Differential reinforcement of high rate (DRH) is used to encourage high rates of responding. It is like an interval schedule, except that a minimum number of responses is required in the interval in order to receive reinforcement.
    • Lab example: DRH10"/15 responses = a rat must press a bar 15 times within a 10 second increment in order to be reinforced
    • Real world example: "If Lance Armstrong is going to win the Tour de France he has to pedal x number of times during the y hour race."
  • Fixed Time (FT) provides reinforcement at a fixed time since the last reinforcement, irrespective of whether the subject has responded or not. In other words, it is a non-contingent schedule.
    • Lab example: FT5" = rat gets food every 5" regardless of the behavior.
    • Real world example: a person gets an annuity check every month regardless of behavior between checks
  • Variable Time (VT) provides reinforcement at an average variable time since last reinforcement, regardless of whether the subject has responded or not.
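
The rate-based and time-based schedules in this list can be sketched in the same style. Everything here is illustrative: the names `DRL`, `DRH` and `fixed_time_deliveries` are hypothetical, the session clock is assumed to start at 0, and DRH is modeled with a sliding time window, which is one possible reading of "within a 10 second increment".

```python
from collections import deque


class DRL:
    """DRL-s: a response is reinforced only if at least s seconds have
    passed since the previous response; a premature response restarts
    the wait."""
    def __init__(self, seconds):
        self.seconds, self.last_response = seconds, 0.0

    def respond(self, t):
        reinforced = t - self.last_response >= self.seconds
        self.last_response = t  # every response restarts the timer
        return reinforced


class DRH:
    """DRH-s/n: reinforce once n responses fall within the last s seconds
    (sliding-window assumption)."""
    def __init__(self, seconds, count):
        self.seconds, self.count = seconds, count
        self.recent = deque()

    def respond(self, t):
        self.recent.append(t)
        # Drop responses that are older than the window.
        while t - self.recent[0] > self.seconds:
            self.recent.popleft()
        if len(self.recent) >= self.count:
            self.recent.clear()
            return True
        return False


def fixed_time_deliveries(period, session_length):
    """FT: delivery times depend only on the clock, never on responding."""
    return [k * period for k in range(1, int(session_length // period) + 1)]


drl10 = DRL(10)
print([drl10.respond(t) for t in (4, 12, 22, 33)])
# [False, False, True, True]

drh = DRH(seconds=10, count=3)
print([drh.respond(t) for t in (0, 4, 8, 20, 40, 60)])
# [False, False, True, False, False, False]

print(fixed_time_deliveries(5, 22))  # [5, 10, 15, 20]
```

The FT function takes no responses as input at all, which is what makes the schedule non-contingent.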


Effects of different types of simple schedules
  • Ratio schedules produce higher rates of responding than interval schedules, when the rates of reinforcement are otherwise similar.
  • Variable schedules produce higher rates and greater resistance to extinction than most fixed schedules. This is also known as the Partial Reinforcement Extinction Effect (PREE)
  • The variable ratio schedule produces both the highest rate of responding and the greatest resistance to extinction (an example would be the behavior of gamblers at slot machines)
  • Fixed schedules produce 'post-reinforcement pauses' (PRP), where responses will briefly cease immediately following reinforcement, though the pause is a function of the upcoming response requirement rather than the prior reinforcement.
    • The PRP of a fixed interval schedule is frequently followed by an accelerating rate of response which is "scallop shaped," while those of fixed ratio schedules are more angular.
  • Organisms whose schedules of reinforcement are 'thinned' (that is, requiring more responses or a greater wait before reinforcement) may experience 'ratio strain' if thinned too quickly. This produces behavior similar to that seen during extinction.
  • Partial reinforcement schedules are more resistant to extinction than continuous reinforcement schedules.
    • Ratio schedules are more resistant than interval schedules and variable schedules more resistant than fixed ones.


Compound schedules

Compound schedules combine two or more different simple schedules in some way using the same reinforcer for the same behaviour. There are many possibilities; among those most often used are:
  • Alternative schedules - A type of compound schedule where two or more simple schedules are in effect and whichever simple schedule is completed first results in reinforcement.
  • Conjunctive schedules - A complex schedule of reinforcement where two or more simple schedules are in effect independently of each other and requirements on all of the simple schedules must be met for reinforcement.
  • Multiple schedules - either of two, or more, schedules may occur with a stimulus indicating which is in force.
    • Example: FR4 when given a whistle and FI 6 when given a bell ring.
  • Mixed schedules - either of two, or more, schedules may occur with no stimulus indicating which is in force.
    • Example: FI6 and then VR 3 without any stimulus warning of the change in schedule.
  • Concurrent schedules - two schedules are simultaneously in force though not necessarily on two different response devices, and reinforcement on those schedules is independent of each other.
  • Interlocking schedules - A single schedule with two components where progress in one component affects progress in the other component. In an interlocking FR60-FI120 schedule, for example, each response subtracts time from the interval component such that each response is "equal" to removing two seconds from the FI.
  • Chained schedules - reinforcement occurs after two or more successive schedules have been completed, with a stimulus indicating when one schedule has been completed and the next has started.
    • Example: FR10 under a green light; when completed, the light changes to yellow to indicate FR 3; when that is completed, the light changes to red to indicate VI 6, etc. At the end of the chain, a reinforcer is given.
  • Tandem schedules - reinforcement occurs when two or more successive schedule requirements have been completed, with no stimulus indicating when a schedule has been completed and the next has started.
    • Example: VR 10, after it is completed the schedule is changed without warning to FR 10, after that it is changed without warning to FR 16, etc. At the end of the series of schedules, a reinforcer is finally given.
  • Higher order schedules - completion of one schedule is reinforced according to a second schedule; e.g. in FR2 (FI 10 secs), two successive fixed interval schedules would have to be completed before a response is reinforced.
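
Three of the combination rules above (alternative, conjunctive, and chained/tandem) can be sketched by composing simple components. This is a rough illustration under stated assumptions: the `met`/`reset` interface, the class names, and the choice to restart every component after reinforcement are hypothetical design decisions, not taken from the literature.

```python
class FR:
    """Fixed-ratio component: satisfied once n responses have been counted."""
    def __init__(self, n):
        self.n, self.count = n, 0

    def reset(self, t):
        self.count = 0

    def met(self, t):
        self.count += 1
        return self.count >= self.n


class FI:
    """Fixed-interval component: satisfied by a response made at least
    s seconds after the component was last reset."""
    def __init__(self, s):
        self.s, self.start = s, 0.0

    def reset(self, t):
        self.start = t

    def met(self, t):
        return t - self.start >= self.s


class Compound:
    """Alternative (rule=any) or conjunctive (rule=all) schedule."""
    def __init__(self, rule, *parts):
        self.rule, self.parts = rule, parts

    def respond(self, t):
        # Every component sees every response.
        results = [p.met(t) for p in self.parts]
        if self.rule(results):
            for p in self.parts:
                p.reset(t)
            return True
        return False


class Sequential:
    """Chained schedule if `signals` is given, tandem schedule if not."""
    def __init__(self, parts, signals=None):
        self.parts, self.signals, self.index = list(parts), signals, 0

    @property
    def signal(self):
        return self.signals[self.index] if self.signals else None

    def respond(self, t):
        if self.parts[self.index].met(t):
            self.index += 1
            if self.index == len(self.parts):
                self.index = 0
                self.parts[0].reset(t)
                return True  # reinforcer only at the end of the sequence
            self.parts[self.index].reset(t)
        return False


alt = Compound(any, FR(3), FI(10))
print([alt.respond(t) for t in (1, 2, 3, 4, 14)])
# [False, False, True, False, True]

conj = Compound(all, FR(3), FI(10))
print([conj.respond(t) for t in (1, 2, 3, 12)])
# [False, False, False, True]

chain = Sequential([FR(2), FR(3)], signals=["green", "yellow"])
print([chain.respond(t) for t in range(1, 6)])
# [False, False, False, False, True]
```

The only difference between the chained and tandem variants in this sketch is whether `signal` exposes the current component to the subject; the reinforcement rule itself is identical.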


Superimposed schedules

Superimposed schedules of reinforcement is a term in psychology which refers to a structure of rewards where two or more simple schedules of reinforcement operate simultaneously. The reinforcers can be positive and/or negative. An example would be a person who comes home after a long day at work. The behavior of opening the front door is rewarded by a big kiss on the lips from the person's spouse and a rip in the pants from the family dog jumping enthusiastically. Another example of superimposed schedules of reinforcement would be a pigeon in an experimental cage pecking at a button. The pecks result in a hopper of grain being delivered every twentieth peck and access to water becoming available after every two hundred pecks.
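
The pigeon example can be sketched as two fixed-ratio schedules applied to the same stream of pecks. The function name `superimposed_pecks` and its parameters are hypothetical; only the 20-peck and 200-peck requirements come from the text.

```python
def superimposed_pecks(n_pecks, grain_every=20, water_every=200):
    """Return (peck_number, consequences) for every peck that produces
    at least one consequence under two superimposed FR schedules."""
    events = []
    for peck in range(1, n_pecks + 1):
        consequences = []
        if peck % grain_every == 0:
            consequences.append("grain")
        if peck % water_every == 0:
            consequences.append("water")
        if consequences:
            events.append((peck, consequences))
    return events


events = superimposed_pecks(200)
print(len(events))   # 10
print(events[0])     # (20, ['grain'])
print(events[-1])    # (200, ['grain', 'water'])
```

The 200th peck yields both consequences at once, which illustrates a single response leading to multiple outcomes.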

Superimposed schedules of reinforcement are a type of compound schedule that evolved from the initial work on simple schedules of reinforcement by B. F. Skinner
B. F. Skinner

Burrhus Frederic Skinner was an influential American psychologist, author, inventor, advocate for social reform,and poet. He was the Edgar Pierce Professor of Psychology at Harvard University from 1958 until his retirement in 1974....
 and his colleagues (Skinner and Ferster, 1957). They demonstrated that reinforcers could be delivered on schedules, and further that organisms behaved differently under different schedules. Rather than a reinforcer, such as food or water, being delivered every time as a consequence of some behavior, a reinforcer could be delivered after more than one instance of the behavior. For example, a pigeon may be required to peck a button switch ten times before food is made available to the pigeon. This is called a "ratio schedule." Also, a reinforcer could be delivered after an interval of time passed following a target behavior. An example is a rat
Rat

Rats are various medium sized, long-tailed rodents of the Family Muroidea. "True rats" are members of the genus Rattus, the most important of which to humans are the black rat, Rattus rattus, and the brown rat, Rattus norvegicus....
 that is given a food pellet two minutes after the rat pressed a lever. This is called an "interval schedule." In addition, ratio schedules can deliver reinforcement following fixed or variable number of behaviors by the individual organism. Likewise, interval schedules can deliver reinforcement following fixed or variable intervals of time following a single response by the organism. Individual behaviors tend to generate response rates that differ based upon how the reinforcement schedule is created. Much subsequent research in many labs examined the effects on behaviors of scheduling reinforcers. If an organism is offered the opportunity to choose between or among two or more simple schedules of reinforcement at the same time, the reinforcement structure is called a "concurrent schedule of reinforcement." Brechner (1974, 1977) introduced the concept of "superimposed schedules of reinforcement in an attempt to create a laboratory analogy of social trap
s, such as when humans overharvest their fisheries or tear down their rainforests. Brechner created a situation where simple reinforcement schedules were superimposed upon each other. In other words, a single response or group of responses by an organism led to multiple consequences. Concurrent schedules of reinforcement can be thought of as "or" schedules, and superimposed schedules of reinforcement can be thought of as "and" schedules. Brechner and Linder (1981) and Brechner (1987) expanded the concept to describe how superimposed schedules and the social trap
 analogy could be used to analyze the way energy
 flows through system
s.
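
The schedule types described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names (FixedRatio, FixedInterval, concurrent, superimposed) are hypothetical, and the interval class follows the common laboratory convention of reinforcing the first response made once the interval has elapsed, which is slightly stricter than the wording of the rat example.

```python
from dataclasses import dataclass


@dataclass
class FixedRatio:
    """Reinforce every n-th response (n=10 matches the pigeon example)."""
    n: int
    count: int = 0

    def respond(self, t: float) -> bool:
        # Time t is ignored; only the number of responses matters.
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


@dataclass
class FixedInterval:
    """Reinforce the first response at least `seconds` after the last reinforcer.

    Assumption: the clock starts at t = 0.
    """
    seconds: float
    last_reinforced: float = 0.0

    def respond(self, t: float) -> bool:
        if t - self.last_reinforced >= self.seconds:
            self.last_reinforced = t
            return True
        return False


def concurrent(schedule_a, schedule_b, choice: str, t: float) -> bool:
    """An 'or' arrangement: each response goes to one schedule or the other."""
    chosen = schedule_a if choice == "a" else schedule_b
    return chosen.respond(t)


def superimposed(schedules, t: float) -> list:
    """An 'and' arrangement: one response counts toward every schedule."""
    return [s.respond(t) for s in schedules]
```

Variable-ratio and variable-interval versions could be sketched the same way by drawing a new requirement at random after each reinforcer.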

Superimposed schedules of reinforcement have many real-world applications in addition to generating social trap
s. Many different individual and social situations can be created by superimposing simple reinforcement schedules. For example, a human being could have simultaneous tobacco and alcohol addictions. Even more complex situations can be created or simulated by superimposing two or more concurrent schedules. For example, a high school senior could have a choice between going to Stanford University or UCLA, and at the same time have the choice of going into the Army or the Air Force, and simultaneously the choice of taking a job with an internet company or a job with a software company. That would be a reinforcement structure of three superimposed concurrent schedules of reinforcement. Superimposed schedules of reinforcement can be used to create the three classic conflict situations (approach-approach conflict, approach-avoidance conflict
, and avoidance-avoidance conflict) described by Kurt Lewin
 (1935), and can be used to operationalize other Lewinian situations analyzed by his force field analysis
. Another example of the use of superimposed schedules of reinforcement as an analytical tool is its application to the contingencies of rent control (Brechner, 2003).

Concurrent schedules

In operant conditioning
, concurrent schedules of reinforcement are schedules of reinforcement that are simultaneously available to an animal subject or human participant, so that the subject or participant can respond on either schedule. For example, a pigeon in a Skinner box
 might be faced with two pecking keys; pecking responses can be made on either, and food reinforcement might follow a peck on either. The schedules of reinforcement arranged for pecks on the two keys can be different. They may be independent, or they may have some links between them so that behaviour on one key affects the likelihood of reinforcement on the other.

It is not necessary for the responses on the two schedules to be physically distinct: in an alternative way of arranging concurrent schedules, introduced by Findley in 1958, both schedules are arranged on a single key or other response device, and the subject or participant can respond on a second key in order to change over between the schedules. In such a "Findley concurrent" procedure, a stimulus (e.g. the colour of the main key) is used to signal which schedule is currently in effect.

Concurrent schedules often induce rapid alternation between the keys. To prevent this, a "changeover delay" is commonly introduced: each schedule is inactivated for a brief period after the subject switches to it.
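
A changeover delay can be sketched as a small gate that sits in front of the schedules. The class name, the key labels, and the choice to treat the very first response of a session as not being a changeover are all assumptions made for illustration.

```python
class ChangeoverDelay:
    """Block reinforcement for `delay` seconds after a switch between keys."""

    def __init__(self, delay: float):
        self.delay = delay
        self.current_key = None   # key the subject is currently responding on
        self.switched_at = None   # time of the most recent changeover

    def eligible(self, key: str, t: float) -> bool:
        """Return True if a response on `key` at time `t` may be reinforced."""
        if self.current_key is None:
            # Assumption: the first response of a session is not a changeover.
            self.current_key = key
            return True
        if key != self.current_key:
            # The subject has just switched keys; restart the delay.
            self.current_key = key
            self.switched_at = t
        if self.switched_at is None:
            return True
        return t - self.switched_at >= self.delay
```

A response would then be reinforced only if `eligible` returns True and the schedule on that key has a reinforcer available.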

When both the concurrent schedules are variable intervals, a quantitative relationship known as the matching law
 is found between relative response rates in the two schedules and the relative reinforcement rates they deliver; this was first observed by R. J. Herrnstein
 in 1961.
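
In its simplest (strict) form, the matching law states that B1 / (B1 + B2) = R1 / (R1 + R2), where B1 and B2 are the response rates on the two schedules and R1 and R2 are the reinforcement rates they deliver. A minimal sketch of the prediction follows; the function name and the example rates are illustrative assumptions, not data from any study.

```python
def matching_prediction(r1: float, r2: float) -> float:
    """Predicted share of responses on schedule 1 under strict matching.

    r1 and r2 are the reinforcement rates obtained on the two schedules,
    in any common unit (e.g. reinforcers per hour).
    """
    total = r1 + r2
    if total <= 0:
        raise ValueError("at least one schedule must deliver reinforcement")
    return r1 / total


# Hypothetical rates: 40 and 20 reinforcers per hour.
share_on_key_1 = matching_prediction(40, 20)
```

With these hypothetical rates, strict matching predicts that about two thirds of responses are made on the first key.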

Shaping

Shaping involves reinforcing successive, increasingly accurate approximations of a response desired by a trainer. In training a rat to press a lever, for example, simply turning toward the lever will be reinforced at first. Then, only turning and stepping toward it will be reinforced. As training progresses, the response reinforced becomes progressively more like the desired behavior.

Chaining

Chaining involves linking discrete behaviors together in a series, such that the result of each behavior is both the reinforcer (or consequence) for the previous behavior and the stimulus (or antecedent) for the next behavior. There are several ways to teach a chain, including forward chaining (starting from the first behavior in the chain), backward chaining (starting from the last behavior) and total task chaining (in which the entire behavior is taught from beginning to end, rather than as a series of steps).

An example would be opening a locked door. First the key is inserted, then turned, then the door is opened. Forward chaining would teach the subject first to insert the key. Once that task is mastered, they are told to insert the key and taught to turn it. Once that task is mastered, they are told to perform the first two steps and are then taught to open the door.

Backward chaining would involve the teacher first inserting and turning the key, with the subject taught to open the door. Once that is learned, the teacher inserts the key, and the subject is taught to turn it and then opens the door as the next step. Finally, the subject is taught to insert the key, after which they turn it and open the door. Once this first step is mastered, the entire task has been taught.

Total task chaining would involve teaching the entire task as a single series, prompting through all steps. Prompts are faded (reduced) at each step as it is mastered.
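
The order of teaching in forward and backward chaining can be laid out explicitly for the locked-door example. The sketch below only enumerates the training stages; the function names and dictionary keys are hypothetical and chosen for readability.

```python
def forward_chaining(steps):
    """Stage i: the learner performs the mastered steps, then is taught step i."""
    return [
        {"learner_does": steps[:i], "taught": steps[i]}
        for i in range(len(steps))
    ]


def backward_chaining(steps):
    """Start from the last step; the teacher performs whatever comes earlier."""
    stages = []
    for i in range(len(steps) - 1, -1, -1):
        stages.append({
            "teacher_does": steps[:i],            # not yet taught
            "taught": steps[i],                   # focus of this stage
            "learner_then_does": steps[i + 1:],   # already mastered
        })
    return stages


door = ["insert key", "turn key", "open door"]
forward_stages = forward_chaining(door)
backward_stages = backward_chaining(door)
```

In the first backward stage the teacher inserts and turns the key while the learner is taught to open the door; in the last stage the learner is taught to insert the key and then completes the remaining steps unaided.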

Criticisms

The standard definition of behavioral reinforcement has been criticized as circular
, since it appears to argue that response strength is increased by reinforcement while defining reinforcement as something which increases response strength; that is, the standard definition says only that response strength is increased by things which increase response strength. However, in correct usage something is called a reinforcer because of its effect on behavior, and not the other way around. The usage becomes circular only if one says that a particular stimulus strengthens behavior because it is a reinforcer; the term should not be used to explain why a stimulus has that effect on behavior. Other definitions have been proposed, such as F. D. Sheffield's "consummatory behavior contingent on a response," but these are not broadly used in psychology.

History of the terms

In the 1920s Russian physiologist Ivan Pavlov
 may have been the first to use the word reinforcement with respect to behavior, but (according to Dinsmoor) he used its approximate Russian cognate sparingly, and even then it referred to strengthening an already-learned but weakening response. He did not use it, as it is today, for selecting and strengthening new behavior. Pavlov's introduction of the word extinction (in Russian) approximates today's psychological use.

In popular use, positive reinforcement is often used as a synonym for reward
, with people (not behavior) thus being "reinforced," but this is contrary to the term's consistent technical usage, as it is a dimension of behavior, and not the person, which is strengthened. Negative reinforcement is often used by laypeople and even social scientists outside psychology as a synonym for punishment
. This is contrary to modern technical use, but it was B. F. Skinner
 who first used it this way in his 1938 book. By 1953, however, he followed others in thus employing the word punishment, and he re-cast negative reinforcement for the removal of aversive stimuli.

There are some within the field of behavior analysis who have suggested that the terms "positive" and "negative" constitute an unnecessary distinction in discussing reinforcement as it is often unclear whether stimuli are being removed or presented. For example, Iwata poses the question: “…is a change in temperature more accurately characterized by the presentation of cold (heat) or the removal of heat (cold)?” (p. 363). Thus, it may be best to conceptualize reinforcement simply as a pre-change condition being replaced by a post-change condition which reinforces the behavior which was followed by the change in stimulus conditions.

See also

  • Dog training
  • Overjustification effect
  • Reinforcement learning
  • Reward system
  • Society for Quantitative Analysis of Behavior


External links