Instrumental conditioning is a form of behaviorist learning. It involves using behavioral consequences to affect the likelihood of an action happening again.
At its core, the instrumental conditioning approach holds that:
- Behaviors that are followed by positive consequences are more likely to occur again.
- Behaviors that are followed by negative consequences are less likely to occur again.
Instrumental conditioning is often referred to as operant conditioning.
Instrumental Conditioning Explanation and Overview
Instrumental conditioning is a concept in psychology that explains how people and animals develop learned responses through the repetition of positive reinforcement, negative reinforcement, and punishment.
The best-known psychologist to study instrumental conditioning is B. F. Skinner (1965).
Skinner conducted extensive studies on positive and negative consequences using an apparatus he invented called the Skinner Box.
A Skinner Box is a small cage that contains a lever, a food-pellet dispenser, a wire floor that can be electrified, and a light. When the light turned on (serving as a discriminative stimulus), the animal learned that pressing the lever would deliver food.
By manipulating how often and when a food pellet would be delivered based on the animal subject pressing the lever, Skinner was able to identify four main schedules of reinforcement: fixed ratio, variable ratio, fixed interval, and variable interval, explained below.
Schedules of Reinforcement in Instrumental Conditioning
There are four general ways in which instrumental conditioning occurs, known as schedules of reinforcement. These refer to different ways in which a person goes about providing rewards and punishments over time. Each schedule of reinforcement produces a different pattern of behavior (Ferster & Skinner, 1957).
The four schedules of reinforcement are presented below.
1. Fixed Ratio Schedule
The fixed ratio schedule of reinforcement delivers a reward based on a specific number of behaviors occurring.
As an example, an FR-10 schedule delivers a reward after 10 instances of the target behavior, regardless of the amount of time that has elapsed.
Generally speaking, fixed ratio schedules produce quick acquisition of the target behavior. The organism will quickly figure out which specific target behavior is being rewarded.
This schedule produces a strong rate of behavior. However, when reinforcement ceases, the target behavior also ceases quickly. This is called extinction.
Shortly after reinforcement is terminated, the organism may exhibit an extinction burst, which is a sudden increase in the target behavior.
Another notable pattern of behavior in this schedule is the post-reinforcement pause. After each reinforcement has been delivered, there is a slight pause in behavior.
2. Variable Ratio Schedule
The variable ratio schedule is similar to the fixed ratio schedule, except that the number of target behaviors required varies instead of being fixed.
For example, with a VR-10 schedule, the target behavior may be reinforced after 7 instances, then 11, then 8, then 15. The number changes after each reinforcement. Although the number changes, it averages 10, hence the schedule is denoted VR-10.
This schedule produces quick acquisition, a high rate of behavior, and no post-reinforcement pause.
Once reinforcement is terminated, extinction is slow to occur. It takes some time for the organism to discern that the target behavior is no longer rewarded.
3. Fixed Interval Schedule
The fixed interval schedule is focused on time. After a specific interval of time has elapsed, then the very next instance of the target behavior is reinforced.
The interval of time does not change and the number of target behaviors that occur during the interval is irrelevant.
Speed of behavioral acquisition and extinction depends on the length of the interval; the shorter the interval, the quicker the behavior will be acquired and extinguished.
The FI schedule produces a unique pattern of behavior called scalloping. This refers to the rate of behavior decreasing immediately after reinforcement (i.e., the post-reinforcement pause), and then increasing as the end of the next interval approaches.
4. Variable Interval Schedule
The variable interval schedule is similar to the FI schedule, except that the interval of time varies.
For instance, the first interval may be 7 minutes, the next 9, followed by 4, and then maybe 10.
This schedule produces a moderate but steady rate of behavior, slow acquisition, and slow extinction.
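The four schedules above amount to simple decision rules: the ratio schedules count responses, while the interval schedules watch the clock. The Python sketch below is purely illustrative; the class names and the uniform sampling used to vary the ratios and intervals are my own assumptions, not part of Ferster and Skinner's procedures.

```python
import random

class FixedRatio:
    """FR-n: reinforce every n-th response, regardless of elapsed time."""
    def __init__(self, n):
        self.n = n
        self.count = 0
    def respond(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True  # reinforcement delivered
        return False

class VariableRatio:
    """VR-n: reinforce after a varying number of responses averaging n."""
    def __init__(self, n):
        self.n = n
        self.count = 0
        self.target = self._draw()
    def _draw(self):
        # Uniform over 1..2n-1, so the required count averages n.
        return random.randint(1, 2 * self.n - 1)
    def respond(self, t):
        self.count += 1
        if self.count >= self.target:
            self.count = 0
            self.target = self._draw()
            return True
        return False

class FixedInterval:
    """FI-t: reinforce the first response after t time units elapse."""
    def __init__(self, interval):
        self.interval = interval
        self.next_time = interval
    def respond(self, t):
        if t >= self.next_time:
            self.next_time = t + self.interval
            return True
        return False

class VariableInterval:
    """VI-t: like FI, but each interval varies around a mean of t."""
    def __init__(self, interval):
        self.interval = interval
        self.next_time = self._draw(0)
    def _draw(self, t):
        return t + random.uniform(0.5, 1.5) * self.interval
    def respond(self, t):
        if t >= self.next_time:
            self.next_time = self._draw(t)
            return True
        return False

# Example: 30 responses, one per time unit, under an FR-10 schedule.
fr = FixedRatio(10)
print(sum(fr.respond(t) for t in range(1, 31)))  # → 3
```

Run for the same number of responses, a VR-10 schedule delivers roughly the same number of reinforcements on average, but at unpredictable points, which mirrors the slow-extinction, no-pause pattern described above.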
Instrumental Conditioning Examples
- Being Selected for a Job Interview: Being selected for a job interview can seem quite random if you are applying for a lot of jobs. Sometimes sending your resume in will result in getting an interview, and sometimes it won’t ─ Variable Ratio
- Sales Commissions: Most people in sales work on commission. Sometimes a commission is awarded for each and every sale or after a certain number of sales have been completed ─ Fixed Ratio
- CEO’s Yearly Bonus: Large corporations reward the CEO and other top executives with a yearly bonus based on company performance ─ Fixed Interval
- Pop Quizzes: When a teacher gives pop-quizzes, it means that some weeks there may be no quizzes, but on other weeks there may be two or even three. The amount of time between quizzes changes every week ─ Variable Interval
- In Coaching: To make sure the team doesn’t get spoiled, most coaches will not deliver praise each and every time a player makes a good play during a game. They prefer to reward good play some of the time, but not all of the time ─ Variable Ratio
- Studying: If we tracked studying across an academic year, we would see that studying increases as mid-terms and finals approach. After those exams, however, studying drops off as students take a post-reinforcement pause ─ Fixed Interval
- The Gig Economy: Working in the gig economy means continuously applying for various jobs, but only winning a contract for a few ─ Variable Ratio
- Health Inspections: Some cities like to spot-check restaurants for health-code violations. That means that no one knows when an inspection will take place; it could be once per quarter, or several times during peak seasonal periods ─ Variable Interval
- Rewarding Homework: To help students develop better homework habits, a teacher rewards each student that completes 3 homework assignments with extra time during recess. ─ Fixed Ratio
- The Bi-weekly Paycheck: Many workers who are paid an hourly wage receive a paycheck every two weeks. This consistent bi-weekly payment schedule is predictable and also leads to a predictable pattern of work ─ Fixed Interval
History and Origins of Instrumental Conditioning
One of the first psychologists to discuss how positive and negative consequences impact behavior was Edward Thorndike.
Thorndike stated that:
“Responses that produce a satisfying effect in a particular situation become more likely to occur again in that situation, and responses that produce a discomforting effect become less likely to occur again in that situation” (Gray, 2007, p. 106).
One interesting side note is that Thorndike did not coin the term Law of Effect until much later in his work (Catania, 1999).
As revealed by Catania, Thorndike’s early writings referred to the Law as “neural changes rather than changes in behavior” (p. 426).
Note Thorndike’s terminology in 1907:
“Connections between neurons are strengthened every time they are used with indifferent or pleasurable results and weakened every time they are used with resulting discomfort” (p. 166).
Regardless of when the Law of Effect was specifically stated, it originates from Thorndike’s research on how cats (and sometimes small dogs) escaped something he created called a “puzzle box.”
A puzzle box was designed in such a way that the only way to escape was by pressing on a panel or pulling on a loop, which then opened the door.
Thorndike would place the cat in the box and then record how long it took for it to escape. In the early trials, the cat would act chaotically. Eventually, it accidentally discovered how to escape.
However, as the number of trials increased, a trend emerged. The cat escaped increasingly rapidly.
The graph below shows the data for cat #12 in box A, based on the data presented in Thorndike’s 1898 publication (p. 15). As the graph shows, in the first trial, it took nearly 3 minutes for the cat to escape.
But after that, the cat started escaping much more quickly; by trial 13, it was escaping in less than 10 seconds. Thorndike conducted numerous studies similar to this one, and they all revealed the same general trend.
From experiments like this one, the famous Law of Effect was derived and proceeded to have a tremendous impact on psychology and our understanding of human behavior.
Applications of Instrumental Conditioning
1. Treating Aggressive Behavior: Variable Interval
As Van Camp et al. (2000) explained, research on reinforcing individuals with behavioral or learning disabilities usually implements a dense fixed time (FT) schedule.
A dense schedule which rewards behavior frequently has several benefits. It leads to quick acquisition of the target behavior, which in the case of replacing physically aggressive behavior, has a high priority.
However, variable time (VT) schedules are more realistic “because caregivers often are unable to implement FT schedules with a high degree of integrity in the natural environment” (p. 546).
Therefore, Van Camp et al. (2000) compared the effectiveness of FT and VT schedules in treating two individuals with moderate to severe intellectual disabilities. Both patients displayed physically aggressive and sometimes self-injurious behavior.
After training the staff in the implementation of both schedules and working with each patient, the results “indicated that VT schedules were as effective as FT schedules in reducing problem behavior” (p. 552).
The implications of the VT schedule being effective are significant:
“Caregivers who implement treatment in the natural environment have numerous demands on their time and, thus, are likely to implement VT schedules even when they were taught to use FT schedules” (p. 556).
2. Keeping Young Learners On-Task: Fixed Interval
Even young children with typical learner profiles often have trouble staying on-task. They are easily distracted and find it difficult to maintain concentration.
Riley et al. (2011) examined the effectiveness of a fixed time (FT) schedule of reinforcement applied to two students. Both children were identified by their classroom teacher as having difficulty staying focused.
In the first phase of the study, the children’s on-task and off-task behaviors were carefully observed and recorded. Next, the teacher was trained in how to administer an FT-5-minute schedule.
So, every 5 minutes, the teacher praised on-task behavior and redirected the student’s attention if they were engaged in off-task behavior.
After analyzing the data, the authors concluded:
“This study demonstrates that FT attention delivery can be an effective strategy used to increase the on-task behaviors and decrease the off-task behaviors of typically-developing students” (p. 159).
3. Work Habits of the U.S. Congress: Fixed Interval
Generally speaking, the U.S. Congress works on a fixed interval (FI) schedule, beginning in January and finishing at the end of the year.
If we consider their target behavior as the act of passing legislation, then we can examine their pattern of behavior throughout the year in the context of an FI schedule.
This is exactly what Critchfield et al. (2003) did. The researchers examined the rate of congressional bill production over a 52-year period, from 1949 to 2000.
The data revealed the post-reinforcement pause and scalloped pattern of behavior that is typical of FI schedules. The authors’ words describe the data succinctly:
“Across all years surveyed, few bills were enacted during the first several months of each session, and the cumulative total tended to accelerate positively as the end of the session approached. Across more than half a century, then, bills have been enacted in a distinct scalloped pattern in every session of each Congress” (p. 468).
Conclusion
Instrumental conditioning is a principle of learning in which the consequences of a behavior determine the likelihood of it occurring again.
Skinner’s extensive research identified four main schedules of reinforcement. Each schedule produces a different rate of behavior, a different rate of acquisition, and a different rate of extinction.
As it turns out, we can see examples of these schedules in many aspects of life. Workers on bi-weekly pay schedules and the productivity of Congress exhibit the same pattern of behavior which stems from a fixed interval schedule.
Trying to land a job interview or working as a freelancer in the gig economy are examples of a variable ratio schedule. Sometimes behavior is rewarded quickly, but other times it seems to take quite a bit of effort.
Health inspections and pop-quizzes can occur at unpredictable times because they follow a variable interval schedule. This keeps those involved fully alert and diligent.
People in sales are often paid a commission for each and every contract completed or after achieving a specific milestone, which puts them on a fixed ratio schedule and leads to strong and steady performance.
Instrumental conditioning explains a lot of human behavior. It can be used to help individuals with learning difficulties or improve typical students’ academic performance.
And it all started by studying how long it took a cat to escape a box.
References
Catania, A. C. (1999). Thorndike’s legacy: Learning, selection, and the law of effect. Journal of the Experimental Analysis of Behavior, 72(3), 425-428.
Critchfield, T. S., Haley, R., Sabo, B., Colbert, J., & Macropoulis, G. (2003). A half century of scalloping in the work habits of the United States Congress. Journal of Applied Behavior Analysis, 36(4), 465-486.
Dreikurs, R., & Stolz, V. (1991). Children: The challenge: The classic work on improving parent-child relations–intelligent, humane, and eminently practical. London: Penguin.
Dreikurs, R. (1987). Children: The challenge. New York: Dutton.
Dreikurs, R. C., & Grey, L. (1968). Logical consequences: A new approach to discipline. Los Angeles: Meredith Press.
Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. New York: Appleton-Century-Crofts.
Gray, P. (2007). Psychology (6th ed.). New York: Worth Publishers.
Madden, G. J. (2012). APA Handbook of Behavior Analysis (APA Handbooks in Psychology). New York: APA.
Maggin, D. M., Chafouleas, S. M., Goddard, K. M., & Johnson, A. H. (2011). A systematic evaluation of token economies as a classroom management tool for students with challenging behavior. Journal of School Psychology, 49(5), 529-554.
Nelsen, J. (1996). Positive discipline. New York: Ballantine Books.
Nelsen, J. (2011). Positive discipline: The classic guide to helping children develop self-discipline, responsibility, cooperation, and problem-solving skills. Ballantine Books.
Reitman, D., Boerke, K., & Vassilopoulos, A. (2021). Token Economies. Handbook of Applied Behavior Analysis, 374.
Riley, J. L., McKevitt, B. C., Shriver, M. D., & Allen, K. D. (2011). Increasing on-task behavior using teacher attention delivered on a fixed-time schedule. Journal of Behavioral Education, 20(3), 149-162.
Skinner, B. F. (1965). Science and human behavior. New York: Free Press.
Thorndike, E. L. (1898). Animal intelligence: An experimental study of the associative processes in animals. The Psychological Review: Monograph Supplements, 2(4), i.
Thorndike, E. L. (1905). The elements of psychology. New York: A. G. Seiler.
Thorndike, E. L. (1907). The elements of psychology (2nd ed.). New York: A. G. Seiler.
Van Camp, C. M., Lerman, D. C., Kelley, M. E., Contrucci, S. A., & Vorndran, C. M. (2000). Variable‐time reinforcement schedules in the treatment of socially maintained problem behavior. Journal of Applied Behavior Analysis, 33(4), 545-557.