Partial reinforcement is a schedule in which the desired target behavior is rewarded only some of the time.
A partial or “intermittent” schedule tends to be more resistant to extinction than a continuous reinforcement schedule.
For example, if a child expects a gift every time they get an A on their homework, they may lose excitement for gifts because the stimulus response becomes an expectation rather than a treat.
But if they get the gift only some of the times they get an A, the excitement is more likely to persist, meaning the stimulus response (excitement) is maintained.
Partial Reinforcement Definition
Partial reinforcement is part of B. F. Skinner’s (1965) principles of operant conditioning and is used to shape or modify behavior. It refers to reinforcement that follows a desired behavior only intermittently (as opposed to continuous reinforcement, which follows every instance).
The partial reinforcement schedule is often used after a behavior has already been acquired in order to maintain the stimulus response (such as with the gift example above). Partial reinforcement strengthens the desired behavior and makes it less susceptible to extinction.
Skinner’s operant conditioning principles were based on Edward Thorndike’s Law of Effect (1898; 1905). Gray (2007) provides an excellent definition of the Law of Effect:
“Responses that produce a satisfying effect in a particular situation become more likely to occur again in that situation, and responses that produce a discomforting effect become less likely to occur again in that situation” (p. 106).
Partial Reinforcement Examples
- Sometimes Giving Gold Stars for Correct Behavior: A teacher wants to reward their students for good behavior, but doesn’t want them to expect a reward for acting the way they should. So, gold stars are given on a partial schedule.
- Sales Commissions: A company gives a commission at the end of each month or quarter based on the salesperson’s volume of sales for that period of time.
- Pop Quizzes: When a teacher gives random pop quizzes, studying is rewarded on a partial schedule: students who keep up with the material are rewarded with a good grade only on the days a quiz happens to be given.
- Attracting Online Clicks: Each member of the marketing team is given a small bonus for every 10,000 clicks on the company’s banner ad.
- Free Coffee: A coffee shop gives each customer a card to track their purchases. After 10 coffees, they get one free.
- Bi-weekly Paycheck: A lot of fast-food workers get paid every two weeks. So, their behavior is not rewarded continuously, but only after a certain period of time has elapsed.
- Checking for “Likes” on Facebook: Sometimes checking one’s Facebook post for “likes” is rewarded, but sometimes it’s not.
- Checking Conditioning of Athletes: A coach has the players come in during the 4-month off-season at random times to check their conditioning.
- Rewarding Homework: Some parents reward their children for doing homework at random times during the week to keep them on their toes and reduce the likelihood that they become desensitized to the reward.
- Asking Questions in Class: A professor has a habit of directing questions to randomly selected students during class. This means that sometimes they are rewarded for being prepared, and, sometimes not.
- Selling Girl Scout Cookies: Sometimes several houses in a row will purchase a box or two of cookies, and sometimes it seems like a whole block is not interested.
Types of Partial Reinforcement
1. Fixed-ratio (FR) reinforcement
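In a fixed-ratio reinforcement schedule, a response is reinforced after a set number of instances of the behavior have occurred. For example, with a fixed ratio of 4, every 4th time the behavior occurs, a reward will be provided. This happens commonly at coffee shops, where loyal customers get a free coffee after every 10 purchases.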
As shown in the above example, fixed-ratio reinforcements reward consistency, reliability, and loyalty.
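The fixed-ratio rule can be sketched in a few lines of Python. This is an illustration only: the function name `fixed_ratio_rewards` and its parameters are hypothetical and are not taken from any of the cited studies.

```python
def fixed_ratio_rewards(num_responses, ratio):
    """Return the 1-based response numbers that earn a reward on an FR schedule."""
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    # Every `ratio`-th response is reinforced: ratio, 2 * ratio, 3 * ratio, ...
    return [n for n in range(1, num_responses + 1) if n % ratio == 0]


# FR-4 over 12 responses: the 4th, 8th, and 12th responses are rewarded.
print(fixed_ratio_rewards(12, 4))  # [4, 8, 12]
```

Setting `ratio` to 1 rewards every response, which corresponds to continuous reinforcement.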
2. Variable-ratio (VR) reinforcement
In a variable-ratio reinforcement schedule, a response is reinforced after an unpredictable rather than set number of instances of the behavior have occurred.
This type of reinforcement is commonly used in slot machines, where the player wins after an uncertain number of attempts. The uncertainty keeps the person pressing the button hoping that “next time, I’ll win!”
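A variable-ratio schedule can be sketched in a similar way. The sketch below assumes that the number of responses required for each reward is drawn uniformly from 1 to `2 * mean_ratio - 1`, so that it averages `mean_ratio`. That distribution is an assumption made for illustration, and the function name is hypothetical.

```python
import random


def variable_ratio_rewards(num_responses, mean_ratio, seed=None):
    """Return the 1-based response numbers rewarded on a VR schedule."""
    if mean_ratio < 1:
        raise ValueError("mean_ratio must be at least 1")
    rng = random.Random(seed)
    rewarded = []
    # Assumption: each requirement is uniform on 1..(2 * mean_ratio - 1),
    # which has a mean of exactly mean_ratio.
    next_reward = rng.randint(1, 2 * mean_ratio - 1)
    for n in range(1, num_responses + 1):
        if n == next_reward:
            rewarded.append(n)
            next_reward = n + rng.randint(1, 2 * mean_ratio - 1)
    return rewarded


# VR-5 over 30 responses: the rewarded responses are irregularly spaced.
print(variable_ratio_rewards(30, 5, seed=42))
```

From the learner's side, the gaps between rewards look irregular even though they average out to the programmed ratio over many responses.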
3. Fixed-interval (FI) reinforcement
Interval reinforcements occur after an amount of time has passed rather than after a number of instances of the behavior. For example, they would happen after X number of days.
In a fixed-interval reinforcement schedule, the response is reinforced with a reward after a set amount of time has passed since the last reinforcement. A good example of this is a yearly pay rise at work. Employees know that it is coming every January 1st (if they meet their KPIs!).
This reinforcement is more predictable than variable-interval, explained below.
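As a rough sketch, a fixed-interval schedule can be modeled as a rule that rewards the first response made once the set amount of time has passed since the last reward. The function name is hypothetical, and starting the clock at time 0 is an assumption made for illustration.

```python
def fixed_interval_rewards(response_times, interval):
    """Return the response times that are rewarded on an FI schedule."""
    rewarded = []
    last_reward = 0.0  # Assumption: the schedule clock starts at time 0.
    for t in sorted(response_times):
        # A response pays off only if the full interval has elapsed
        # since the previous reward.
        if t - last_reward >= interval:
            rewarded.append(t)
            last_reward = t
    return rewarded


# FI-5: responses at times 1, 3, 6, 9, and 11 go unrewarded.
print(fixed_interval_rewards([1, 3, 5, 6, 9, 10, 11], 5))  # [5, 10]
```

Responses made shortly after a reward never pay off, so under this rule there is little incentive to respond until the interval is nearly over.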
4. Variable-interval (VI) reinforcement
In a variable-interval reinforcement schedule, a response is reinforced after an unpredictable amount of time in order to keep the person (or animal) on their toes.
The interval between reinforcements varies from one reward to the next but averages out to a set length over time.
An example of this type of partial reinforcement is pop quizzes. The professor might tell students that there will be a pop quiz coming up soon, but students don’t know when the next quiz will be, so they have to be prepared at all times.
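A variable-interval schedule can be sketched by redrawing the waiting period after every reward. In the sketch below, each waiting period is assumed to fall uniformly between 0.5 and 1.5 times the mean interval. That range and the function name are assumptions chosen for illustration only.

```python
import random


def variable_interval_rewards(response_times, mean_interval, seed=None):
    """Return the response times that are rewarded on a VI schedule."""
    rng = random.Random(seed)

    def draw_wait():
        # Assumption: waits are uniform between 0.5x and 1.5x the mean interval.
        return rng.uniform(0.5 * mean_interval, 1.5 * mean_interval)

    rewarded = []
    available_at = draw_wait()  # Time at which the first reward becomes available.
    for t in sorted(response_times):
        if t >= available_at:
            rewarded.append(t)
            available_at = t + draw_wait()
    return rewarded


# A learner who responds once per time unit under a VI-10 schedule.
print(variable_interval_rewards(range(1, 61), 10, seed=0))
```

Because the learner cannot tell when the next reward becomes available, responding steadily is the only way to collect each reward soon after it is ready.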
Partial Reinforcement vs Continuous Reinforcement
|Criteria|Partial Reinforcement|Continuous Reinforcement|
|---|---|---|
|Definition|A reinforcement schedule where a response is only reinforced some of the time, rather than every time it occurs.|A reinforcement schedule where a response is reinforced every time it occurs.|
|Types|Fixed-ratio, variable-ratio, fixed-interval, variable-interval|N/A|
|Learning Speed|Slower initial learning, as the reinforcement is not consistent.|Faster initial learning, due to consistent reinforcement.|
|Resistance to Extinction|Higher resistance to extinction, as the learned behavior persists even when reinforcement is no longer provided.|Lower resistance to extinction, as the behavior is more likely to cease when reinforcement is no longer provided.|
|Examples|Slot machines (variable-ratio), pop quizzes (variable-interval)|Training a dog to sit by giving a treat every time the dog sits|
|Application|Useful in maintaining long-term behavior and promoting persistence, often seen in gambling, sales, and studying habits.|Effective for teaching new behaviors and ensuring rapid acquisition of skills, often seen in animal training and early stages of learning.|
|Psychological Effect|May lead to increased persistence and stronger resistance to extinction due to the unpredictability of reinforcement.|May lead to a strong association between the behavior and reinforcement, but may also result in less persistence when reinforcement is no longer provided.|
Case Studies of Partial Reinforcement
1. Slot Machines
Slot machines may be one of the best examples of a partial reinforcement schedule that strengthens behavior and makes it highly resistant to extinction. The probability of a pay-off is programmed with amazing precision.
Although each machine can be independently programmed, they usually adhere to a variable ratio (VR) schedule of reinforcement.
This means that it takes an unpredictable number of lever-pulls to get a reward.
For example, one machine may be set on a VR-120 schedule. That means that on average, it will produce one payoff for every 120 times played.
However, because it is a variable ratio schedule, the actual number of times needed for a payoff will change. One time it might be at 90, one time it might be at 55, and another time it might be at 155.
But, the average of all of those over the long haul will equal 120.
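The long-run averaging can be checked with a small simulation. The sketch below assumes that every play pays off independently with probability 1/120, which is one simple way to obtain an average of 120 plays per payoff. It is not a description of how any real machine is programmed, and the function name is hypothetical.

```python
import random


def plays_until_payoff(rng, mean_ratio):
    """Count the plays needed to hit one payoff, including the winning play."""
    plays = 1
    # Assumption: each play wins independently with probability 1 / mean_ratio.
    while rng.random() >= 1 / mean_ratio:
        plays += 1
    return plays


rng = random.Random(0)
gaps = [plays_until_payoff(rng, 120) for _ in range(10_000)]

print(gaps[:5])               # Individual payoffs arrive after very different counts.
print(sum(gaps) / len(gaps))  # The long-run average is close to 120.
```

A handful of payoffs need not average 120 (the three counts in the example above average 100). Only over many payoffs does the mean settle near the programmed value.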
2. Keeping Young Learners On-Task
Students often have trouble maintaining focus. This is especially true when it comes to young learners. Young children are so easily distracted that at times it seems nearly impossible to keep them on-task.
Riley et al. (2011) applied a schedule called the fixed-time (FT) schedule to two students identified by their teacher as having an especially hard time staying focused.
The fixed-time schedule means that reinforcement is delivered after a specific period of time has elapsed.
First, the researchers recorded the children’s on-task and off-task behaviors during a baseline period. Next, the teacher applied reinforcement at the end of every 5 minutes.
This reinforcement took the form of offering praise for on-task behavior or redirecting the student’s attention for off-task behavior.
The authors concluded that:
“This study demonstrates that FT attention delivery can be an effective strategy used to increase the on-task behaviors and decrease the off-task behaviors of typically-developing students” (p. 159).
3. Predator Success Rates
The success rate for some of the planet’s fiercest predators is actually quite low, while the success rate for more docile creatures, like the domestic cat, is quite high.
From an operant conditioning perspective, reward operates on a variable ratio (VR) schedule. The number of attempts at winning food is going to vary.
For example, a cheetah may have to chase down prey 20 times before finally being rewarded. But then the very next time, it is successful.
Since the cheetah has no way of predicting when it will succeed, it must try hard each and every time. The same is true for all wild animals because success is on a partial reinforcement schedule.
4. Helping Children with Emotional and Behavioral Disorders
Children with severe emotional and behavioral disorders are often placed in a specialized day-treatment or hospital-based program. This allows them to receive the extra attention and instruction they need.
Rasmussen and O’Neill (2006) implemented a fixed-time (FT) schedule of reinforcement to decrease disruptive behavior of children in these programs.
FT means that behavior was rewarded only after a specific period of time elapsed.
The study included three children aged 8 to 12 who were part of a day-treatment classroom with seven to nine other students.
The classroom contained a special education teacher and two psychiatric technicians.
The students engaged in regular academic activities such as writing, math, or social studies during three to four 10-minute sessions each day, 5 days per week.
The lead teacher provided either verbal praise or a pat on the arm every 10 or 20 seconds when the student exhibited desired behavior. Disruptive behaviors were ignored.
The impact on disruptive behavior was significant:
“Implementation of FT schedules resulted in immediate, substantial, and stable decreases for all participants” (p. 455).
5. Work Habits of the U. S. Congress
The U. S. Congress works on a fixed interval (FI) schedule. They have vacations at specific times during the year, which serve as a reward.
One interesting feature of this schedule is known as the “scalloped pattern” of behavior. It simply means that after each reward, behavior decreases slightly, then increases as the next reward interval approaches.
Critchfield et al. (2003) analyzed the behavior of the U. S. Congress over a 52-year period, from 1949 to 2000.
As it turns out, productivity increased as vacation time approached. After returning from vacation, productivity was low and then gradually increased as the next vacation time approached.
“Across all years surveyed, few bills were enacted during the first several months of each session, and the cumulative total tended to accelerate positively as the end of the session approached. Across more than half a century, then, bills have been enacted in a distinct scalloped pattern in every session of each Congress” (p. 468).
Conclusion
A partial reinforcement schedule involves delivering a reward only some of the time, after either a certain number of behaviors or a certain period of time, which may be fixed or unpredictable. This usually results in slower acquisition of behavior, but once established, the behavior is highly resistant to extinction.
Partial reinforcement schedules appear in many forms in our everyday lives. For instance, teachers and parents sometimes reward children for being good, but not always.
Companies will often put their sales staff on a partial schedule that will reward them at the end of each month or quarter. This motivates the staff to work hard to earn a commission or bonus.
Slot machines and predators operate on a partial schedule known as the variable ratio because the next payoff is unpredictable.
Children with and without learning disabilities can benefit by being placed on a partial schedule of reinforcement. When their teacher rewards positive behavior, those behaviors will be more likely to occur again.
References
Critchfield, T. S., Haley, R., Sabo, B., Colbert, J., & Macropoulis, G. (2003). A half century of scalloping in the work habits of the United States Congress. Journal of Applied Behavior Analysis, 36(4), 465-486.
Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. New York: Appleton-Century-Crofts.
Gray, P. (2007). Psychology (6th ed.). New York: Worth Publishers.
Jablonsky, S. F., & DeVries, D. L. (1972). Operant conditioning principles extrapolated to the theory of management. Organizational Behavior and Human Performance, 7(2), 340-358.
Rasmussen, K., & O’Neill, R. E. (2006). The effects of fixed-time reinforcement schedules on problem behavior of children with emotional and behavioral disorders in a day-treatment classroom setting. Journal of Applied Behavior Analysis, 39, 453-457.
Riley, J. L., McKevitt, B. C., Shriver, M. D., & Allen, K. D. (2011). Increasing on-task behavior using teacher attention delivered on a fixed-time schedule. Journal of Behavioral Education, 20(3), 149-162.
Skinner, B. F. (1965). Science and human behavior. New York: Free Press.
Thorndike, E. L. (1898). Animal intelligence: An experimental study of the associative processes in animals. The Psychological Review: Monograph Supplements, 2(4), i.
Thorndike, E. L. (1905). The elements of psychology. New York: A. G. Seiler.