Schedules of Reinforcement

In the Skinner-box it is possible to change the contingency between the responses and the delivery of reinforcement so that more than one response may be required in order to obtain the reward. A whole range of rules can govern the contingency between responses and reinforcement - these different types of rules are referred to as schedules of reinforcement. Most of these schedules of reinforcement can be divided into schedules in which the contingency depends on the number of responses and those where the contingency depends on their timing.

Schedules that depend on the number of responses made are called ratio schedules. The ratio of the schedule is the number of responses required per reinforcement. The "classic" schedule, where one reinforcer is delivered for each response, is called a continuous reinforcement schedule - it has a ratio of 1. A schedule where two responses must be made for each reinforcer has a ratio of 2, and so on. A distinction is also made between schedules where exactly the same number of responses must be made for each reinforcer - fixed-ratio schedules - and those where the number of responses required can differ for each reinforcer around some average value - variable-ratio schedules. A schedule where exactly 20 responses are required for each reinforcer is called a fixed-ratio 20 or FR20 schedule. One where on average 30 responses are required is called a variable-ratio 30 or VR30 schedule.
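The counting rule behind ratio schedules can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and drawing each variable-ratio requirement uniformly from 1 to 2 x mean - 1 is an assumption made here to get the stated average, not something the lecture specifies.

```python
import random
from typing import Callable, List


def run_ratio_schedule(next_requirement: Callable[[], int],
                       n_responses: int) -> List[int]:
    """Return the 1-based positions of the responses that earn a reinforcer."""
    reinforced = []
    required = next_requirement()   # responses needed for the next reinforcer
    count = 0                       # responses made since the last reinforcer
    for position in range(1, n_responses + 1):
        count += 1
        if count >= required:
            reinforced.append(position)
            count = 0
            required = next_requirement()
    return reinforced


def fixed_ratio(n: int) -> Callable[[], int]:
    """FR n: every reinforcer requires exactly n responses."""
    return lambda: n


def variable_ratio(mean: int, rng: random.Random) -> Callable[[], int]:
    """VR mean: requirements vary around `mean`.

    Assumption: uniform draw from 1..(2*mean - 1), which averages to `mean`.
    """
    return lambda: rng.randint(1, 2 * mean - 1)


fr20 = run_ratio_schedule(fixed_ratio(20), 100)
print(fr20)  # [20, 40, 60, 80, 100]

vr30 = run_ratio_schedule(variable_ratio(30, random.Random(0)), 3000)
print(len(vr30))  # roughly 100 reinforcers, varying with the random draws
```

On the FR20 schedule every 20th response is reinforced, whereas on the VR30 schedule the gaps between reinforced responses differ from one reinforcer to the next.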

If the contingency between responses and reinforcement depends on time, the schedule is called an interval schedule. For example, a schedule that reinforces the first response an animal makes after the discriminative stimulus (SD) light has been on for 20 seconds, and ignores any responses it makes during those 20 seconds, is an interval schedule. Where the interval which must elapse between the onset of the SD and the first reinforced response is the same for all reinforcers, the schedule is called a fixed-interval or FI schedule. Again, the intervals could also vary around some average - this is called a variable-interval or VI schedule. It is possible to combine these schedules in various ways and even to construct other basic types of schedule (e.g. ones where animals are reinforced for maintaining specified intervals between responses - differential reinforcement of low rate of response or DRL schedules). The important thing about these different schedules, however, is the differences in response patterns and learning that they produce. These differences may tell us about part of what is learned in operant conditioning. A summary of the basic types of schedules might be useful:

Fixed-Ratio (FR) in which the first response made after a given number of responses have been made in the presence of the discriminative stimulus is reinforced. For example, on an FR 15 schedule every 15th response is reinforced.
Fixed-Interval (FI) in which the first response made after a given time interval is reinforced. For example, on an FI 20 sec. schedule the first response made after 20 seconds from the onset of the discriminative stimulus is reinforced. The discriminative stimulus would normally then be turned off during the period the animal consumes its reinforcer.
Variable-Ratio (VR) is similar to FR except that the number of responses required varies between reinforcements. On a VR 15 schedule 15 responses are required per reinforcer on average, but one reinforcer may only require 3 responses while the next is obtained after 22 responses.
Variable-Interval (VI) is similar to FI except the interval requirements vary between reinforcers around some specified average value.
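The timing rule of the interval schedules summarised above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the `consumption_time` parameter are hypothetical, the response times are invented, and drawing variable intervals uniformly between 0 and twice the mean is just one simple way to vary them around an average.

```python
import random
from typing import Callable, List


def run_interval_schedule(response_times: List[float],
                          next_interval: Callable[[], float],
                          consumption_time: float = 0.0) -> List[float]:
    """Return the times of the responses that are reinforced.

    `response_times` are seconds from the start of the session. The first
    SD onset is taken to be time 0. After each reinforcer the SD is assumed
    to be off for `consumption_time` seconds, then switched on again.
    """
    reinforced = []
    sd_onset = 0.0
    interval = next_interval()
    for t in sorted(response_times):
        if t < sd_onset:
            continue                      # SD is off: response has no effect
        if t - sd_onset >= interval:      # interval has elapsed: reinforce
            reinforced.append(t)
            sd_onset = t + consumption_time
            interval = next_interval()
        # otherwise the response falls inside the interval and is ignored
    return reinforced


def fixed_interval(seconds: float) -> Callable[[], float]:
    """FI: the same interval for every reinforcer."""
    return lambda: seconds


def variable_interval(mean: float, rng: random.Random) -> Callable[[], float]:
    """VI: intervals vary around `mean` (assumption: uniform on 0..2*mean)."""
    return lambda: rng.uniform(0.0, 2.0 * mean)


# Hypothetical response times, in seconds.
responses = [3.0, 19.0, 25.0, 30.0, 44.0, 46.0, 70.0]
fi20 = run_interval_schedule(responses, fixed_interval(20.0))
print(fi20)  # [25.0, 46.0, 70.0]

vi20 = run_interval_schedule(responses, variable_interval(20.0, random.Random(1)))
```

On the FI 20 sec. run, the responses at 3 and 19 seconds fall inside the first interval and earn nothing. The response at 25 seconds is the first one after the interval has elapsed, so it is reinforced and the timing starts again.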

The most characteristic response patterns are produced by FI and FR schedules. Responses in operant-conditioning experiments were traditionally recorded using a pen recorder (see figure in the Skinner-Box document) in which a pen was drawn across paper at a constant rate. The pen was moved up a small amount each time the animal made a response, and a larger diagonal movement recorded the occurrence of each reinforcement. On FR schedules animals respond at a constant rate, with a distinct pause in responding after each reinforcement.

Figure: cumulative response record on an FR schedule

The rate of response is inversely proportional to the ratio requirement. The length of the post-reinforcement pause in responding also increases as the ratio increases. The pattern of animal responding on FI schedules is quite different. After each reinforcement, animals on FI schedules respond at a gradually accelerating rate, which produces a 'scalloped' record:

Figure: cumulative response record on an FI schedule
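The cumulative records described above can also be built from a list of response times, which makes the post-reinforcement pause easy to measure. The sketch below is only an illustration: the function names are hypothetical and the response times are invented to resemble an FR 3 pattern, not taken from any experiment.

```python
from typing import List, Tuple

# One point on the record: (time in seconds, cumulative responses, reinforced?)
RecordPoint = Tuple[float, int, bool]


def cumulative_record(response_times: List[float],
                      reinforced_times: List[float]) -> List[RecordPoint]:
    """Mimic the pen recorder: the count steps up by one at each response."""
    reinforced = set(reinforced_times)
    record = []
    for count, t in enumerate(sorted(response_times), start=1):
        record.append((t, count, t in reinforced))
    return record


def post_reinforcement_pauses(record: List[RecordPoint]) -> List[float]:
    """Time from each reinforced response to the next response."""
    pauses = []
    for (t0, _, was_reinforced), (t1, _, _) in zip(record, record[1:]):
        if was_reinforced:
            pauses.append(t1 - t0)
    return pauses


# Invented data: bursts of three responses, each followed by a pause.
responses = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0, 15.0, 16.0, 17.0]
reinforcers = [3.0, 10.0, 17.0]

record = cumulative_record(responses, reinforcers)
print(record[-1])                         # (17.0, 9, True)
print(post_reinforcement_pauses(record))  # [5.0, 5.0]
```

Plotting the count against time for such data gives the staircase with flat stretches after each reinforcer that characterises FR responding.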

The main feature of variable schedules is that, in animals, ratio schedules produce higher response rates than interval schedules for the same reinforcement density. For example, one animal might be trained on a variable-ratio schedule and the times at which it received reinforcement could be noted. These times could then be used to form a 'yoked' variable-interval schedule for another animal - an interval schedule where the interval between SD onset and the onset of a response-reinforcement contingency is determined by the times at which the first animal received each reinforcement. Typically the second animal would produce much slower response rates on the yoked schedule, even though the frequency of reinforcement received by the two animals was more or less the same.
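The yoking procedure can be sketched in code. This is a simplified illustration with invented numbers: the function names are hypothetical, the first animal's reinforcement times are made up, and each interval is assumed to start at the moment the second animal collects its previous reinforcer.

```python
from typing import List


def yoked_intervals(master_reinforcement_times: List[float]) -> List[float]:
    """Turn the first animal's reinforcement times into successive intervals."""
    intervals = []
    previous = 0.0
    for t in master_reinforcement_times:
        intervals.append(t - previous)
        previous = t
    return intervals


def run_yoked_schedule(response_times: List[float],
                       intervals: List[float]) -> List[float]:
    """Reinforce the second animal's first response after each yoked interval."""
    reinforced = []
    remaining = iter(intervals)
    interval = next(remaining, None)
    sd_onset = 0.0
    for t in sorted(response_times):
        if interval is None:
            break                         # no yoked intervals left
        if t - sd_onset >= interval:
            reinforced.append(t)
            sd_onset = t                  # assumption: timing restarts here
            interval = next(remaining, None)
    return reinforced


# Invented times (seconds) at which the first, VR-trained animal was reinforced.
master_times = [12.0, 20.0, 45.0]
intervals = yoked_intervals(master_times)
print(intervals)  # [12.0, 8.0, 25.0]

# Invented second animal responding slowly, once every 5 seconds.
yoked_responses = [5.0 * k for k in range(1, 13)]
print(run_yoked_schedule(yoked_responses, intervals))  # [15.0, 25.0, 50.0]
```

In this sketch the second animal earns the same three reinforcers while making only one response every 5 seconds, which shows why an interval schedule does not demand a high response rate.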

This document has been restructured from a lecture kindly provided by R.W.Kentridge.