Reward magnitude and instrumental responses: Consistent and partial reward
GARVIN McCAIN, University of Texas at
Arlington, Arlington, Tex. 76010
Two studies using partial or consistent
large (500 mg) and small (45 mg) reward
are presented. In both studies, after
extended reward acquisition, differences
are negligible or nonexistent. Results from
the partial reward groups indicate
extinction differences also disappear after
extended training. These results do not
seem to be in line with usual assumptions
regarding the effects of reward magnitude.
Over the past 2 years a series of studies
involving the effects of reward magnitude
has been run in this laboratory. This is a
preliminary report giving only two of these
studies. Several others have been
summarized elsewhere (McCain, 1969).
The problem of the effects of reward
magnitude is critical to a number of
interpretations of learning, and the
empirical relations are important to almost
any learning analysis. The two studies
presented followed several studies in which
the usual assumption that larger rewards
produce more vigorous responses during
acquisition was questioned.
EXPERIMENT 1
This study, as well as a number of other
studies in the series, was run as a
reward-shift study. The reward-shift data
will be presented in a different context.
The focus of this study is on comparison of
acquisition effects of large (500 mg) and
small (45 mg) consistent reward.
Subjects and Apparatus
The Ss were 24 rats of the Wistar strain
from the colony of the University of Texas
at Arlington. All Ss were approximately 90
days old at the beginning of training.
Approximately equal numbers of each sex
were used. The straight alley was
approximately 6 ft long. Four successive
time measures were taken in the alley; the
first was for a 12-in. section beginning
about 12 in. from the startbox door, the
second was about 18 in., and the goal
measure was approximately 10 in.,
terminating 8 in. from the goal cup. The
fourth measure included the time from the
startbox door to 8 in. from the goal cup.
Fig. 1. Running times for groups on
45-mg and 500-mg consistent reward
schedule.
Psychon. Sci., 1970, Vol. 19 (3)
Procedure
Ss were placed on 23-h deprivation on
Day 1. On Days 2-6, Ss were handled in
groups for 1 h daily. Food was available on
the handling table during this hour. On
Day 7, Ss were assigned to groups on a
random basis. Each S explored the runway
for 5 min daily on Days 7-9. After
exploration, Ss were returned to their
home cages and, about 15 min later, each S
received his appropriate training reward in
a goal cup. A few minutes later, an
appropriate amount of lab blocks was given
to total 15 g when added to the reward. Ss
were maintained on 15-g total daily food
for the remainder of the study. On Days 10
and 11, each S received two trials per day
and four trials on Days 12 and 13. Six
trials per day were given throughout the
remainder of training. Ss were brought into
the running room in squads of six in a
carrying cage with individual
compartments. Each S was given a single
trial in rotation. This procedure gives an
intertrial interval of approximately 6 min.
Equal numbers of Ss from each group were
arranged in a random order in each squad.
The two groups were designated on the
basis of their reward schedule, 1-500 (a
single 500-mg Noyes pellet on each trial)
and 1-45 (one 45-mg Noyes pellet per
trial). All Ss received 54 acquisition trials.
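The trial schedule and feeding regimen above reduce to simple arithmetic, sketched here in Python. The pellet masses, the 15-g daily total, the trials per day, and the 54-trial total come from the Procedure. The function names are hypothetical, and the sketch assumes that every pellet is eaten and that lab blocks make up the remainder on training days; neither assumption is stated in the original report.

```python
# Values marked "source" come from the Procedure; everything else is an
# illustrative assumption, not the author's own bookkeeping.
DAILY_FOOD_G = 15.0                      # source: 15-g total daily food
PELLET_MG = {"1-500": 500, "1-45": 45}   # source: one Noyes pellet per trial
TOTAL_TRIALS = 54                        # source: 54 acquisition trials


def trials_on_day(day: int) -> int:
    """Trials given on an experimental day (Day 10 is the first training day)."""
    if day in (10, 11):
        return 2          # source: two trials per day on Days 10 and 11
    if day in (12, 13):
        return 4          # source: four trials on Days 12 and 13
    return 6              # source: six trials per day thereafter


def acquisition_schedule(total_trials: int = TOTAL_TRIALS) -> dict:
    """Map each training day to its trial count until total_trials is reached."""
    schedule, day, given = {}, 10, 0
    while given < total_trials:
        n = min(trials_on_day(day), total_trials - given)
        schedule[day] = n
        given += n
        day += 1
    return schedule


def lab_block_supplement_g(group: str, n_trials: int) -> float:
    """Lab blocks (g) that bring the day's food to 15 g.

    Assumes all reward pellets earned that day are eaten.
    """
    reward_g = PELLET_MG[group] * n_trials / 1000.0
    return DAILY_FOOD_G - reward_g


schedule = acquisition_schedule()
for day, n in schedule.items():
    large = lab_block_supplement_g("1-500", n)
    small = lab_block_supplement_g("1-45", n)
    print(f"Day {day}: {n} trials, supplement {large:.2f} g (1-500), {small:.2f} g (1-45)")
```

Under these assumptions the 54 trials end on Day 20, and on six-trial days the supplement is 12 g for Group 1-500 against roughly 14.73 g for Group 1-45.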
Results
Figure 1 shows the acquisition data for
the full runway measure. As may be noted,
Group 1-500 has substantially shorter
running times over the early stages of
training. Analysis of the first 4 days of
acquisition gives a significant difference
(F = 6.94, df = 1/22, p < .02). Later in
training, there is very little apparent
difference in performance of Groups 1-500
and 1-45. An analysis of the data
confirmed the impression (F = 1.01).
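As a rough check on the early-acquisition test, the probability for F = 6.94 with df = 1/22 can be recomputed. The sketch below is not the author's analysis. It relies on the identity that an F with 1 numerator degree of freedom equals the square of a t, and it integrates the t density numerically with Simpson's rule. The function names and the step count are assumptions made for illustration.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def p_value_f_1_df(f_value: float, df_error: int, steps: int = 2000) -> float:
    """Upper-tail probability for F(1, df_error), using F = t**2.

    Integrates the t density from 0 to t with Simpson's rule
    (steps must be even), doubles it for the symmetric central
    area, and returns the remaining two-tailed area.
    """
    t = math.sqrt(f_value)
    h = t / steps
    total = t_pdf(0.0, df_error) + t_pdf(t, df_error)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df_error)
    central_area = 2 * (h / 3) * total   # P(-t < T < t)
    return 1 - central_area


p_early = p_value_f_1_df(6.94, 22)       # reported values: F = 6.94, df = 1/22
print(f"p = {p_early:.4f}")
```

The resulting probability falls between .01 and .02, in agreement with the reported p < .02.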
EXPERIMENT 2
The series of studies involving consistent
reward with different magnitudes suggests
that the partial reinforcement situation
should also be investigated. Wagner (1961)
found that larger rewards gave more
vigorous performance and greater
resistance to extinction when given on a
random partial reinforcement schedule.
Wagner ran two levels of training, 16 and
60 trials. The present study included a
group given extended training, since
extended training appears to change the
consistent reward situation.
Subjects and Apparatus
The Ss were 44 rats of the Long-Evans
strain from the colony of the University of
Texas at Arlington. All Ss were
approximately 90 days old at the beginning
of training. Approximately equal numbers
of each sex were used. The apparatus was
the same as that used in Experiment 1.
Procedure
Ss were placed on 23-h deprivation on
Day 1 and handled and fed on Days 2-6. Ss
explored the test alley on Days 7-10 for
approximately 5 min daily. On Day 9, Ss
were divided into two groups, on a random
basis. Group 45 was to receive one 45-mg
Noyes pellet as reward and Group 500 was
to receive one 500-mg pellet. On Days 9
and 10, Ss received one goalbox (GB)
placement per day with the appropriate
reward. On Day 11, each S received two
running trials, four trials on Day 12, and
six trials per day thereafter. A schedule of
50% random reward was used. Ss were
[Fig. 1 plot: Groups 1-45 and 1-500 over days. Fig. 2, first panel: Groups 45 Short and 500 Short.]
the small reward group declines slightly but
not significantly more than the
large-reward group.
Fig. 2. a and b. Running times for
groups given partial 45- or 500-mg reward
at two levels of acquisition training.
[Fig. 2 plots: acquisition and extinction running times over days; panel 2b shows Groups 45 Long and 500 Long.]
DISCUSSION
Taken alone, these two studies are not
sufficient to convince anyone that the
effects of reward magnitude are sharply
decreased or disappear after extended
training. A series of nine other consistent
reward studies has been run in our
laboratory. Six of these studies have been
put into one paper that is now under
editorial consideration. In addition, Black
(in press) and Bloom & Milstead (1969)
have substantial evidence that, during
acquisition, magnitude differences
disappear under some conditions. A
reasonable conclusion is that after about
60 consistently reinforced acquisition trials
in a straight alley, the acquisition effects of
different reward magnitudes are either
minimal or absent. Our other studies
indicate that extinction differences
disappear or become minimal after about
100+ consistently reinforced acquisition
trials. The situation as regards random
partial reinforcement is not so clear. The
study presented here is our only complete
study of its kind. Single studies must, of
necessity, be received with substantial
skepticism. Further work is under way.
A number of possible c (...truncated)