Magnitude of reward in selective learning
JOHN H. MORRISON AND JOHN J. PORTER
UNIVERSITY OF WISCONSIN-MILWAUKEE
Abstract
The growth of incentive motivation was investigated
in a two choice discrimination using three groups of
10 Ss for 300 trials. The experimental group was trained
with 20 mg and 97 mg rewards. One control group
received 20 mg while the other control received 97 mg
on all trials. Asymptotic running speed was not achieved,
but the converging speeds of the control groups tended
to support the "habit" hypothesis. Both a negative
contrast effect, where the experimental Ss ran slower
to 20 mg than did the controls, and a positive contrast
effect, where the experimental Ss ran faster to 97 mg
than did the controls, were observed.
Problem
For Hull (1952) and Spence (1956), incentive motivation
(K) has been assumed to be a function of the classically
conditioned consummatory habit (rg). It follows that
a large reward leads to more rapid growth of rg
(and thus K) than a small reward, since, on each trial,
the consummatory response is practiced more often
when ingesting a large reward. However, when the rg
habit to the small reward reaches asymptote, performance
should be the same to both small and large reward.
Thus, the rate of performance increase depends on
reward magnitude, while the asymptote of performance
does not (Spence, 1956, 140-141).
Both Hull and Spence considered the alternative that
different rewards might produce different rgs, rather
than different rates of learning the same rg. Spence
(1956, 142-148) assumed that the "vigor" of the
consummatory response, rather than the amount of
practice, determined incentive motivation (K). Thus,
the "vigor" hypothesis, in contrast to the "habit"
hypothesis, predicts that the asymptote of performance
is a function of the magnitude of reward. Most studies
of magnitude of reward seem to favor the "vigor"
hypothesis over the "habit" hypothesis (cf. Festinger,
1943; Davenport, 1964). However, the "habit" hypothesis
has not really been tested since none of the relevant
studies have used enough trials to assure asymptotic
performance.
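The difference between the two hypotheses can be made concrete with a minimal sketch. It assumes a simple exponential growth of K over trials; that functional form, the function names, and every parameter value are illustrative assumptions and are not taken from Hull, Spence, or the present study. Under the "habit" version, reward magnitude changes only the growth rate, so the 20 mg and 97 mg curves converge; under the "vigor" version, reward magnitude changes the asymptote, so they stay apart.

```python
import math


def habit_k(trial, reward_mg, k_max=1.0, rate_per_mg=0.0005):
    """'Habit' hypothesis (illustrative): magnitude sets the growth RATE of K.

    The asymptote k_max is the same for every reward size.
    All parameter values are hypothetical.
    """
    return k_max * (1.0 - math.exp(-rate_per_mg * reward_mg * trial))


def vigor_k(trial, reward_mg, k_per_mg=0.01, rate=0.02):
    """'Vigor' hypothesis (illustrative): magnitude sets the ASYMPTOTE of K.

    The growth rate is the same for every reward size.
    All parameter values are hypothetical.
    """
    return k_per_mg * reward_mg * (1.0 - math.exp(-rate * trial))


if __name__ == "__main__":
    # Print K for the small (20 mg) and large (97 mg) rewards at several
    # points in training under each hypothesis.
    for trial in (10, 100, 1000, 10000):
        print(
            trial,
            round(habit_k(trial, 20), 3),
            round(habit_k(trial, 97), 3),
            round(vigor_k(trial, 20), 3),
            round(vigor_k(trial, 97), 3),
        )
```

In this sketch the two "habit" curves differ early in training and become indistinguishable late in training, which is why a test of the "habit" hypothesis requires enough trials to approach asymptote.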
Selective learning studies contrasting differential
reward magnitudes have suggested that running speed
is not a simple function of magnitude of reward, but is
rather a function of the difference between two rewards.
This "contrast effect" has been observed by Bower
(1961), Porter (1964), Clayton (1965), and Spear &
Hill (1965).
The present study attempted to answer two questions
about contrast effects and the "vigor" and "habit"
hypotheses in selective learning: With other factors
held constant, does magnitude of reward determine
response speed and choice behavior, or are reward
effects temporary, with amount of practice of the
consummatory response determining final asymptotic
performance? Are contrast effects temporary or permanent
obstructions which mask the simple effects of
reinforcement?
Psychon. Sci., 1965, Vol. 3
Method
Thirty naive female hooded rats, 150 days old when training
began, were used. Ss were run in a wedge-shaped choice chamber,
similar to that used by Ramond (1954), which was 12-in deep and
18-in wide at the goal end. The chamber had retractable response
levers on either side, each illuminated by a shielded 7-watt bulb
above it. On free choice trials both lights were on and both levers
were in the chamber. On forced trials, only one lever was in the
chamber and one light was on. Ss were timed from the opening of the
start box glass door until S actuated a contact relay by touching
the bar.
The Ss were assigned to three groups of 10 each, one experimental
group and two control groups. The experimental group received a
97 mg Noyes pellet to the large reward side and a 20 mg pellet to
the small reward side. Both control groups received equal reward
to both sides of the chamber, the small reward group 20 mg and the
large reward group 97 mg. One half of the Ss in the 97-20 group were
assigned their large reward to their preferred side to balance
position preferences. For five weeks before the experiment began,
Ss were allowed 1 hr. free access to food after 23 hr. deprivation.
During this period, each S was handled for 5 min. daily. For three
days prior to the actual training, Ss were allowed 5 min. exploration
of the apparatus. All Ss received four such exposures daily. On half
of the exploratory trials a pellet was placed on the left side of the
chamber, the remainder of the time on the right. The experimental
Ss received 97 mg on half of these trials and 20 mg on the remaining
trials. Although the levers were not in the chamber during this
phase of training, both 7-watt bulbs were on.
Following pretraining, Ss received six trials a day for four days,
12 trials a day for two days, and 18 trials a day thereafter. The Ss
were run one group at a time. After a block of six trials Ss were
returned to their home cages. In each block of six trials the sequence
of choice and forced trials was CCFFFF. Response speeds were
measured on the last two forced trials in each block of six. On the
18 trials per day schedule Ss were forced twice to the large reward
on the fifth trial of one day and forced twice to the large reward
on the sixth trial the next day.
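The schedule arithmetic can be checked with a short sketch. It reads the schedule as 6, 12, and then 18 trials a day, takes the 300-trial total from the abstract, and assumes that the 18-trial rate simply continued until 300 trials were reached; the number of training days is derived here and is not stated in the text. The function names are hypothetical.

```python
def training_days(total_trials=300, early_phases=((6, 4), (12, 2)), final_rate=18):
    """Days needed to reach total_trials under the described schedule.

    early_phases holds (trials_per_day, number_of_days) pairs; after them,
    final_rate trials are run each day (assumed to continue until the total
    is reached).
    """
    days = 0
    trials = 0
    for per_day, n_days in early_phases:
        days += n_days
        trials += per_day * n_days
    remaining = total_trials - trials
    extra_days, leftover = divmod(remaining, final_rate)
    if leftover:
        extra_days += 1  # a partial final day still counts as a day
    return days + extra_days


def measured_trials_per_plot_block(plot_block=60, block_size=6, measured_per_block=2):
    """Timed forced trials per plotted block (last two forced trials of each CCFFFF block)."""
    return (plot_block // block_size) * measured_per_block


if __name__ == "__main__":
    print(training_days())
    print(measured_trials_per_plot_block())
```

Under these assumptions training spans 20 days, and each 60-trial block contributes 20 timed forced trials per S.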
Results
The mean speeds of the control groups and of the
experimental group to both 97 mg and 20 mg appear in
Fig. 1, plotted in 5 blocks of 6 trials to each discriminandum. The points on the curves represent mean
speeds after 60, 120, 180, 240, and 300 trials.
[Fig. 1 plot: curves labeled 97E, 97C, 20C, and 20E; abscissa, blocks of 60 trials.]
Fig. 1. Mean speeds after each block of 60 trials for the two control groups (C) and the experimental group (E). Each point on the
figure represents the mean of 20 forced trials (10 per side) for the
control groups, and 10 forced trials to the 97 mg side and 10 forced
trials to the 20 mg side for the experimental group.
Comparing the speeds of the experimental Ss to 20
mg with the speeds of the two control groups demonstrated that performance depended upon both magnitude
of reward, F(2,27) = 4.69, p < .05, and number of trials,
F(4,108) = 74.95, p < .001. A significant interaction between these variables, F(8,108) = 6.95, p < .001, demonstrated the changing relationship between the speeds of
the experimental and control groups over trials, and
prompted comparisons of the mean speeds between
groups at different stages of training. The critical
difference (CD) for all comparisons (Cochran & Cox,
1950) was calculated: CD(26,107) = 0.195, p = .05. After
180 trials the mean speed of the experimental Ss to 20
mg was significantly less than the speed of the 20-20
Ss, p = 0.271. After 2 (...truncated)