The partial-reinforcement extinction effect and the contingent-sampling hypothesis
Psychon Bull Rev (2013) 20:1336–1342
DOI 10.3758/s13423-013-0432-1
BRIEF REPORT
Guy Hochman & Ido Erev
Published online: 18 April 2013
© Psychonomic Society, Inc. 2013
Abstract The partial-reinforcement extinction effect
(PREE) implies that learning under partial reinforcements
is more robust than learning under full reinforcements.
While the advantages of partial reinforcements have been
well-documented in laboratory studies, field research has
failed to support this prediction. In the present study, we
aimed to clarify this pattern. Experiment 1 showed that
partial reinforcements increase the tendency to select the
promoted option during extinction; however, this effect is
much smaller than the negative effect of partial reinforcements on the tendency to select the promoted option during
the training phase. Experiment 2 demonstrated that the
overall effect of partial reinforcements varies inversely with
the attractiveness of the alternative to the promoted behavior: The overall effect is negative when the alternative is
relatively attractive, and positive when the alternative is
relatively unattractive. These results can be captured with
a contingent-sampling model assuming that people select
options that provided the best payoff in similar past experiences. The best fit was obtained under the assumption that
similarity is defined by the sequence of the last four
outcomes.
Keywords Choice behavior · Judgment · Decision making ·
Reinforcement learning
G. Hochman (*)
Duke University, 2024 W. Main Street, Erwin Mill Bldg., Bay C,
Durham, NC 27705, USA
e-mail:
G. Hochman
Interdisciplinary Center (IDC) Herzliya, Herzliya, Israel
I. Erev
Max Wertheimer Minerva Center for Cognitive Studies,
Technion—Israel Institute of Technology, Haifa, Israel
The partial-reinforcement extinction effect (PREE; Humphreys, 1939) is one of the best examples of a basic behavioral phenomenon, detected in the laboratory, with
potentially important practical implications. As most introductory psychology textbooks explain, the PREE refers to
the fact that learned behavior is more robust to extinction
when not all responses are reinforced (partial schedules)
than when 100 % of responses are reinforced in training
(full schedule; see, e.g., Atkinson, Atkinson, Smith, Bem, &
Nolen-Hoeksema, 1995; Baron & Kalsher, 2000). For example, Atkinson et al. stated that partial reinforcements
facilitate higher performance rates, since the probability that
individuals will continue responding in the absence of reinforcements is much higher under partial schedules than
under full schedules.
However, many empirical studies have failed to
support this textbook assertion. Most early demonstrations
of the PREE used between-subjects laboratory designs (e.g.,
Grosslight & Child, 1947; Mowrer & Jones, 1945; Pavlik &
Flora, 1993) to show that under partial-reinforcement schedules, individuals tend to engage in more responses during
extinction than under full schedules. However, most laboratory studies using within-subjects designs (Nevin, 1988;
Papini, Thomas, & McVicar, 2002; Svartdal, 2000; but see
Exp. 3 of Nevin & Grace, 2005, for an exception) and field
research (Latham & Dossett, 1978; Pritchard, Hollenback,
& DeLeo, 1980; Yukl, Latham, & Pursell, 1976) have
reported that partial reinforcements impair, rather than improve, performance. For example, Yukl et al. found that tree
planters were less productive under partial- than under full-reinforcement schedules. This negative effect of partial
schedules was even observed when partial reinforcements
yielded higher average payoffs.
Nevin (1988; see also Nevin & Grace, 2000) proposed
behavioral momentum theory in order to account for the
mixed PREE results. According to this account, two effects
compete under partial-reinforcement schedules. On the one
hand, partial reinforcements have an overall negative effect
on the likelihood of selecting the reinforced alternative, due
to a decrease in reinforcement rates. At the same time,
however, partial reinforcements have a positive local effect,
as they slow extinction due to a generalization decrement,
which hinders detection of changes in the reinforcement
schedule in some settings (see the related observations in
Gershman, Blei, & Niv, 2010). Thus, momentum theory
suggests that the apparent inconsistency between basic research and field studies of PREE can be explained by the
assertion that classical demonstrations of PREE focused on
the positive local effect of partial reinforcement in slowing
extinction, whereas field studies document the overall
(negative) effect of partial reinforcements.
The main goal of the present analysis was to clarify and
extend Nevin and Grace’s (2000) explanation of the mixed
PREE findings. We relate Nevin and Grace’s (2000) assertion
to the suggestion that people tend to select the action that has
led to the best outcomes in similar situations in the past (see
Biele, Erev, & Ert, 2009; Gonzalez, Lerch, & Lebiere, 2003;
as well as a related observation by Patalano & Ross, 2007),
and elucidate the conditions under which partial reinforcements are likely to be effective and countereffective.
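The suggestion that people select the action that led to the best outcomes in similar past situations can be illustrated with a minimal Python sketch. It assumes, following the abstract, that similarity is defined by the sequence of the most recent outcomes (four in the best-fitting version; `k` here). The function name, the data layout, the observation of foregone payoffs, and the tie and fallback rules are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict


def contingent_choice(history, k=4, default=0):
    """Pick the option with the best mean payoff in similar past trials.

    history: list of per-trial payoff tuples, one payoff per option
             (foregone payoffs are assumed to be observed).
    k:       length of the outcome sequence that defines similarity.
    default: option returned when no similar past trial exists (assumption).
    """
    if len(history) <= k:
        return default
    current = tuple(history[-k:])  # the last k outcomes
    totals = defaultdict(float)
    matches = 0
    # Scan past trials t whose preceding k outcomes match the current ones.
    for t in range(k, len(history)):
        if tuple(history[t - k:t]) == current:
            matches += 1
            for option, payoff in enumerate(history[t]):
                totals[option] += payoff
    if matches == 0:
        return default
    # Comparing totals equals comparing means: every option has the same
    # number of matching trials. Ties go to the lowest option index.
    return max(sorted(totals), key=lambda option: totals[option])
```

With `k=1` and a history that alternates between `(1, 0)` and `(0, 2)`, the sketch selects option 1 after a `(1, 0)` trial, because option 1 paid more on every past trial that followed the same outcome.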
Experiment 1: evaluation of the positive and negative
effects of partial reinforcement
In most previous demonstrations of the PREE (e.g., Grant,
Hake, & Hornseth, 1951), the expected benefit from the
reinforced choice was higher under full than under partial
schedules, because the same magnitudes were administered
at higher rates. In the present study, we avoided this confound by manipulating the size of the rewards to ensure
equal sums of reinforcements under both partial and full
schedules. The primary goal of Experiment 1 was to examine whether the overall negative effect of partial reinforcements could be observed, even when this condition did not
imply a lower sum of reinforcements.
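One simple way to equate the expected sum of reinforcements across schedules is to scale the reward size by the inverse of the reinforcement probability. The Python sketch below shows that arithmetic under stated assumptions: the function name and the numbers are hypothetical and are not the payoffs used in the experiment.

```python
from fractions import Fraction


def partial_reward(full_reward, p):
    """Reward size that keeps the expected sum equal to a full schedule.

    full_reward: payoff delivered on every trial under the full schedule.
    p:           probability that a trial is reinforced under the
                 partial schedule.
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return Fraction(full_reward) / Fraction(p)


# Hypothetical numbers: 1 point per trial under the full schedule,
# reinforcement on half of the trials under the partial schedule.
reward = partial_reward(1, Fraction(1, 2))
expected_per_trial = Fraction(1, 2) * reward
```

Here the partial schedule pays 2 points on half of the trials, so the expected payoff per trial matches the 1 point paid on every trial under the full schedule.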
Method
Participants A group of 24 undergraduates from the Faculty
of Industrial Engineering and Management at the Technion
served as paid participants in the experiment. They were
recruited via signs posted around campus for an experiment
in decision making. The sample included 12 males and 12
females (mean age 23.7 years, SD = 1.88).
Apparatus and procedure For the experiment we used a
clicking paradigm (Erev & Haruvy, 2013), which consisted
of two unmarked buttons and an accumulated payoff counter. Each selection of one of the two keys was followed by
three immediate events: a presentation of the obtained
payoff (in bold on the selected button for 1 s), a presentation
of the foregone payoffs (on the unselected button for 1 s),
and a continuous update of the payoff counter (the addition
of the obtained payoff to the counter). The exact payoffs
were a function of the reinforcement schedule, the phase,
and the choice, as explained below.
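The three events that follow each selection can be sketched in Python as a single trial function. This is a minimal illustration, not the experimental software: the function names and the example payoff rule are assumptions, and the example payoffs are placeholders rather than the values used in the study.

```python
import random


def run_trial(choice, payoff_fn, counter):
    """Simulate one trial with two buttons (0 and 1).

    payoff_fn: hypothetical callable returning the payoffs of both
               buttons for this trial as a pair (payoff_0, payoff_1).
    Returns the obtained payoff, the foregone payoff, and the updated
    payoff counter.
    """
    payoffs = payoff_fn()
    obtained = payoffs[choice]       # shown on the selected button
    foregone = payoffs[1 - choice]   # shown on the unselected button
    return obtained, foregone, counter + obtained


def example_schedule(rng, p=0.5):
    """Illustrative payoffs only: button 0 pays 2 with probability p,
    button 1 always pays 0."""
    return (2 if rng.random() < p else 0, 0)


# Usage: five trials in which button 0 is always selected.
rng = random.Random(0)
counter = 0
for _ in range(5):
    obtained, foregone, counter = run_trial(
        0, lambda: example_schedule(rng), counter
    )
```

Passing the schedule as a callable keeps the trial logic independent of the reinforcement schedule, the phase, and the choice, which determine the payoffs in the experiment.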
Participants were instructed to repeatedly cho (...truncated)