Affective–associative two-process theory: a neurocomputational account of partial reinforcement extinction effects
Robert Lowe 0 1
Alexander Almér 0 1
Erik Billing 0 1
Yulia Sandamirskaya 0 1
Christian Balkenius 0 1
0 Cognitive Science, Lund University, Lund, Sweden
1 Department of Applied IT, University of Gothenburg, Gothenburg, Sweden
2 Institutionen för informationsteknologi, Högskolan i Skövde, Skövde, Sweden
3 Institute of Neuroinformatics, Neuroscience Center Zurich, University and ETH Zurich, Zurich, Switzerland
The partial reinforcement extinction effect (PREE) is an experimentally established phenomenon: behavioural response to a given stimulus is more persistent when previously inconsistently rewarded than when consistently rewarded. This phenomenon is, however, controversial in animal/human learning theory. Contradictory findings exist regarding when the PREE occurs. One body of research has found a within-subjects PREE, while another has found a within-subjects reversed PREE (RPREE). These opposing findings constitute what is considered the most important problem of PREE for theoreticians to explain. Here, we provide a neurocomputational account of the PREE, which helps to reconcile these seemingly contradictory findings of within-subjects experimental conditions. The performance of our model demonstrates how omission expectancy, learned according to low probability reward, comes to control response choice following discontinuation of reward presentation (extinction). We find that a PREE will occur when multiple responses become controlled by omission expectation in extinction, but not when only one omission-mediated response is available. Our model exploits the affective states of reward acquisition and reward omission expectancy in order to differentially classify stimuli and differentially mediate response choice. We demonstrate that stimulus-response (retrospective) and stimulus-expectation-response (prospective) routes are required to provide a necessary and sufficient explanation of the PREE versus RPREE data and that omission representation is key for explaining the nonlinear nature of extinction data.
Partial reinforcement; Reinforcement learning; Decision making; Associative two-process theory; Affect
1 Introduction
The partial reinforcement extinction effect (PREE) is
characterized by a tendency for subjects to perseverate in
behavioural responding to a greater degree when the
behaviour was previously probabilistically/infrequently
rewarded as compared to when it was
unconditionally/frequently rewarded. These partial, as compared to
continuous, schedules of reinforcement are critical for gaining
insights into how a history of behaviour is brought to bear
when circumstances change. Furthermore, intermittent
reinforcement is the norm in natural environments
(Pipkin and Vollmer 2009).
The PREE has been studied since the 1940s and 1950s
(Mowrer and Jones 1945; Grosslight and Child 1947; Jenkins
and Rigby 1950; Amsel 1958). It has been identified using a
two-phase training assessment of behavioural history: (1) an
acquisition phase where subjects are rewarded for engaging
one of a number of response options in relation to a specific
stimulus cue, and (2) an extinction phase where subjects are no
longer rewarded (or have diminished rewards) for
responding. The PREE has been explained in terms of the number of
expected reinforcers omitted during extinction
(Gallistel and Gibbon 2000; Nevin 2012), so that multiple
response choices in the extinction phase are required to be
able to disconfirm probabilistic expectations learned in the
acquisition phase. Thus, partial reinforcement (PRF) schedules
require more responses than continuous reinforcement (CRF)
schedules for such disconfirmation to be possible.
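The omitted-reinforcer argument can be illustrated with a small worked calculation. The sketch below is not the model presented in this paper; it assumes, purely for illustration, that responding stops once a fixed number of expected reinforcers has been omitted, and that each unrewarded response omits a fraction of a reinforcer equal to the reward probability used in acquisition. The function name `responses_to_omit` and the criterion `THRESHOLD` are hypothetical.

```python
import math


def responses_to_omit(expected_omissions: float, reward_prob: float) -> int:
    """Unrewarded responses needed before `expected_omissions` expected
    reinforcers have been omitted, given that each response was rewarded
    with probability `reward_prob` during acquisition (illustrative only)."""
    if not 0.0 < reward_prob <= 1.0:
        raise ValueError("reward_prob must be in (0, 1]")
    # Each unrewarded response omits `reward_prob` expected reinforcers,
    # so the required number of responses scales with 1 / reward_prob.
    return math.ceil(expected_omissions / reward_prob)


THRESHOLD = 5.0  # hypothetical criterion, in omitted expected reinforcers

crf = responses_to_omit(THRESHOLD, 1.0)   # continuous reinforcement (CRF)
prf = responses_to_omit(THRESHOLD, 0.25)  # partial reinforcement (PRF)

print(f"CRF: {crf} responses, PRF: {prf} responses")
```

Under these assumptions the PRF schedule needs four times as many extinction responses as the CRF schedule to reach the same criterion, which is the sense in which a partial schedule predicts greater persistence.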
Nevertheless, controversies exist in the literature. A PREE,
given the above-mentioned comparison of CRF versus PRF
schedules, has been most consistently found in
between-subjects investigations (Mowrer and Jones 1945;
Grosslight and Child 1947; Svartdal 2008), i.e. when
one set of subjects is tested on the CRF schedule and a
different set of subjects is tested on the PRF schedule. The
PREE has also been found using a within-subjects design
(Kruse and Overmier 1982; Rescorla 1999; Nevin and Grace
2005a,b). However, within-subjects studies have also found a
reversed PREE (RPREE) phenomenon. In this case, responding
on the CRF schedule has actually been more resistant to
extinction than responding on the PRF schedule. The
contradictory PREE and RPREE findings have been described as
“[t]he outstanding difficulty” for PREE theory
(Case 2000, p. 93).
1.1 Theories for PREEs
There are several theories that attempt to address the
underlying process of partial reinforcement effects on acquisition
and extinction, including those that attempt to address the
contradictory PREE versus RPREE data, e.g. Nevin (1988);
Nevin and Grace (2000) and behavioural momentum theory,
and the sequential (...truncated)