Bi-directional effect of increasing doses of baclofen on reinforcement learning
Original Research Article
published: 22 July 2011
doi: 10.3389/fnbeh.2011.00040
BEHAVIORAL NEUROSCIENCE
Jean Terrier †, Andres Ort †, Cédric Yvon, Arnaud Saj, Patrik Vuilleumier and Christian Lüscher*
Department of Basic Neurosciences, Medical Faculty, University of Geneva, Geneva, Switzerland
Edited by:
Riccardo Brambilla, San Raffaele
Scientific Institute and University, Italy
Reviewed by:
Carmen Sandi, École Polytechnique
Fédérale de Lausanne, Switzerland
Nicola Canessa, Vita-Salute San
Raffaele University, Italy
*Correspondence:
Christian Lüscher, Department of Basic
Neurosciences, Medical Faculty,
University of Geneva, CMU 1, Rue
Michel Servet, CH-1211 Geneva,
Switzerland.
e-mail:
† Jean Terrier and Andres Ort have
contributed equally to this work.
In rodents as well as in humans, efficient reinforcement learning depends on dopamine (DA)
released from ventral tegmental area (VTA) neurons. It has been shown that in brain slices of
mice, GABAB-receptor agonists at low concentrations increase the firing frequency of VTA–DA
neurons, while high concentrations reduce the firing frequency. It remains unclear, however,
whether baclofen can modulate reinforcement learning in humans. Here, in a double-blind study
in 34 healthy human volunteers, we tested the effects of a low and a high dose of oral
baclofen, a high affinity GABAB-receptor agonist, in a gambling task associated with monetary
reward. A low (20 mg) dose of baclofen increased the efficiency of reward-associated learning
but had no effect on the avoidance of monetary loss. A high (50 mg) dose of baclofen on the
other hand did not affect the learning curve. At the end of the task, subjects who received 20 mg
baclofen p.o. were more accurate in choosing the symbol linked to the highest probability of
earning money compared to the control group (89.55 ± 1.39 vs. 81.07 ± 1.55%, p = 0.002). Our
results support a model where baclofen, at low concentrations, causes a disinhibition of DA
neurons, increases DA levels and thus facilitates reinforcement learning.
Keywords: instrumental learning, mesolimbic dopamine system, reward-prediction error, baclofen, bi-directional effect,
ventral tegmental area, anti-craving treatment
Introduction
In his paper on “The Law of Effect,” Thorndike stipulated that:
“of several responses made to the same situation, those which
are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with
the situation, so that, when it recurs, they will be more likely to
recur” (Thorndike, 1898). Since then, it has been suggested that the
mesolimbic dopamine (DA) system is involved in this learning by
coding for a “reward-prediction error” (Schultz et al., 1997). The
mesocorticolimbic DA system originates in the ventral tegmental
area (VTA), which projects to the nucleus accumbens (NAc) and
the prefrontal cortex. Under physiological conditions, mesocorticolimbic projections release DA in response to natural rewards such
as food and sex, which are critical for the survival of the species.
This process reflects the fact that it is important for an organism to learn the circumstances under which rewards are obtained
(Balland and Lüscher, 2008). When an external reward is delivered,
DA neurons emit a strong learning signal indicating whether the
value of the current state is better or worse than predicted (Schultz
et al., 1997), rather than euphoria or pleasure (Balland and Lüscher,
2008). This signal therefore allows rapid acquisition of predictive
cues and efficient behaviors that are successful in obtaining rewards
(Bechara et al., 1998).
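The reward-prediction error described above can be illustrated with a minimal sketch. It uses a Rescorla-Wagner-style update chosen here for illustration, not a model taken from this study; the function names, the learning rate, and the reward values are all assumptions. The error is the difference between the reward received and the value predicted, and the prediction moves by a fraction of that error.

```python
def update_value(value, reward, learning_rate=0.2):
    """Return (new_value, prediction_error) after one outcome.

    prediction_error > 0: the outcome was better than predicted.
    prediction_error < 0: the outcome was worse than predicted.
    The learning rate of 0.2 is an arbitrary illustrative choice.
    """
    prediction_error = reward - value
    new_value = value + learning_rate * prediction_error
    return new_value, prediction_error


def learn(rewards, learning_rate=0.2, initial_value=0.0):
    """Track the predicted value and the errors over a sequence of outcomes."""
    value = initial_value
    errors = []
    for reward in rewards:
        value, error = update_value(value, reward, learning_rate)
        errors.append(error)
    return value, errors
```

With a constant reward, the error shrinks toward zero as the prediction converges on the reward; once a reward is fully predicted, its omission produces a negative error.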
Evidence that this system can be pharmacologically modulated
by changes in DA function has been provided by Pessiglione et al.
(2006). In their study, human volunteers carried out a learning task
that involved money gains and losses while functional magnetic
resonance images (fMRI) were collected. When mesocorticolimbic
DA was boosted by L-DOPA, the participants learned faster and
earned more money. Conversely, when DA signaling was inhibited
by haloperidol, participants learned more slowly and earned less money
compared to the control group. Interestingly, no shift of the learning
curves was observed when participants were in the loss condition,
which suggests that other processes are involved in aversive learning.
In a separate study using the Iowa gambling task, an activation of
the ventral striatum has also been shown by fMRI (Li et al., 2010).
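One way to picture how a pharmacological change in DA function could shift a learning curve in the gain condition is a simple value-learning sketch with a softmax choice rule. This is an illustrative assumption, not the analysis of any study cited here: the drug effect is represented as a larger or smaller effective reward size, and the reward probabilities, learning rate, and temperature are hypothetical values. Updates use expected outcomes, so the curve is deterministic.

```python
import math


def choice_probability(q_better, q_worse, temperature=0.25):
    """Softmax probability of choosing the better of two symbols."""
    return 1.0 / (1.0 + math.exp((q_worse - q_better) / temperature))


def simulate_gain_pair(reward_size, trials=30, learning_rate=0.3,
                       p_better=0.8, p_worse=0.2):
    """Return the expected learning curve for one pair of symbols.

    reward_size stands in for the effective magnitude of the reward
    signal (assumption: larger when DA function is enhanced).
    p_better and p_worse are hypothetical reward probabilities.
    """
    q_better = 0.0
    q_worse = 0.0
    curve = []
    for _ in range(trials):
        p = choice_probability(q_better, q_worse)
        curve.append(p)
        # Expected update: each value moves toward its mean payoff,
        # weighted by how often that symbol is chosen on this trial.
        q_better += p * learning_rate * (p_better * reward_size - q_better)
        q_worse += (1.0 - p) * learning_rate * (p_worse * reward_size - q_worse)
    return curve


control = simulate_gain_pair(reward_size=1.0)
enhanced = simulate_gain_pair(reward_size=1.5)
```

Under these assumptions, the `enhanced` curve rises faster and ends higher than the `control` curve, which mirrors the direction of the shift described for the gain condition.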
The effect of DA on learning can be explained by modulation
of circuits within the mesocorticolimbic system that are involved in action planning and decision-making. In many mammals, at least two systems
exist to predict the value of an action: the planning or explicit
system, which takes a given situation, predicts an outcome and
evaluates that outcome; and the habit or implicit system, which
takes a given situation and identifies the best remembered action to
take (Redish et al., 2008). The flexible planning system involves the
ventral and dorsomedial striatum, the prelimbic medial prefrontal
cortex and the orbitofrontal cortex, as well as the entorhinal cortex
and hippocampus, with an involvement of DA inputs from the VTA.
The habit system involves the dorsolateral striatum, the infralimbic medial prefrontal cortex as well as the parietal cortex, with an
involvement of DA inputs from the pars compacta of the substantia
nigra (SNc; Redish et al., 2008). The mesocorticolimbic system thus
has a central role in evaluating the value of predicted outcomes
during decision-making and planning. An over-evaluation of a predicted value by the DA system might alter the decision system, leading to addictive behaviors (Redish et al., 2008). Another mechanism
leading to automatic decision-making and even addiction could be
the recruitment of the habit system by the NAc via feedback loops
to the dorsal striatum (Koob and Volkow, 2010). Understanding
how modulation of DA can alter valuation and decision-processing
therefore has profound implications for understanding motivated
behaviors and addiction. Here we propose to pharmacologically
modulate DA release with the GABAB-receptor agonist baclofen and
observe the effect of this modulation on an instrumental learning
task.
Baclofen (p-chlorophenyl-GABA) acts as a high-affinity
γ-aminobutyric acid type B (GABAB) receptor agonist. Its primary action as a spasmolytic agent is mediated by increasing K+
conductance that results in postsynaptic inhibition (Cruz et al.,
2004; Katzung, 2009). In addition, baclofen causes presynaptic
inhibition by reducing Ca2+ influx and the release probability of
excitatory transmitters in the brain and spinal cord (Katzung,
2009). Interestingly, baclofen may also modu (...truncated)