Bi-directional effect of increasing doses of baclofen on reinforcement learning

Original Research Article
published: 22 July 2011
doi: 10.3389/fnbeh.2011.00040

BEHAVIORAL NEUROSCIENCE

Jean Terrier†, Andres Ort†, Cédric Yvon, Arnaud Saj, Patrik Vuilleumier and Christian Lüscher*

Department of Basic Neurosciences, Medical Faculty, University of Geneva, Geneva, Switzerland

Edited by: Riccardo Brambilla, San Raffaele Scientific Institute and University, Italy
Reviewed by: Carmen Sandi, École Polytechnique Fédérale de Lausanne, Switzerland; Nicola Canessa, Vita-Salute San Raffaele University, Italy

*Correspondence: Christian Lüscher, Department of Basic Neurosciences, Medical Faculty, University of Geneva, CMU 1, Rue Michel Servet, CH-1211 Geneva, Switzerland. e-mail:
†Jean Terrier and Andres Ort have contributed equally to this work.

In rodents as well as in humans, efficient reinforcement learning depends on dopamine (DA) released from ventral tegmental area (VTA) neurons. In brain slices of mice, GABAB-receptor agonists at low concentrations increase the firing frequency of VTA–DA neurons, while high concentrations reduce it. It remains unclear, however, whether baclofen can modulate reinforcement learning in humans. Here, in a double-blind study in 34 healthy human volunteers, we tested the effects of a low and a high dose of oral baclofen, a high affinity GABAB-receptor agonist, in a gambling task associated with monetary reward. A low (20 mg) dose of baclofen increased the efficiency of reward-associated learning but had no effect on the avoidance of monetary loss. A high (50 mg) dose, on the other hand, did not affect the learning curve. At the end of the task, subjects who received 20 mg baclofen p.o. were more accurate than the control group in choosing the symbol linked to the highest probability of earning money (89.55 ± 1.39% vs. 81.07 ± 1.55%, p = 0.002).
Our results support a model in which baclofen, at low concentrations, disinhibits DA neurons, increases DA levels, and thus facilitates reinforcement learning.

Keywords: instrumental learning, mesolimbic dopamine system, reward-prediction error, baclofen, bi-directional effect, ventral tegmental area, anti-craving treatment

Introduction

In his paper on “The Law of Effect,” Thorndike stipulated that: “of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur” (Thorndike, 1898). Since then, it has been suggested that the mesolimbic dopamine (DA) system is involved in this learning by coding for a “reward-prediction error” (Schultz et al., 1997).

The mesocorticolimbic DA system originates in the ventral tegmental area (VTA), which projects to the nucleus accumbens (NAc) and the prefrontal cortex. Under physiological conditions, mesocorticolimbic projections release DA in response to natural rewards such as food and sex, which are critical for the survival of the species. This process reflects the fact that it is important for an organism to learn the circumstances under which rewards are obtained (Balland and Lüscher, 2008). When an external reward is delivered, DA neurons emit a strong learning signal indicating whether the value of the current state is better or worse than predicted (Schultz et al., 1997), rather than signaling euphoria or pleasure (Balland and Lüscher, 2008). This signal therefore allows rapid acquisition of predictive cues and of efficient behaviors that are successful in obtaining rewards (Bechara et al., 1998).

Evidence that this system can be pharmacologically modulated by changes in DA function has been provided by Pessiglione et al. (2006).
In their study, human volunteers carried out a learning task that involved money gains and losses while functional magnetic resonance images (fMRI) were collected. When mesocorticolimbic DA was boosted by l-DOPA, the participants learned faster and earned more money. Conversely, when DA signaling was inhibited by haloperidol, participants learned more slowly and earned less money than the control group. Interestingly, no shift of the learning curves was observed when participants were in the loss condition, which suggests that other processes are involved in aversive learning. In a separate study using the Iowa gambling task, an activation of the ventral striatum has also been shown by fMRI (Li et al., 2010).

The effect of DA on learning can be explained by a modulation of the mesocorticolimbic system of circuits involved in action planning and decision-making. In many mammals, at least two systems exist to predict the value of an action: the planning or explicit system, which takes a given situation, predicts an outcome and evaluates that outcome; and the habit or implicit system, which takes a given situation and identifies the best remembered action to take (Redish et al., 2008). The flexible planning system involves the ventral and dorsomedial striatum, the prelimbic medial prefrontal cortex and the orbitofrontal cortex, as well as the entorhinal cortex and hippocampus, with an involvement of DA inputs from the VTA. The habit system involves the dorsolateral striatum, the infralimbic medial prefrontal cortex as well as the parietal cortex, with an involvement of DA inputs from the pars compacta of the substantia nigra (SNc; Redish et al., 2008). The mesocorticolimbic system thus has a central role in evaluating the value of predicted outcomes during decision-making and planning. An over-evaluation of a predicted value by the DA system might alter the decision system, leading to addictive behaviors (Redish et al., 2008).
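The reward-prediction-error account referred to above (Schultz et al., 1997) can be illustrated with a minimal sketch of a two-symbol gain-learning task. This is not the model used by the authors or by Pessiglione et al.; the reward probabilities (0.8 and 0.2), the learning rate `alpha`, the softmax temperature `beta`, and the `gain` factor standing in for a pharmacological change in the DA signal are all hypothetical values chosen only for illustration.

```python
import math
import random


def rpe_update(value, reward, alpha=0.2, gain=1.0):
    """One prediction-error update; returns (new_value, delta).

    `gain` is a hypothetical multiplier on the learning signal,
    used here as a stand-in for a boosted or dampened DA response.
    """
    delta = reward - value  # reward-prediction error: outcome minus prediction
    return value + alpha * gain * delta, delta


def simulate(n_trials=30, p_reward=(0.8, 0.2), alpha=0.2, beta=3.0,
             gain=1.0, seed=0):
    """Simulate choices between two symbols; symbol 0 is the better one.

    Returns the learned values and the fraction of trials on which
    the better symbol was chosen. All parameters are assumptions.
    """
    rng = random.Random(seed)
    values = [0.0, 0.0]
    n_best = 0
    for _ in range(n_trials):
        # Softmax (logistic) choice rule over the two learned values.
        p_first = 1.0 / (1.0 + math.exp(-beta * (values[0] - values[1])))
        choice = 0 if rng.random() < p_first else 1
        reward = 1.0 if rng.random() < p_reward[choice] else 0.0
        values[choice], _ = rpe_update(values[choice], reward, alpha, gain)
        n_best += 1 if choice == 0 else 0
    return values, n_best / n_trials


if __name__ == "__main__":
    for g in (1.0, 1.5):
        values, frac_best = simulate(gain=g)
        print(f"gain={g}: values={values}, fraction best choice={frac_best:.2f}")
```

With a constant reward, a larger `gain` moves the learned value toward the outcome in fewer trials, which is one simple way to picture a faster learning curve. Individual simulated runs are stochastic, so a single run should not be read as evidence for either direction.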
Another mechanism leading to automatic decision-making and even addiction could be the recruitment of the habit system by the NAc via feedback loops to the dorsal striatum (Koob and Volkow, 2010). Understanding how modulation of DA can alter valuation and decision-processing therefore has profound implications for understanding motivated behaviors and addiction.

Frontiers in Behavioral Neuroscience | www.frontiersin.org | July 2011 | Volume 5 | Article 40

Here we propose to pharmacologically modulate DA release with the GABAB-receptor agonist baclofen and observe the effect of this modulation on an instrumental learning task. Baclofen (p-chlorophenyl-GABA) acts as a high affinity γ-aminobutyric acid type B (GABAB) receptor agonist. Its primary action as a spasmolytic agent is mediated by increasing K+ conductance, which results in postsynaptic inhibition (Cruz et al., 2004; Katzung, 2009). In addition, baclofen causes presynaptic inhibition by reducing Ca2+ influx and the release probability of excitatory transmitters in the brain and spinal cord (Katzung, 2009). Interestingly, baclofen may also modu (...truncated)