An evolutionary computational theory of prefrontal executive function in decision-making
Etienne Koechlin
Cite this article: Koechlin E. 2014
An evolutionary computational theory
of prefrontal executive function in
decision-making. Phil. Trans. R. Soc. B 369:
20130474.
http://dx.doi.org/10.1098/rstb.2013.0474
One contribution of 18 to a Theme Issue
'The principles of goal-directed
decision-making: from neural mechanisms to
computation and robotics'.
Author for correspondence:
Etienne Koechlin
e-mail:
Etienne Koechlin
Institut National de la Santé et de la Recherche Médicale, Université Pierre et Marie Curie,
École Normale Supérieure, 29 rue d'Ulm, 75005 Paris, France
The prefrontal cortex subserves executive control and decision-making, that is,
the coordination and selection of thoughts and actions in the service of adaptive
behaviour. We present here a computational theory describing the evolution
of the prefrontal cortex from rodents to humans as gradually adding new
inferential Bayesian capabilities for dealing with a computationally intractable
decision problem: exploring and learning new behavioural strategies versus
exploiting and adjusting previously learned ones through reinforcement
learning (RL). We provide a principled account identifying three inferential
steps optimizing this arbitration through the emergence of (i) factual reactive
inferences in paralimbic prefrontal regions in rodents; (ii) factual proactive
inferences in lateral prefrontal regions in primates and (iii) counterfactual
reactive and proactive inferences in human frontopolar regions. The theory clarifies
the integration of model-free and model-based RL through the notion of
strategy creation. The theory also shows that counterfactual inferences in humans
give rise to the notion of hypothesis testing, a critical reasoning ability for
approximating optimal adaptive processes and presumably endowing humans with a
qualitative evolutionary advantage in adaptive behaviour.
1. Introduction
The prefrontal cortex subserves executive control and decision-making for
coordinating and selecting thoughts and actions in the service of adaptive behaviour.
Present in all mammals [1], the prefrontal cortex in rodents mainly reduces to
paralimbic brain regions including the orbitofrontal cortex (OFC) and
anterior cingulate cortex (ACC) [1]. In primates, the prefrontal cortex has evolved with
the development of lateral prefrontal regions (LPC) [2]. In humans, the LPC has
further evolved with the emergence of the left–right asymmetry giving rise to the
notion of Broca's area [3,4], which subserves human language [5], and bilaterally, in
its most anterior portion, a polar region [6,7] (lateral frontopolar cortex, lFPC)
which apparently has no homologues in monkeys [8,9] and subserves human
reasoning [10].
The prefrontal cortex forms loop circuits with basal ganglia. These
subcortical brain nuclei are common to vertebrates and include especially the striatum,
which subserves reinforcement learning (RL) [11–14]. RL and, more
specifically, temporal-difference RL algorithms are basic online adaptive processes
that adjust a behavioural strategy mapping stimuli onto actions according to
the discrepancy between actual and expected rewards. Importantly, RL is
both a very simple and robust adaptive process that can learn a variety of
complex tasks even in uncertain environments. In particular, when rewards only
depend upon current states and actions and each state is encountered
sufficiently often, RL converges towards the behavioural strategy maximizing
rewards [15]. Evidence in rodents, primates and humans indicates that the
ventral striatum processes reinforcing signals such as reward prediction errors that
serve to adjust stimulus–response associations, whereas the dorsal striatum in
relation to the premotor cortex processes stimulus–response
associations guiding action selection [13,16–18].
© 2014 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution
License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original
author and source are credited.
[Figure. Panel labels:]
At $t = 0$:
$$P_0 \sim \sum P_i \, P_i(o_0 \mid s_0, a_0), \qquad \lambda_0 = \frac{P_0(o_0 \mid s_0, a_0)}{P_0(o_0 \mid s_0, a_0) + \Gamma_0}, \qquad \Gamma_0 = 1$$
At $t \geq 1$:
$$\lambda_t = \frac{\lambda_{t-1} \, P_t(o_t \mid s_t, a_t)}{\lambda_{t-1} \, P_t(o_t \mid s_t, a_t) + (1 - \lambda_{t-1}) \, \Gamma_t}, \qquad \Gamma_t = \text{equiprobability of actor outcomes}$$
$$Q_t(s_t, a_t) \sim \mathrm{RL}(Q_{t-1}(s_t, a_t), o_t), \qquad P_t(o_t \mid s_t, a_t) \sim \text{outcome frequency}$$
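The RL adjustment described in this paragraph and the reliability estimate in the figure labels can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not the model's actual implementation. The class name `Actor`, the learning rate `alpha`, the Laplace-smoothed outcome counts, and the choice of Γ_t as 1 divided by the number of possible outcomes are all hypothetical choices made here. The RL step is reduced to a one-step prediction-error rule, that is, a temporal-difference update with no successor state.

```python
from collections import defaultdict


class Actor:
    """One behavioural strategy: RL values, outcome frequencies, reliability.

    Illustrative sketch only; names and parameter values are assumptions.
    """

    def __init__(self, n_outcomes=2, alpha=0.1, initial_reliability=0.5):
        self.n_outcomes = n_outcomes
        self.alpha = alpha                      # RL learning rate (assumed)
        self.reliability = initial_reliability  # plays the role of lambda_{t-1}
        self.q = defaultdict(float)             # Q(s, a), starts at 0
        # Outcome counts per (s, a); starting each count at 1 is an
        # assumption (Laplace smoothing) so that no outcome has probability 0.
        self.counts = defaultdict(lambda: [1.0] * n_outcomes)

    def outcome_prob(self, state, action, outcome):
        """P_t(o | s, a), estimated as a smoothed outcome frequency."""
        c = self.counts[(state, action)]
        return c[outcome] / sum(c)

    def update(self, state, action, outcome, reward):
        """Process one trial and return the updated reliability lambda_t."""
        # 1. Reliability: compare the actor's prediction of the observed
        #    outcome with the equiprobable alternative Gamma_t (assumed
        #    here to be 1 / n_outcomes).
        p = self.outcome_prob(state, action, outcome)
        gamma = 1.0 / self.n_outcomes
        lam = self.reliability
        self.reliability = lam * p / (lam * p + (1.0 - lam) * gamma)

        # 2. RL: move Q(s, a) towards the reward by a fraction alpha of
        #    the prediction error (actual minus expected reward).
        delta = reward - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * delta

        # 3. Predictive model: update the outcome frequency.
        self.counts[(state, action)][outcome] += 1.0
        return self.reliability


if __name__ == "__main__":
    actor = Actor()
    for _ in range(10):  # consistent outcomes: reliability rises
        lam = actor.update("s", "a", outcome=1, reward=1.0)
    print("after consistent outcomes:", lam)
    lam = actor.update("s", "a", outcome=0, reward=0.0)  # surprise
    print("after a surprising outcome:", lam)
```

In this sketch, reliability rises whenever the actor predicts the observed outcome better than chance and falls otherwise, which is the quantity an arbitration process could monitor when deciding whether to keep exploiting the current strategy.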
However, RL has severe adaptive limitations. The most
evident and crucial limitation is that learning new behavioural
strategies erases previously learned ones. Indeed, the ability
to store and re-use p (...truncated)