Abstract representations emerge naturally in neural networks trained to perform multiple tasks
Article
https://doi.org/10.1038/s41467-023-36583-0
Received: 14 June 2022
W. Jeffrey Johnston1,2 & Stefano Fusi1,2
Accepted: 7 February 2023
Humans and other animals demonstrate a remarkable ability to generalize
knowledge across distinct contexts and objects during natural behavior. We
posit that this ability to generalize arises from a specific representational
geometry, which we call abstract and which is referred to as disentangled in
machine learning. These abstract representations have been observed in
recent neurophysiological studies. However, it is unknown how they emerge.
Here, using feedforward neural networks, we demonstrate that the learning of
multiple tasks causes abstract representations to emerge under both supervised
and reinforcement learning. We show that these abstract representations
enable few-sample learning and reliable generalization on novel tasks. We
conclude that abstract representations of sensory and cognitive variables may
emerge from the multiple behaviors that animals exhibit in the natural world,
and, as a consequence, could be pervasive in high-level brain regions. We also
make several specific predictions about which variables will be represented
abstractly.
The ability to generalize existing knowledge to novel stimuli or situations is essential to complex, rapid, and accurate behavior. As an
example, when shopping for produce, humans make many different
decisions about whether or not different pieces of produce are ripe—
and, consequently, whether to purchase them. The knowledge we use
in the store is often learned from experience with that fruit at home—
thus, generalizing across distinct contexts. Further, the knowledge
that we apply to a fruit that we buy for the first time might be derived
from similar fruits—generalizing, for instance, from an apple to a pear.
The determinations themselves are often multi-dimensional and multisensory: both firmness and appearance are important for deciding
whether an avocado is the right level of ripeness. Yet, at the end of this
complex process, we make a binary decision about each piece of fruit:
we add it to our cart, or do not—and get feedback later about whether
that was the right decision. This produce shopping example is not
unique. Humans and other animals exhibit an impressive ability to
generalize across contexts and between different objects in many
situations.
The representational geometry of sensory and cognitive variables
in a population of neurons provides insight into the computations that
the representation may and may not facilitate1–3. We hypothesize that
the ability to generalize described above is tied to this representational
geometry. For instance, neural representations of sensory and cognitive
variables are often nonlinearly mixed together. As a result, these
representations have a high embedding dimension4–6. While this kind of
nonlinear dimensionality expansion allows flexible learning of new
behaviors5 and provides metabolically efficient and reliable
representations7, the resulting representation often does not permit
generalization across contexts or stimuli5,8. In contrast, factorized, or
even linear, representations of the relevant sensory or cognitive
variables (i.e., representations that have no nonlinear mixing) often permit
this generalization. Recent experimental work has shown that this kind
of factorized—and approximately linear—representation exists at the
apex of the primate ventral visual stream, for faces in inferotemporal
cortex9–11. Further, experimental work in the hippocampus and prefrontal
cortex has shown that representations of the sensory and
cognitive features related to a complex cognitive task also support
generalization8. We refer to representations of task-relevant sensory
and cognitive variables that support generalization (as in these
examples and others12–16) as abstract representations.
1Center for Theoretical Neuroscience, Columbia University, New York, NY, USA. 2Mortimer B. Zuckerman Mind, Brain and Behavior Institute, Columbia University, New York, NY, USA.
Nature Communications | (2023)14:1040
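The contrast between nonlinearly mixed and factorized representations can be made concrete with a toy calculation. The sketch below is illustrative only and is not the analysis used in the paper. It assumes two binary variables, uses a one-hot code over their four conjunctions as an extreme stand-in for a nonlinearly mixed, high-dimensional representation, and uses a simple linear decoder placed at the midpoint of the two class means. All function names are hypothetical. A decoder for the first variable is fit in one context (second variable fixed at -1) and evaluated in the other context (second variable fixed at +1).

```python
def factorized(a, b):
    """Each variable is written along its own axis (no nonlinear mixing)."""
    return [float(a), float(b)]


def one_hot(a, b):
    """One unit per conjunction of (a, b): an extreme nonlinearly mixed code."""
    index = 2 * int(a > 0) + int(b > 0)
    return [1.0 if i == index else 0.0 for i in range(4)]


def fit_prototype_decoder(points, labels):
    """Linear decoder whose boundary passes through the midpoint of the class means."""
    dim = len(points[0])
    pos = [p for p, y in zip(points, labels) if y > 0]
    neg = [p for p, y in zip(points, labels) if y < 0]
    mean_pos = [sum(p[i] for p in pos) / len(pos) for i in range(dim)]
    mean_neg = [sum(p[i] for p in neg) / len(neg) for i in range(dim)]
    weights = [mp - mn for mp, mn in zip(mean_pos, mean_neg)]
    midpoint = [(mp + mn) / 2 for mp, mn in zip(mean_pos, mean_neg)]
    bias = -sum(w * m for w, m in zip(weights, midpoint))
    return weights, bias


def cross_context_accuracy(encode):
    """Fit a decoder for `a` where b = -1, then test it where b = +1."""
    train = [(encode(a, -1), a) for a in (-1, 1)]
    test = [(encode(a, 1), a) for a in (-1, 1)]
    weights, bias = fit_prototype_decoder(
        [x for x, _ in train], [y for _, y in train]
    )
    correct = 0
    for x, y in test:
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        prediction = 1 if score > 0 else -1
        correct += prediction == y
    return correct / len(test)
```

Under these assumptions, `cross_context_accuracy(factorized)` is 1.0, while `cross_context_accuracy(one_hot)` is 0.5 (chance), because the decoder learned in the first context places no weight on the units that are active in the second.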
In the machine learning literature, abstract representations are
often referred to as factorized17 or disentangled10,17–20 representations
of interpretable stimulus features. Deep learning has been used to
produce abstract representations primarily in the form of unsupervised generative models18,21,22 (but see ref. 23). In this context,
abstract representations are desirable because they allow potentially
novel examples of existing stimulus classes to be produced by linear
interpolation in the abstract representation space (for example,
starting at a known exemplar and changing its orientation by moving
linearly along a dimension in the abstract representation space that is
known to correspond to orientation)18.
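The linear interpolation described above can be illustrated with a minimal sketch. It assumes a toy two-feature code in which each feature is written along its own orthogonal axis; the axes, feature names, and helper functions are hypothetical and do not correspond to any particular generative model.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def encode(features, axes):
    """Write each feature value along its own axis and sum the results."""
    dim = len(axes[0])
    return [sum(f * axis[i] for f, axis in zip(features, axes)) for i in range(dim)]


def read_out(representation, axis):
    """Recover one feature by projecting the representation onto its axis."""
    return dot(representation, axis) / dot(axis, axis)


def move_along(representation, axis, amount):
    """Shift the representation linearly along one feature axis."""
    return [r + amount * a for r, a in zip(representation, axis)]


# Hypothetical orthogonal axes for two features (orientation and size).
ORIENTATION_AXIS = [1.0, 1.0, 0.0, 0.0]
SIZE_AXIS = [0.0, 0.0, 1.0, -1.0]

# A known exemplar with orientation 0.2 and size 1.5 (arbitrary units).
exemplar = encode([0.2, 1.5], [ORIENTATION_AXIS, SIZE_AXIS])

# Change only the orientation by moving along the orientation axis.
rotated = move_along(exemplar, ORIENTATION_AXIS, 0.5)
```

In this toy code the orientation read-out of `rotated` is approximately 0.7 while the size read-out stays at 1.5. If the two axes were not orthogonal, the same shift would generally change both read-outs.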
Here, we ask how abstract representations—like those observed in
higher brain regions8,9—can be constructed from the nonlinear and
high-dimensional representations observed in early sensory
areas6,24–28. To study this, we begin by mirroring these high-dimensional
and nonlinear representations in a learned model of
continuous latent variables; then, we show that training feedforward
neural network models to perform multiple distinct classification tasks
on these latent variables induces abstract representations in a wide
variety of conditions.
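The setup described above can be sketched as a data-generating procedure. This is a minimal sketch under stated assumptions, not the authors' implementation: latent variables are drawn from a standard normal distribution, each classification task is a random hyperplane through the origin of latent space, and a fixed random projection followed by `tanh` stands in for the learned high-dimensional input model. All names and parameter values are hypothetical.

```python
import math
import random


def make_tasks(n_tasks, n_latents, rng):
    """Draw one random unit vector per task; each defines a binary partition."""
    tasks = []
    for _ in range(n_tasks):
        w = [rng.gauss(0.0, 1.0) for _ in range(n_latents)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        tasks.append([wi / norm for wi in w])
    return tasks


def task_labels(latents, tasks):
    """Return one 0/1 label per task: the side of each hyperplane the latents fall on."""
    return [
        1 if sum(wi * zi for wi, zi in zip(w, latents)) > 0 else 0
        for w in tasks
    ]


def make_input_model(n_units, n_latents, rng):
    """Assumption: a fixed random projection replaces the learned input model."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_latents)] for _ in range(n_units)]


def nonlinear_input(latents, projection):
    """High-dimensional, nonlinearly mixed representation of the latents."""
    return [
        math.tanh(sum(pij * zj for pij, zj in zip(row, latents)))
        for row in projection
    ]


def sample_batch(n_samples, n_latents, tasks, projection, rng):
    """Pairs of (input representation, per-task labels) for multi-task training."""
    batch = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_latents)]
        batch.append((nonlinear_input(z, projection), task_labels(z, tasks)))
    return batch
```

A feedforward network with one output unit per task would then be trained on such batches; the question posed in the text is what geometry its hidden layers develop as a result.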
Experimental work on animals performing more than a couple of
distinct behavioral tasks remains nearly nonexistent29. However,
modeling work using recurrent neural networks has shown that the
networks often develop representations that can be reused across
distinct, but related tasks30–32—though the abstractness of these
reusable representations was not measured. Thus, the behavioral
constraint of multi-tasking may encourage the learning of abstract
representations of stimulus features that are relevant to multiple tasks.
To investigate this hypothesis, we train feedforward neural network
models to perform multiple distinct tasks on a common stimulus
space. Previous work in machine learning has shown that similar multi-tasking
networks can achieve lower loss from the same number of
samples than networks trained independently on each task33 (and see
ref. 34), and that they can quickly learn novel, but related, tasks that are
introduced after training35. Both of these properties are hallmarks of
abstract representations—however, to our knowledge, the representational geometry developed by these multi-tasking networks has
not been characterized.
We begin by introducing the multi-tasking model and show that it
produces fully abstract representations that are surprisingly robust to
heterogeneity and context dependence in the learned tasks. These
representations also emerge in (...truncated)