Self-supervised predictive learning accounts for cortical layer-specificity
Article
https://doi.org/10.1038/s41467-025-61399-5
Received: 21 May 2024
Accepted: 19 June 2025
Kevin Kermani Nejad1,2, Paul Anastasiades3, Loreen Hertäg4,5 & Rui Ponte Costa1,2,5
The neocortex constructs an internal representation of the world, but the
underlying circuitry and computational principles remain unclear. Inspired by
self-supervised learning algorithms, we propose a computational theory in
which layer 2/3 (L2/3) integrates past sensory input, relayed via layer 4, with
top-down context to predict incoming sensory stimuli. Learning is self-supervised by comparing L2/3 predictions with the latent representations of
actual sensory input arriving at L5. We demonstrate that our model accurately
predicts sensory information in context-dependent temporal tasks, and that
its predictions are robust to noisy and occluded sensory input. Additionally,
our model generates layer-specific sparsity, consistent with experimental
observations. Next, using a sensorimotor task, we show that the model’s L2/3
and L5 prediction errors mirror mismatch responses observed in awake,
behaving mice. Finally, through manipulations, we offer testable predictions to
unveil the computational roles of various cortical features. In summary, our
findings suggest that the multi-layered neocortex empowers the brain with
self-supervised predictive learning.
Internal models of the external world are believed to endow the brain
with the ability to predict incoming sensory information and select
appropriate action-outcome contingencies1. Internal models are
widely believed to be encoded in the neocortex2,3, whose hallmark
feature is its laminar organization, comprising six distinct layers.
Although much has been learned about the underlying cellular heterogeneity and connectivity of individual cortical layers, why the
neocortex relies on a multi-layered structure remains unclear4. Unraveling its function could shed light on the neocortical algorithms
responsible for building rich internal representations of the world.
Historically, it has been proposed that unsupervised learning in
sensory cortices underpins the development of intricate sensory
representations that are critical for driving behavior5–7. Self-supervised
learning is a form of unsupervised learning that leverages the inherent
structure or patterns within the data as the target for learning. A
common application of self-supervised learning is to predict the
incoming input given past information8–12. Importantly, self-supervised
learning algorithms learn representations that capture experimentally
observed latent representations while resulting in richer models of
input statistics12–16. However, learning in these models is often treated
as a black box; therefore, it remains to be determined whether the
brain is capable of employing such learning principles.
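The idea of predicting incoming input from past information can be illustrated with a minimal sketch. This is not the model proposed in this article: it assumes a toy sinusoidal signal, a two-tap linear predictor, and a plain delta-rule update, and the names `train_next_step_predictor`, `lr`, and `epochs` are hypothetical. The point it shows is that the teaching signal is the next sample of the data itself, so no external label is needed.

```python
import math


def train_next_step_predictor(sequence, lr=0.05, epochs=200):
    """Fit x[t] ~ w1*x[t-1] + w2*x[t-2] with a delta rule (illustrative only)."""
    w1, w2 = 0.0, 0.0
    for _ in range(epochs):
        for t in range(2, len(sequence)):
            prediction = w1 * sequence[t - 1] + w2 * sequence[t - 2]
            # Self-supervision: the target is the incoming input itself.
            error = sequence[t] - prediction
            w1 += lr * error * sequence[t - 1]
            w2 += lr * error * sequence[t - 2]
    return w1, w2


# Toy input (an assumption): a sinusoid, which satisfies
# x[t] = 2*cos(omega)*x[t-1] - x[t-2] exactly.
omega = 0.3
signal = [math.sin(omega * t) for t in range(200)]
w1, w2 = train_next_step_predictor(signal)

# Mean squared one-step prediction error after training.
mse = sum(
    (signal[t] - (w1 * signal[t - 1] + w2 * signal[t - 2])) ** 2
    for t in range(2, len(signal))
) / (len(signal) - 2)
```

Because the toy signal obeys an exact two-step recurrence, the learned weights should approach `2*cos(omega)` and `-1`. Richer inputs would need a nonlinear predictor, which this sketch deliberately leaves out.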
The traditional view of the neocortical microcircuit postulates a
sequential flow of sensory information. In this canonical view, sensory
input is relayed via the thalamus to layer 4 (L4) of the neocortex17,18. L4
subsequently forwards this information to layer 2/3 (L2/3), which is
thought to integrate ascending sensory information with top-down
modulatory input from higher-order cortical areas19–21. L2/3 in turn
projects to layer 5 (L5), which transmits the information to other brain
areas (Fig. 1a). However, growing evidence suggests that this model
1Centre for Neural Circuits and Behaviour, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom. 2Bristol Computational Neuroscience Unit, Intelligent Systems Lab, Faculty of Engineering, University of Bristol, Bristol BS8 1TH, United Kingdom. 3Department of
Translational Health Sciences, University of Bristol, Whitson Street, Bristol BS1 3NY, United Kingdom. 4Technische Universität Berlin & Bernstein Center for
Computational Neuroscience Berlin, 10115 Berlin, Germany. 5These authors jointly supervised this work: Loreen Hertäg and Rui Ponte Costa.
e-mail:
Nature Communications | (2025)16:6178
[Fig. 1 schematic, panels a and c. Labels: Canonical view, Updated view; Thal, Sensory, L4, L2/3, L5, L6, Top-down; time steps t-1, t, t+1.]
Fig. 1 | Information flow in neocortical circuits. a The canonical and updated view
of the neocortical microcircuit. Sensory input is initially processed by the thalamus,
which, in the classical view, exclusively targets layer 4 (L4). L4 subsequently relays
this information to layer 2/3 (L2/3). L2/3, in turn, combines L4 input with top-down
contextual input that is fed forward to layer 5 (L5). However, recent studies have
emphasized the need to update this view due to direct projections from sensory
thalamic nuclei to L5 pyramidal cells26 (green arrow). For the sake of clarity, we
omitted feedback connections from the schematic, which in our self-supervised
model are responsible for carrying error signals that drive learning (see main text
and Methods). b Onset latencies of postsynaptic potentials (PSP) by cortical depth.
An onset latency of 0 ms denotes the timing of sensory input (whisker deflection).
These results demonstrate the simultaneous activation of L4 and L5 neurons by the
thalamus (blue bands), indicating a direct thalamic input to L5, and a delayed
activation of L2/3 neurons (orange band). c Illustration of information flow of the
proposed self-supervised temporal learning in the neocortical microcircuit. L2/3,
informed by past sensory input from L4 and top-down contextual input, predicts
the current sensory input arriving in L5. The direct thalamic inputs to L5 provide
sensory input, which is used as a teaching signal to instruct the L2/3 predictive
model. Gabor-like gratings represent neuronal encoding of the sensory input, or its
prediction in the case of L2/3-to-L5 connections. Panel b adapted from Christine M
Constantinople and Randy M Bruno. Deep cortical layers are activated directly by the
thalamus. Science 340 (2013); reprinted with permission from AAAS.
does not capture the full diversity of connections in the neocortical
microcircuit18. A body of experimental work suggests that L5 pyramidal
cells receive direct thalamic input that can drive short-latency, sensory-evoked responses independently of activity within the cortical network
(Fig. 1b)22–26. These observations imply two distinct sensory-driven
pathways within the neocortex, one targeting L4 and the other L5
(Fig. 1a). However, why the cortex requires multiple inputs and the
computations supported by such parallel pathways remain unknown.
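One possible role for the two pathways, following the proposal summarized in the abstract, can be sketched in a few lines. This is a toy illustration under strong assumptions, not the authors' implementation: each layer is reduced to a single scalar, the L4 and L5 encodings are taken to be the identity, the sensory sequence is a made-up linear process driven by a binary context, and all names (`make_sequence`, `train_l23`, `w_l4`, `w_td`) are hypothetical. The delayed pathway via L4 supplies the past input, top-down input supplies the context, and the direct thalamic pathway to L5 supplies the current input that serves as the teaching signal for L2/3.

```python
import random


def make_sequence(n_steps, rng):
    """Toy context-dependent input (assumption): x[t] = 0.5*x[t-1] + 0.3*c[t]."""
    xs, contexts = [0.0], [0.0]
    for _ in range(1, n_steps):
        c = rng.choice([-1.0, 1.0])  # top-down context at time t
        xs.append(0.5 * xs[-1] + 0.3 * c)
        contexts.append(c)
    return xs, contexts


def train_l23(xs, contexts, lr=0.1, epochs=50):
    """Learn L2/3 weights from the mismatch between its prediction and L5."""
    w_l4, w_td = 0.0, 0.0
    for _ in range(epochs):
        for t in range(1, len(xs)):
            l4 = xs[t - 1]        # thalamus -> L4: past sensory input
            l5 = xs[t]            # thalamus -> L5: current sensory input
            l23 = w_l4 * l4 + w_td * contexts[t]  # L2/3 prediction of L5
            error = l5 - l23      # prediction error used as teaching signal
            w_l4 += lr * error * l4
            w_td += lr * error * contexts[t]
    return w_l4, w_td


rng = random.Random(0)
xs, contexts = make_sequence(500, rng)
w_l4, w_td = train_l23(xs, contexts)
```

In this toy setting the learned weights should recover the generating coefficients (0.5 for the past input, 0.3 for the context). Dropping either input stream would leave part of the current input unpredictable, which gives an intuition for why a predictor might need both.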
Inspired by this refreshed view of the canonical microcircuit and
the predictive capabilities of self-supervised machine learning
algorithms9,27, we propose a model in which L2/3, informed by past
sensory input from L4 and top-down context from higher- (...truncated)