Self-supervised predictive learning accounts for cortical layer-specificity

Article https://doi.org/10.1038/s41467-025-61399-5

Received: 21 May 2024
Accepted: 19 June 2025

Kevin Kermani Nejad1,2, Paul Anastasiades3, Loreen Hertäg4,5 & Rui Ponte Costa1,2,5

The neocortex constructs an internal representation of the world, but the underlying circuitry and computational principles remain unclear. Inspired by self-supervised learning algorithms, we propose a computational theory in which layer 2/3 (L2/3) integrates past sensory input, relayed via layer 4, with top-down context to predict incoming sensory stimuli. Learning is self-supervised by comparing L2/3 predictions with the latent representations of actual sensory input arriving at L5. We demonstrate that our model accurately predicts sensory information in context-dependent temporal tasks, and that its predictions are robust to noisy and occluded sensory input. Additionally, our model generates layer-specific sparsity, consistent with experimental observations. Next, using a sensorimotor task, we show that the model’s L2/3 and L5 prediction errors mirror mismatch responses observed in awake, behaving mice. Finally, through manipulations, we offer testable predictions to unveil the computational roles of various cortical features. In summary, our findings suggest that the multi-layered neocortex empowers the brain with self-supervised predictive learning.

Internal models of the external world are believed to endow the brain with the ability to predict incoming sensory information and select appropriate action-outcome contingencies1. Internal models are widely believed to be encoded in the neocortex2,3, whose hallmark feature is its laminar organization, comprising six distinct layers.
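The learning scheme summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy cyclic sequence of one-hot inputs, a fixed random linear map standing in for the L5 latent representation, and a multiplicative gating of L4 activity by a two-valued top-down context. The function name train_predictor and all of its parameters are hypothetical.

```python
import numpy as np


def train_predictor(n_patterns=6, latent_dim=4, steps=2000, lr=0.2, seed=0):
    """Toy self-supervised predictor (illustrative assumptions throughout).

    "L4" relays the past input, "L2/3" combines it with a top-down context
    to predict the latent of the current input, and "L5" supplies that
    latent as the teaching signal.
    """
    rng = np.random.default_rng(seed)
    # Assumption: L5 is a fixed random linear encoder of the current input.
    w_l5 = rng.normal(size=(latent_dim, n_patterns))
    # Learned L2/3 weights acting on context-gated L4 activity.
    w_l23 = np.zeros((latent_dim, 2 * n_patterns))
    errors = []
    idx = 0
    for _ in range(steps):
        context = int(rng.integers(2))       # 0: step forward, 1: step backward
        step = 1 if context == 0 else -1
        nxt = (idx + step) % n_patterns
        l4_past = np.eye(n_patterns)[idx]    # past sensory input via L4
        ctx = np.eye(2)[context]             # top-down context (one-hot)
        # Assumption: context gates L4 activity multiplicatively.
        features = np.outer(ctx, l4_past).ravel()
        prediction = w_l23 @ features        # L2/3 prediction of the L5 latent
        target = w_l5 @ np.eye(n_patterns)[nxt]  # L5 latent of the actual input
        err = prediction - target            # prediction error
        # Gradient step on 0.5 * ||prediction - target||^2.
        w_l23 -= lr * np.outer(err, features)
        errors.append(float(np.mean(err ** 2)))
        idx = nxt
    return w_l23, w_l5, errors


if __name__ == "__main__":
    _, _, errors = train_predictor()
    print("first error:", errors[0])
    print("mean of last 100 errors:", sum(errors[-100:]) / 100)
```

In this sketch the prediction error shrinks as the same (input, context) pairs recur, because the target comes from the data itself rather than from external labels. The fixed L5 encoder is a simplification chosen to keep the example short.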
Although much has been learned about the underlying cellular heterogeneity and connectivity of individual cortical layers, why the neocortex relies on a multi-layered structure remains unclear4. Unraveling its function could shed light on the neocortical algorithms responsible for building rich internal representations of the world.

Historically, it has been proposed that unsupervised learning in sensory cortices underpins the development of intricate sensory representations that are critical for driving behavior5–7. Self-supervised learning is a form of unsupervised learning that leverages the inherent structure or patterns within the data as the target for learning. A common application of self-supervised learning is to predict the incoming input given past information8–12. Importantly, self-supervised learning algorithms learn representations that capture experimentally observed latent representations while resulting in richer models of input statistics12–16. However, learning in these models is often treated as a black box; therefore, it remains to be determined whether the brain is capable of employing such learning principles.

The traditional view of the neocortical microcircuit postulates a sequential flow of sensory information. In this canonical view, sensory input is relayed via the thalamus to layer 4 (L4) of the neocortex17,18. L4 subsequently forwards this information to layer 2/3 (L2/3), which is thought to integrate ascending sensory information with top-down modulatory input from higher-order cortical areas19–21. L2/3 in turn projects to layer 5 (L5), which transmits the information to other brain areas (Fig. 1a). However, growing evidence suggests that this model does not capture the full diversity of connections in the neocortical microcircuit18. A body of experimental work suggests that L5 pyramidal cells receive direct thalamic input that can drive short-latency, sensory-evoked responses independently of activity within the cortical network (Fig. 1b)22–26. These observations imply two distinct sensory-driven pathways within the neocortex, one targeting L4 and the other L5 (Fig. 1a). However, why the cortex requires multiple inputs, and which computations such parallel pathways support, remains unknown.

1Centre for Neural Circuits and Behaviour, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom.
2Bristol Computational Neuroscience Unit, Intelligent Systems Lab, Faculty of Engineering, University of Bristol, Bristol BS8 1TH, United Kingdom.
3Department of Translational Health Sciences, University of Bristol, Whitson Street, Bristol BS1 3NY, United Kingdom.
4Technische Universität Berlin & Bernstein Center for Computational Neuroscience Berlin, 10115 Berlin, Germany.
5These authors jointly supervised this work: Loreen Hertäg and Rui Ponte Costa.

Nature Communications | (2025)16:6178

[Figure 1 image omitted. Panel labels: Canonical view, Updated view. Time axis: t-1, t, t+1. Circuit labels: Sensory, Thal, L4, L2/3, L5, L6, top-down.]

Fig. 1 | Information flow in neocortical circuits.
a The canonical and updated view of the neocortical microcircuit. Sensory input is initially processed by the thalamus, which, in the classical view, exclusively targets layer 4 (L4). L4 subsequently relays this information to layer 2/3 (L2/3). L2/3, in turn, combines L4 input with top-down contextual input and feeds the result forward to layer 5 (L5). However, recent studies have emphasized the need to update this view due to direct projections from sensory thalamic nuclei to L5 pyramidal cells26 (green arrow). For the sake of clarity, we omitted feedback connections from the schematic, which in our self-supervised model are responsible for carrying error signals that drive learning (see main text and Methods).
b Onset latencies of postsynaptic potentials (PSP) by cortical depth. An onset latency of 0 ms denotes the timing of sensory input (whisker deflection). These results demonstrate the simultaneous activation of L4 and L5 neurons by the thalamus (blue bands), indicating a direct thalamic input to L5, and a delayed activation of L2/3 neurons (orange band).
c Illustration of information flow of the proposed self-supervised temporal learning in the neocortical microcircuit. L2/3, informed by past sensory input from L4 and top-down contextual input, predicts the current sensory input arriving in L5. The direct thalamic inputs to L5 provide sensory input, which is used as a teaching signal to instruct the L2/3 predictive model. Gabor-like gratings represent neuronal encoding of the sensory input, or its prediction in the case of L2/3-to-L5 connections.
Panel b adapted from Christine M Constantinople and Randy M Bruno. Deep cortical layers are activated directly by the thalamus. Science 340 (2013); reprinted with permission from AAAS.

Inspired by this refreshed view of the canonical microcircuit and the predictive capabilities of self-supervised machine learning algorithms9,27, we propose a model in which L2/3, informed by past sensory input from L4 and top-down context from higher- (...truncated)