Nature Methods

List of Papers (Total 1,381)

Unraveling lncRNA diversity at a single cell resolution and in a spatial context across different cancer types

Long noncoding RNAs (lncRNAs) participate in gene regulation underlying development and disease. Overcoming inherent limitations of bulk sequencing lncRNA analysis, we leveraged single-cell and spatial transcriptomics (ST) data to analyze 219,442 potential lncRNAs identified by the TAR-scRNA-seq pipeline across 13 cancer types. The lncRNA functions were assessed by identifying...

AlphaFold as a prior: experimental structure determination conditioned on a pretrained neural network

Advances in machine learning have transformed structural biology, enabling swift and accurate prediction of protein structure from sequence. However, key challenges persist in modeling side-chain packing, condition-dependent conformational changes and biomolecular interactions, largely because of limited high-quality training data. At the same time, emerging experimental...

Quantifying uncertainty in protein representations across models and tasks

Biomolecular embeddings serve as efficient representations of sequence and structure, enabling tasks such as similarity searches, structure and function prediction and estimation of biophysical properties. However, relying on embeddings without assessing their ability to accurately represent biomolecules is a critical flaw—akin to using a scalpel in surgery without verifying its...

CREsted: modeling genomic and synthetic cell-type-specific enhancers across tissues and species

Sequence-based deep learning models have become the state of the art for analyzing the genomic regulatory code. Particularly for enhancers, these models excel at deciphering sequence grammar that underlies their activity. To enable end-to-end enhancer modeling and design, we developed a software package called CREsted (cis-regulatory element sequence training, explanation and...

A series of spontaneously blinking dyes for super-resolution microscopy

Spontaneously blinking fluorophores toggle between nonfluorescent and fluorescent forms without caging groups or redox buffers, enabling super-resolution imaging. The intrinsic blinking of such dyes is governed by molecular structure and modulated by environment; there is no one-size-fits-all fluorophore suitable for every imaging context. We report dyes with tuned on:off ratios...

PinkyCaMP: an mScarlet-based calcium sensor with enhanced brightness, photostability and multiplexing capabilities

Genetically encoded calcium (Ca2+) indicators (GECIs) are essential tools for monitoring neuronal activity, but the performance of red fluorescent GECIs has remained limited. In particular, many red indicators are relatively dim, produce low signal-to-noise ratios and can undergo unwanted photoswitching when exposed to blue light, restricting their use in all-optical experiments...

Direct RNA sequencing and signal alignment reveal RNA structure ensembles in a eukaryotic cell

The extent to which an RNA folds into structure ensembles and how different structures in the ensemble regulate eukaryotic gene expression is not fully understood. Here, we coupled chemical probing with direct RNA sequencing to identify structure modifications along a single RNA molecule (sm-PORE-cupine). We used direct signal alignment in addition to base mapping to increase the...

eSIG-Net: an interaction language model that decodes the protein code of single mutations

Most proteins act through interactions with other molecules, yet predicting how single mutations perturb these interactions—defined as ‘protein codes’—remains a central challenge in computational biology. Here we introduce eSIG-Net, the edgetic mutation sequence-based interaction grammar network, a language model that integrates protein sequence embeddings with syntax-aware and...

Resolving sensitivity, specificity and signal contamination in Xenium spatial transcriptomics

Spatial transcriptomics enables high-resolution gene expression mapping in intact tissues. Xenium is widely adopted for its reliability, accessibility and data quality, yet the properties and limitations of Xenium-derived data remain poorly characterized. Here we present one of the most comprehensive Xenium datasets so far, encompassing over 40 breast and lung tumor sections...

Differentiation of sphingomyelin and cholesterol by hyperspectral mid-infrared detection of single-bond vibrational modes in the fingerprint region

Lipids play a central role in a multitude of biological functions associated with cancer, obesity, diabetes, cardiovascular and neurological pathologies. However, sensing and mapping of lipid classes in living cells remains challenging. Here we introduce a label-free approach to lipid imaging, which differentiates lipid species in living cells by hyperspectral mid-infrared...

Compressing the collective knowledge of ESM into a single protein language model

Protein language models (PLMs) have recently emerged as a promising approach for next-generation variant-effect prediction (VEP). Most high-performing VEP methods currently utilize PLMs combined with additional information, such as homology, protein structure and population genetics data to improve prediction accuracy. This performance gain, however, comes with added complexity...

Tunable hydrogel-based micropillar arrays for myelination studies

Oligodendrocytes enable rapid central nervous system signaling by myelinating axons. Here, to model key biomechanical cues regulating myelination, we developed a tunable hydrogel-based micropillar array system that mimics the three-dimensional architecture and softness of axons. This platform supports the long-term culture of oligodendrocytes and robust formation of multilayered...

3d-OT: a deep geometry-aware framework for heterogeneous slices alignment of spatial multi-omics

The rapid advancement of spatial multi-omics technologies has unveiled opportunities for deciphering intricate spatial heterogeneity; however, current computational approaches struggle to comprehensively integrate diverse molecular and spatial information. Here we propose 3d-OT, a deep geometry-aware framework that leverages spatial geometric and multi-omics information for...

Clustering the protein universe of life using DIAMOND DeepClust

Relating billions of proteins across the tree of life remains a challenging task for comparative biosphere genomics and artificial intelligence-driven structure prediction. Here we present DIAMOND DeepClust, a cascaded, ultra-fast clustering method enabling planetary-scale organization of protein space, scaling to trillions of sequences while retaining sensitivity at low identity...

Integration of alternative fragmentation techniques into standard LC-MS workflows using a single deep learning model enhances proteome coverage

Bottom-up proteomics relies predominantly on collision-induced dissociation (CID) for peptide sequencing, which has achieved remarkable sensitivity and efficiency, now enabling single-cell analysis. However, CID shows limitations in characterizing post-translational modifications and complex proteoforms. Here we have developed an integrated mass spectrometry platform enabling...

LazySlide: accessible and interoperable whole-slide image analysis

Histopathological data are foundational in both biological research and clinical diagnostics but remain siloed from modern multimodal and single-cell frameworks. Here we introduce LazySlide, an open-source Python package built on the scverse ecosystem for efficient whole-slide image analysis and multimodal integration. By leveraging vision–language foundation models and adhering...

Isotonic and minimally invasive optical clearing media for live cell imaging ex vivo and in vivo

Tissue clearing has been widely used for fluorescence imaging of fixed tissues, but its application to live tissues has been limited by toxicity. Here we develop minimally invasive optical clearing media for fluorescence imaging of live mammalian tissues. Light scattering is minimized by adding spherical polymers with low osmolarity to the extracellular medium. A clearing medium...

AF2BIND: predicting small-molecule binding sites using the pair representation of AlphaFold2

Identification of small-molecule binding sites in proteins is an important task for drug discovery. Despite previous homology- and machine-learning-based approaches to this problem, true de novo binding-site prediction remains a challenge. Here we use features from a pretrained neural network to train a logistic regression model, AF2BIND, for accurate prediction of de novo...
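To illustrate the general idea named in this abstract (a logistic regression classifier trained on precomputed per-residue features), here is a minimal sketch. It is not the AF2BIND code: the two-dimensional feature vectors are made-up stand-ins for network-derived features, and the function names `train_logistic_regression` and `predict_proba` are hypothetical.

```python
import math


def predict_proba(weights, bias, x):
    """Return the logistic-model probability that feature vector x is a binding residue."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, steps=2000, lr=0.1):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = predict_proba(weights, bias, x) - y  # derivative of log-loss w.r.t. z
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy data (assumption): each row stands in for one residue's feature vector;
# label 1 = binding residue, label 0 = non-binding residue.
features = [[2.0, 1.5], [1.8, 2.2], [2.5, 1.9], [-1.0, -0.5], [-2.0, -1.2], [-0.7, -1.9]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic_regression(features, labels)
p_binding = predict_proba(weights, bias, [2.0, 2.0])
p_nonbinding = predict_proba(weights, bias, [-2.0, -2.0])
print(round(p_binding, 3), round(p_nonbinding, 3))
```

In a real setting the rows would be feature vectors extracted from the pretrained network rather than hand-written numbers, and a tested library implementation would likely replace the hand-rolled optimizer.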

Scaffolds with optimized quaternary symmetry for de novo cryoEM structure determination of small RNAs

Structured RNAs play many roles in cells and emerging biotechnology. While large RNAs and ribonucleoprotein complexes often benefit from high-resolution structural analysis through cryogenic-sample electron microscopy (cryoEM), single-domain RNAs, particularly those smaller than ~100 nt (33 kDa), have proven challenging. Here we address this methodological gap by engineering two...

High-throughput phenomics of global ant biodiversity

The big data era in biology is underway, but the study of organismal form has been slow to capitalize on advances in imaging and computation. Imaging approaches can digitize whole organisms, but low throughput has limited the effort to document morphological diversity. Here, within the open science initiative ‘Antscan’, we applied high-throughput synchrotron X-ray microtomography...

DECODE: deep learning-based common deconvolution framework for various omics data

Deconvolution algorithms estimate cell-type abundances from tissue-level data, enabling systematic cellular analysis of large cohorts. However, most deconvolution algorithms are specifically designed for single-omics data, thereby limiting their generalizability and scalability for various omics data from different cohorts. Here we present DECODE, a universal deconvolution...
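As a generic illustration of what deconvolution means here (not DECODE's deep learning method), the following sketch estimates cell-type fractions from a tissue-level profile by non-negative least squares, solved with projected gradient descent. The reference signature matrix, the bulk profile and the function name `estimate_fractions` are all hypothetical, and the step size `lr` was chosen by hand for this toy scale.

```python
def estimate_fractions(signature, bulk, steps=2000, lr=0.005):
    """Estimate non-negative cell-type fractions from a tissue-level profile.

    signature[g][k]: reference expression of gene g in cell type k.
    bulk[g]: measured tissue-level expression of gene g.
    Minimizes 0.5 * ||signature @ w - bulk||^2 subject to w >= 0.
    """
    n_genes = len(signature)
    n_types = len(signature[0])
    weights = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Residual between the reconstructed mixture and the observed profile.
        residual = [
            sum(signature[g][k] * weights[k] for k in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        # Gradient of the squared error with respect to each cell-type weight.
        grad = [
            sum(signature[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step, then projection onto the non-negative orthant.
        weights = [max(0.0, w - lr * d) for w, d in zip(weights, grad)]
    total = sum(weights)
    # Normalize so the estimates can be read as fractions.
    return [w / total for w in weights] if total > 0 else weights


# Toy reference (assumption): 4 genes x 3 cell types.
signature = [
    [10.0, 1.0, 0.0],
    [0.0, 8.0, 1.0],
    [1.0, 0.0, 9.0],
    [5.0, 5.0, 5.0],
]
# Bulk profile constructed as a 0.5 / 0.3 / 0.2 mixture of the three columns.
bulk = [5.3, 2.6, 2.3, 5.0]

fractions = estimate_fractions(signature, bulk)
print([round(f, 3) for f in fractions])
```

Because the toy bulk profile is an exact mixture of the reference columns, the estimate should recover fractions close to 0.5, 0.3 and 0.2. Real data are noisy and differently scaled, so the step size and iteration count would need retuning.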