Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration

Jan 2022

Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro-adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation, and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. 
We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

ARTICLE | OPEN | https://doi.org/10.1038/s42003-021-02810-x

David W. McKellar¹, Lauren D. Walter², Leo T. Song¹, Madhav Mantri³, Michael F. Z. Wang³, Iwijn De Vlaminck¹,⁴ ✉ & Benjamin D. Cosgrove¹,⁴ ✉

¹ Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY 14853, USA. ² Department of Molecular Biology & Genetics, Cornell University, Ithaca, NY 14853, USA. ³ Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA. ⁴ These authors contributed equally: Iwijn De Vlaminck, Benjamin D. Cosgrove.

COMMUNICATIONS BIOLOGY | (2021) 4:1280 | https://doi.org/10.1038/s42003-021-02810-x | www.nature.com/commsbio

Muscle stem cells (MuSCs) are essential for muscle homeostasis and repair. MuSCs are typically quiescent in homeostasis and are activated after muscle damage. Their subsequent proliferation, differentiation, commitment, and fusion replenish skeletal muscle tissue in a complex, coordinated process [1–3]. MuSCs are a rare cell type, accounting for less than 1% of the cells within skeletal muscle at homeostasis. Even rarer are the cell states that quiescent MuSCs transition through as they differentiate into myofibers. Consequently, MuSCs and muscle progenitor cells (myoblasts and myocytes) are difficult to study in their native tissue context. Conventional strategies to study MuSCs and muscle progenitor cells rely on enrichment by fluorescence-activated cell sorting using a transgenic reporter or prospective isolation markers [4].
These methods, however, are ill-suited to capturing the subtle, continuous cell state transitions that are critical for myogenesis, because highly stage-specific cell isolation markers are scarce and the cells themselves are rare.

Single-cell RNA sequencing (scRNAseq) enables a detailed characterization of cell types and states in complex tissues without the need for targeted cell enrichment [5–8]. Skeletal muscle has been the focus of a number of recent scRNAseq studies, which have aimed to catalog its dynamic and heterogeneous constituent cell types and the progression of myogenic stem and progenitor cell regulation in muscle development and repair [7]. Single-nucleus RNA sequencing (snRNAseq) has been used to capture transcriptomic signatures from mature myofiber nuclei, which are largely lost during the cell isolation required for scRNAseq [9–13]. Yet, despite advances in the scale of sc/snRNAseq technologies (10³–10⁴ cells per experiment), these methods still sample rare cell types and transient cell states poorly unless the cells are purified, and purification can introduce marker bias and technical artefacts [14]. For example, we previously used scRNAseq to study the dynamics of hindlimb skeletal muscle regeneration in adult mice and resolved ~12 muscle-resident cell types from ~35,000 single-cell transcriptomes [15]. However, we observed fewer than 100 committed and fusing myogenic cells even though we sampled key time points of myogenic differentiation post-injury [15]. Other studies similarly reported infrequent sampling of committed myogenic progenitors from whole muscle samples [15–17].

To overcome these challenges, we used large-scale integration of single-cell transcriptomics data. We measured ~95,000 single-cell transcriptomes from 23 new samples of regenerating mouse hindlimb muscles in older mice. We then leveraged recent improvements in batch-correction algorithms [18,19] to incorporate 88 publicly available sc/snRNAseq datasets from 18 prior studies in our analysis [9,11,15–17,20–32].
This led to a dataset that included ~365,000 cells/nuclei after quality filtering and allowed us to study the cellular composition and dynamics in response to skeletal muscle injury over a wide range of experimental conditions.

The depth of transcriptome coverage achieved by large-scale integration of single-cell transcriptomic data enabled us to robustly characterize rare, short-lived cell states on the myogenic cell differentiation trajectory. We identified transcription factors and surface markers that distinguish committed myoblasts (~5 per sample, on average) and fusing myocytes (~15 per sample, on average), which represent only 0.2% and 0.5% of all cells in the integrated muscle compendium, respectively. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated compendium as a reference to achieve a high-resolution, local deconvolution of cell subtypes. Our analysis brings insights into the dynamics of stromal and immune cell colocalization with transient myogenic cell states.

Results

Large-scale integration enables a high-resolution view of skeletal muscle. To profile skelet (...truncated)
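
The rarity figures quoted in the introduction (0.2% and 0.5% of ~365,000 cells, against ~5 and ~15 cells per sample) can be cross-checked with simple arithmetic. The sketch below is only a sanity check on the rounded numbers stated in the text. It assumes that each of the 23 + 88 datasets counts as one "sample", which the article does not state, and the names `TOTAL_CELLS`, `N_DATASETS`, and `implied_counts` are made up for this illustration.

```python
# Back-of-the-envelope check of the rarity figures quoted in the text.
# All inputs are the rounded values stated in the article; treating every
# dataset as one "sample" is an assumption made only for this sketch.

TOTAL_CELLS = 365_000      # "~365,000 cells/nuclei after quality filtering"
N_DATASETS = 23 + 88       # newly collected + publicly available datasets

# Reported share of all cells, written as fractions (0.2% and 0.5%).
REPORTED_FRACTION = {
    "committed myoblasts": 0.002,
    "fusing myocytes": 0.005,
}


def implied_counts(fraction, total_cells=TOTAL_CELLS, n_datasets=N_DATASETS):
    """Return (total cells, mean cells per dataset) implied by a fraction."""
    total = fraction * total_cells
    return total, total / n_datasets


if __name__ == "__main__":
    for state, fraction in REPORTED_FRACTION.items():
        total, per_dataset = implied_counts(fraction)
        print(f"{state}: ~{total:.0f} cells overall, ~{per_dataset:.1f} per dataset")
```

Under these assumptions the percentages imply roughly 730 committed myoblasts and 1,825 fusing myocytes in total, or about 6.6 and 16.4 per dataset. That is the same order as the ~5 and ~15 per sample quoted in the text; the remaining gap is plausibly due to rounding of the percentages and to samples not mapping one-to-one onto datasets.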


This is a preview of a remote PDF: https://www.nature.com/articles/s42003-021-02810-x.pdf
Article home page: https://www.nature.com/articles/s42003-021-02810-x

McKellar, David W., Walter, Lauren D., Song, Leo T., Mantri, Madhav, Wang, Michael F. Z., De Vlaminck, Iwijn, Cosgrove, Benjamin D. Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration. DOI: 10.1038/s42003-021-02810-x