Functional analysis of structural variants in single cells using Strand-seq

Nature Biotechnology, Jan 2023

Somatic structural variants (SVs) are widespread in cancer, but their impact on disease evolution is understudied due to a lack of methods to directly characterize their functional consequences. We present a computational method, scNOVA, which uses Strand-seq to perform haplotype-aware integration of SV discovery and molecular phenotyping in single cells by using nucleosome occupancy to infer gene expression as a readout. Application to leukemias and cell lines identifies local effects of copy-balanced rearrangements on gene deregulation, and consequences of SVs on aberrant signaling pathways in subclones. We discovered distinct SV subclones with dysregulated Wnt signaling in a chronic lymphocytic leukemia patient. We further uncovered the consequences of subclonal chromothripsis in T cell acute lymphoblastic leukemia, which revealed c-Myb activation, enrichment of a primitive cell state and informed successful targeting of the subclone in cell culture, using a Notch inhibitor. By directly linking SVs to their functional effects, scNOVA enables systematic single-cell multiomic studies of structural variation in heterogeneous cell populations.

Nature Biotechnology | Article

https://doi.org/10.1038/s41587-022-01551-4

Received: 29 October 2021; Accepted: 7 October 2022; Published online: 24 November 2022

Hyobin Jeong, Karen Grimes, Kerstin K. Rauwolf, Peter-Martin Bruch, Tobias Rausch, Patrick Hasenfeld, Eva Benito, Tobias Roider, Radhakrishnan Sabarinathan, David Porubsky, Sophie A. Herbst, Büşra Erarslan-Uysal, Johann-Christoph Jann, Tobias Marschall, Daniel Nowak, Jean-Pierre Bourquin, Andreas E. Kulozik, Sascha Dietrich, Beat Bornhauser, Ashley D. Sanders & Jan O. Korbel

The mutational landscapes of numerous cancers were recently cataloged [1,2], revealing that somatic SVs represent around 55% of driver mutations [2,3]. Somatic mutational processes generate a broad spectrum of SVs, from simple (for example, deletions and inversions) to complex classes (for example, chromothripsis) [4–8], and these SVs are important drivers of malignancy, metastasis and relapse [9–12]. However, with the exception of focal deletions and amplifications, somatic SVs have proven difficult to characterize functionally in cancer genomic surveys [1–3,13]. Studies integrating transcriptome and whole genome sequencing (WGS) data have inferred SV functional outcomes [13–16], but these typically require large cohorts and do not account for intratumor heterogeneity (ITH) [3]. Instead, SV effects can be measured directly by reading both genotype and molecular phenotype in the same cell, using single-cell multiomics [17–21]. Several such methods have been developed [17–20], but these do not presently account for small (<10 Mb) somatic copy number alterations (SCNAs), balanced SVs and complex rearrangement events like chromothripsis [4,5,7,22], which has limited efforts to functionally characterize the most common class of driver mutations in cancer. To address this, we developed scNOVA (single-cell nucleosome occupancy and genetic variation analysis), a method enabling functional characterization of the full spectrum of somatic SV classes.
scNOVA uses Strand-seq [23] in two ways: (1) it uses the DNA fragmentation pattern resulting from micrococcal nuclease (MNase) digestion [23] to directly measure nucleosome occupancy (NO) and indirectly infer patterns of gene activity, and (2) it couples this ‘molecular phenotype’ with SVs discovered by single-cell tri-channel processing (scTRIP, which jointly models read-orientation, read depth and haplotype-phase [24]) in the same cell. MNase digests the linker DNA between nucleosomes, leaving nucleosome-protected DNA intact, to enable genome-wide inference of NO by measuring sequence read counts [25–28]. Previous work has shown that active enhancers and transcribed genes exhibit reduced NO [25–30]. However, the relationships between NO and SV landscapes in cancer remain unexplored. scNOVA addresses this by integrating SVs and NO along the genome of a cell, to functionally characterize SVs in heterogeneous samples.

Nature Biotechnology | Volume 41 | June 2023 | 832–844

Results

NO classifies cell types and predicts gene activity changes

Strand-seq data reveals NO. We hypothesized that NO patterns derived from MNase fragmentation during Strand-seq library preparation could represent a readout to allow functional characterization of SVs (Fig. 1a and Extended Data Fig. 1). To test this, we evaluated whether Strand-seq data revealed nucleosome positioning through comparison with bulk MNase-seq data. We used the NA12878 lymphoblastoid cell line (LCL), for which both data types are available, and pooled 95 Strand-seq libraries (sequenced to a median of 540,379 mapped nonduplicate reads per single cell [31]; Supplementary Table 1) into a ‘pseudobulk’ track, allowing direct comparison with the corresponding MNase-seq dataset (sequenced to 19-fold genomic coverage [32]).
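The pseudobulk pooling described above can be illustrated with a minimal Python sketch. It is not the scNOVA implementation: the bin width, the function names (`bin_counts`, `pseudobulk`, `normalize`) and the toy read positions are all assumptions made for illustration, and a real analysis would read aligned fragments from the sequencing data rather than from plain lists of positions. The sketch only shows the idea that per-cell read counts in genomic bins are summed across libraries and then depth-normalized, so that the pooled track can be compared with a bulk track of different sequencing depth.

```python
from collections import Counter

BIN_SIZE = 200  # hypothetical bin width in bp, chosen only for this example


def bin_counts(read_starts, bin_size=BIN_SIZE):
    """Count reads per fixed-width genomic bin for one single-cell library."""
    counts = Counter()
    for pos in read_starts:
        counts[pos // bin_size] += 1
    return counts


def pseudobulk(libraries, bin_size=BIN_SIZE):
    """Sum per-bin read counts across all single-cell libraries."""
    pooled = Counter()
    for read_starts in libraries:
        # Counter.update adds counts instead of overwriting them
        pooled.update(bin_counts(read_starts, bin_size))
    return pooled


def normalize(track):
    """Scale counts to reads per million pooled reads."""
    total = sum(track.values())
    return {b: c * 1e6 / total for b, c in track.items()}


# Toy input: read start positions (bp) for three hypothetical cells
cells = [[10, 150, 420], [30, 410, 430, 650], [190, 405]]
pooled = pseudobulk(cells)
track = normalize(pooled)
```

In this toy input, bins 0 and 2 each collect four reads and bin 3 collects one, so the pooled track is far denser than any single-cell track, which is the reason for pooling sparse single-cell libraries before comparing them with bulk data.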
We measured NO along the genome (Methods) and found Strand-seq and MNase-seq were highly concordant in terms of uniformity of coverage and inferred nucleosome positions at DNase I hypersensitive sites (Spearman’s r = 0.68) (Fig. 1b,c). Nucleosome positioning near the binding site of CTCF [26,28] (a key chromatin organizer) closely matched between both assays (Fig. 1d and Supplementary Fig. 1), and estimated nucleosome repeat lengths [28] were highly concordant (Supplementary Fig. 1). In addition, both assays measured NO in all 15 chromatin states identified by the Roadmap Epigenome Consortium [33]. Among these chromatin states, Strand-seq and MNase-seq revealed the highest NO signals on average for the polycomb-repressed state and the bivalent enhancer state, whereas the lowest average NO signals were consistently seen for the active transcription start site (TSS) state (Extended Data Fig. 2). This indicates that Strand-seq enables direct measurement of NO to reveal a ‘molecular readout’. We thus developed the scNOVA framework, which harnesses Strand-seq to measure NO genome-wide and couples this with SVs discovered in the same sequenced cell (Fig. 1a).
As Strand-seq resolves its measurements by haplotype [31], we considered that haplotype-specific differences in NO (haplotype-specific NO) resulting from random monoallelic expression, germline SNPs and local effects of SVs could be harnessed for scNOVA. To assess the utility of haplotype-resolved NO, we phased 24,652,658 of the 49,205,197 (50.1%) NA12878 Strand-seq read fragments, and pooled these reads to generate pseudobulk NO tracks for each chromosomal haplotype (denoted ‘H1’ and ‘H2’, respectively; Fig. 1b). Using the female-derived NA12878 cell line, we compared haplotype-specific NO to haplotype-resolved gene expression measureme (...truncated)
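Two of the quantities reported in this section, the rank correlation between assays and the phased read fraction, can be reproduced in form with a short Python sketch. The signal values below are invented toy numbers, not data from the paper, and the helper names (`ranks`, `pearson`, `spearman`) are assumptions for illustration; only the two read-fragment totals are taken from the text. Spearman's correlation is computed here as the Pearson correlation of average ranks, which is the standard definition.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes both inputs have nonzero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy NO signal at five hypothetical sites (not values from the paper)
strand_seq = [5.0, 1.0, 3.0, 8.0, 2.0]
mnase_seq = [4.0, 2.0, 1.0, 9.0, 3.0]
rho = spearman(strand_seq, mnase_seq)

# Phased read fraction, using the fragment counts stated in the text
phased_fraction = 24_652_658 / 49_205_197
phased_percent = round(100 * phased_fraction, 1)
```

A rank-based measure is a natural choice for this kind of comparison because two assays sequenced to very different depths need not produce signals on the same scale, while the ordering of sites from low to high occupancy should still agree.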


Article PDF: https://www.nature.com/articles/s41587-022-01551-4.pdf
Article home page: https://www.nature.com/articles/s41587-022-01551-4

Jeong, Hyobin, Grimes, Karen, Rauwolf, Kerstin K., Bruch, Peter-Martin, Rausch, Tobias, Hasenfeld, Patrick, Benito, Eva, Roider, Tobias, Sabarinathan, Radhakrishnan, Porubsky, David, Herbst, Sophie A., Erarslan-Uysal, Büşra, Jann, Johann-Christoph, Marschall, Tobias, Nowak, Daniel, Bourquin, Jean-Pierre, Kulozik, Andreas E., Dietrich, Sascha, Bornhauser, Beat, Sanders, Ashley D., Korbel, Jan O. Functional analysis of structural variants in single cells using Strand-seq. Nature Biotechnology. DOI: 10.1038/s41587-022-01551-4