Mapping and phasing of structural variation in patient genomes using nanopore sequencing

Nature Communications, Nov 2017

Despite improvements in genomics technology, the detection of structural variants (SVs) from short-read sequencing still poses challenges, particularly for complex variation. Here we analyse the genomes of two patients with congenital abnormalities using the MinION nanopore sequencer and a novel computational pipeline—NanoSV. We demonstrate that nanopore long reads are superior to short reads with regard to detection of de novo chromothripsis rearrangements. The long reads also enable efficient phasing of genetic variations, which we leveraged to determine the parental origin of all de novo chromothripsis breakpoints and to resolve the structure of these complex rearrangements. Additionally, genome-wide surveillance of inherited SVs reveals novel variants, missed in short-read data sets, a large proportion of which are retrotransposon insertions. We provide a first exploration of patient genome sequencing with a nanopore sequencer and demonstrate the value of long-read sequencing in mapping and phasing of SVs for both clinical and research applications.

DOI: 10.1038/s41467-017-01343-4

Mircea Cretu Stancu¹, Markus J. van Roosmalen¹, Ivo Renkens¹, Marleen M. Nieboer¹, Sjors Middelkamp¹, Joep de Ligt¹, Giulia Pregno², Daniela Giachino², Giorgia Mandrile², Jose Espejo Valle-Inclan¹, Jerome Korzelius¹, Ewart de Bruijn¹, Edwin Cuppen³, Michael E. Talkowski⁴,⁵,⁶, Tobias Marschall⁷,⁸, Jeroen de Ridder¹ & Wigard P. Kloosterman¹

¹ Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG Utrecht, The Netherlands.
² Medical Genetics Unit, Department of Clinical and Biological Sciences, University of Torino, Orbassano 10043, Italy.
³ Department of Genetics and Cancer Genomics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, 3584 CG Utrecht, The Netherlands.
⁴ Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA.
⁵ Department of Neurology, Harvard Medical School, Boston, MA 02115, USA.
⁶ Program in Population and Medical Genetics and Stanley Center for Psychiatric Research, The Broad Institute of M.I.T. and Harvard, Cambridge, MA 02142, USA.
⁷ Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany.
⁸ Max Planck Institute for Informatics, 66123 Saarbrücken, Germany.

Mircea Cretu Stancu and Markus J. van Roosmalen contributed equally to this work. Correspondence and requests for materials should be addressed to W.P.K.

NATURE COMMUNICATIONS | 8: 1326 | DOI: 10.1038/s41467-017-01343-4 | www.nature.com/naturecommunications

Second-generation DNA sequencing has become an essential technology for research and diagnosis of human genetic disease. Sequencing of human exomes has resulted in dramatic increases in novel gene discovery for Mendelian disorders¹, while whole-genome sequencing has revealed that a myriad of diseases are caused by genetic changes that can occur both within genes as well as in the noncoding genome². As a result, genome sequencing has seen rapid adoption in clinical decision making, as the complete picture of a patient’s unique mutation profile enables personalization of treatment strategies³,⁴. Robust methods to detect structural variants (SVs) in human genomes are essential, as SVs represent an important class of genetic variation that accounts for a far greater number of variable bases than single nucleotide variations (SNVs)⁵. Moreover, SVs have been implicated in a wide range of genetic disorders⁶.

A particularly revolutionary development in genome sequencing is the use of protein nanopores to measure DNA sequence directly and in real time⁷.
The first successful implementation of this principle in a consumer device was achieved in 2014 by Oxford Nanopore Technologies (ONT) with the introduction of the MinION⁸. The MinION can sequence stretches of DNA of up to hundreds of kilobases in length, which already resulted in the sequencing of the genomes of several organisms⁹,¹⁰. Because MinION-based sequencing requires almost no capital investment and current devices have a very small footprint, mainstream adoption of these sequencers has the potential to fundamentally change the current paradigm of sequencing in centralized centers.

An important and natural application of the long reads produced by nanopore sequencing is identifying SVs. Long-read sequencing is breaking ground for the discovery of SVs at an unprecedented scale and depth¹¹. The first success has been achieved using the Pacific BioSciences SMRT long-read sequencing platform¹²,¹³, and alternative methods to capture long-range information have been introduced, such as BioNano optical mapping¹⁴ and 10× Genomics linked-read technology¹⁵. While short-read next-generation sequencing data rely on multiple (often) indirect sources of information in order to accurately identify SVs, structural changes can be directly reflected in long-read data.

In this work, we demonstrate sequencing of the whole diploid human genomes of two patients on the MinION sequencer at 11–16× depth of coverage. The two patients suffer from congenital disease resulting from complex chromothripsis. We employ a novel computational pipeline to demonstrate the feasibility of using MinION reads to detect de novo complex SV breakpoints, at high sensitivity. The long reads from the MinION allow efficient phasing of genetic variations (SNVs as well as SVs) and enable us to resolve the long-range structure of the chromothripsis in the patients. Moreover, we identify a significant proportion of SVs that are not detected in short-read Illumina sequencing data of the same patient genomes.
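The claim that a structural change is directly reflected in a long read can be illustrated with a minimal sketch. It assumes a read has already been aligned as several segments, each carrying reference and read coordinates, and it compares consecutive segments to label the junction between them. The `Segment` fields, the `candidate_junctions` function, the labels, and the 50 bp `min_size` threshold are all hypothetical and chosen for illustration only; this is not the NanoSV algorithm, whose details are not given in this excerpt.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a long read (hypothetical, simplified record)."""
    chrom: str
    ref_start: int   # 0-based start on the reference
    ref_end: int     # end on the reference (exclusive)
    read_start: int  # start within the read, in original read orientation
    read_end: int    # end within the read (exclusive)
    strand: str      # "+" or "-"


def candidate_junctions(
    segments: List[Segment], min_size: int = 50
) -> List[Tuple[str, Segment, Segment]]:
    """Label junctions between consecutive segments of a single read."""
    ordered = sorted(segments, key=lambda s: s.read_start)
    calls = []
    for left, right in zip(ordered, ordered[1:]):
        if left.chrom != right.chrom:
            kind = "interchromosomal"
        elif left.strand != right.strand:
            kind = "inversion-like"
        else:
            # Bases of the read left unaligned between the two segments.
            read_gap = right.read_start - left.read_end
            # Reference distance travelled between the two segments,
            # measured in the direction the read runs along the reference.
            if left.strand == "+":
                ref_gap = right.ref_start - left.ref_end
            else:
                ref_gap = left.ref_start - right.ref_end
            if ref_gap <= -min_size:
                kind = "duplication-like"   # read jumps backwards on the reference
            elif ref_gap - read_gap >= min_size:
                kind = "deletion-like"      # reference bases missing from the read
            elif read_gap - ref_gap >= min_size:
                kind = "insertion-like"     # read bases absent from the reference
            else:
                continue                    # segments are effectively contiguous
        calls.append((kind, left, right))
    return calls


# Toy read: two segments on chr1 that are adjacent in the read
# but 3 kb apart on the reference.
example_read = [
    Segment("chr1", 1000, 5000, 0, 4000, "+"),
    Segment("chr1", 8000, 12000, 4000, 8000, "+"),
]
for kind, left, right in candidate_junctions(example_read):
    print(kind, left.ref_end, right.ref_start)
```

A practical caller would also cluster junctions across many reads and weigh mapping quality and sequencing error, all of which this sketch omits.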
Results

Sequencing of patient genomes with nanopore MinION. As a first step toward real-time clinical genome sequencing, we evaluated the use of the MinION device to sequence the genomes of two patients with multiple congenital abnormalities¹⁶, henceforth denoted as Patient 1 and Patient 2, respectively. We extracted DNA from patient cells and sequenced this on the MinION. For Patient 1, we used R7, R9, and R9.4 pore chemistries (Supplementary Table 1) generating a total of 8.2M template sequencing reads from 122 sequencing runs. For Patient 2, we exclusively used R9.4 runs and performed only 13 runs (1.89M reads), which required ~5 days of sequencing on seven parallel MinION instruments at a cost of around $7000 (Supplementary Fig. 1), and produced a coverage of 11×. We observed that 82.1% (Patient 1) and 98.9% (Patient 2) of these reads could be mapped to the human reference genome and were useful for further analyses. Read lengths were highly variable for Patient 1, as a result of differences in library prep methods, with a mean of 6.9 kb for template reads, while for Patient 2 we reached an average of 16.2 kb with consistent (...truncated)
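As a rough sanity check on the Patient 2 figures quoted above, depth of coverage can be approximated as mapped bases divided by genome size. The sketch below uses the read count, mean read length, and mapped fraction from the text; the 3.1 Gb genome size and the assumption that mapped reads share the overall mean length are not stated in the source, and the function name `estimate_depth` is hypothetical. The result comes out just under 10×, in the same range as the reported 11× but not identical to it, which is unsurprising given that the inputs are rounded summary values and the paper's own calculation is not shown in this excerpt.

```python
def estimate_depth(
    n_reads: float,
    mean_read_length_bp: float,
    mapped_fraction: float,
    genome_size_bp: float,
) -> float:
    """Approximate depth of coverage as mapped bases / genome size."""
    mapped_bases = n_reads * mean_read_length_bp * mapped_fraction
    return mapped_bases / genome_size_bp


# Patient 2 values quoted in the text; genome size is an assumption.
depth = estimate_depth(
    n_reads=1.89e6,              # 1.89M reads from 13 runs
    mean_read_length_bp=16_200,  # average read length of 16.2 kb
    mapped_fraction=0.989,       # 98.9% of reads mapped
    genome_size_bp=3.1e9,        # assumed haploid human genome size
)
print(f"approximate depth: {depth:.1f}x")
```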


This is a preview of a remote PDF: https://www.nature.com/articles/s41467-017-01343-4.pdf
Article home page: https://www.nature.com/articles/s41467-017-01343-4

Mircea Cretu Stancu, Markus J. van Roosmalen, Ivo Renkens, Marleen M. Nieboer, Sjors Middelkamp, Joep de Ligt, Giulia Pregno, Daniela Giachino, Giorgia Mandrile, Jose Espejo Valle-Inclan, Jerome Korzelius, Ewart de Bruijn, Edwin Cuppen, Michael E. Talkowski, Tobias Marschall, Jeroen de Ridder, Wigard P. Kloosterman. Mapping and phasing of structural variation in patient genomes using nanopore sequencing, Nature Communications, 2017, 8: 1326, DOI: 10.1038/s41467-017-01343-4