Highly-multiplexed barcode sequencing: an efficient method for parallel analysis of pooled samples

https://nar.oxfordjournals.org/content/38/13/e142.full.pdf

Andrew M. Smith (0,1,2), Lawrence E. Heisler (0,6), Robert P. St.Onge (4,5), Eveline Farias-Hesson (3), Iain M. Wallace (0,1), John Bodeau (8), Adam N. Harris (7), Kathleen M. Perry (8), Guri Giaever (0,2,6), Nader Pourmand (3,4), Corey Nislow (0,1,2)

0 Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, 160 College Street, Toronto, Ontario M5S 3E1
1 Banting and Best Department of Medical Research, University of Toronto, 112 College Street, Toronto, Ontario M5G 1L6
2 Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, Ontario M5S 1A8
3 Biomolecular Engineering, University of California at Santa Cruz, Santa Cruz, CA 95064
4 Stanford Genome Technology Center, Stanford University, Palo Alto, CA 94304
5 Department of Biochemistry, Stanford University, Stanford, CA 94305
6 Department of Pharmaceutical Sciences, University of Toronto, 144 College Street, Toronto, Ontario M5S 3M2, Canada
7 Life Technologies Corporation, 5791 Van Allen Way, Carlsbad, CA 92009, USA
8 Life Technologies Corporation, 850 Lincoln Centre Drive, Foster City, CA 94404

*To whom correspondence should be addressed. Tel: +1 416 946 8351; Fax: +1 416 978 8287; Email:
Correspondence may also be addressed to Nader Pourmand. Tel: +1 831 502 7315; Fax: +1 831 459 2891; Email:

© The Author(s) 2010. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Next-generation sequencing has proven an extremely effective technology for molecular counting applications where the number of sequence reads provides a digital readout for RNA-seq, ChIP-seq, Tn-seq and other applications.
The extremely large number of sequence reads that can be obtained per run permits the analysis of increasingly complex samples. For lower complexity samples, however, a point of diminishing returns is reached when the number of counts per sequence results in oversampling with no increase in data quality. A solution to making next-generation sequencing as efficient and affordable as possible involves assaying multiple samples in a single run. Here, we report the successful 96-plexing of complex pools of DNA barcoded yeast mutants and show that such Bar-seq assessment of these samples is comparable with data provided by barcode microarrays, the current benchmark for this application. The cost reduction and increased throughput permitted by highly multiplexed sequencing will greatly expand the scope of chemogenomics assays and, equally importantly, the approach is suitable for other sequence counting applications that could benefit from massive parallelization.

Next-generation sequencing (NGS) technologies can generate up to several hundred million reads of DNA sequence per lane or slide, and this capacity continues to increase at a rapid pace. This massive capacity has allowed exploration of diverse biological questions (1–4). Although pooled chemogenomic screens of compound–gene interactions in yeast (5–16) and mammalian cells (17,18) are typically assessed using barcode microarrays, counting of individual strains could also be assessed by barcode sequencing. We recently developed such an assay (Bar-seq) to monitor thousands of gene–chemical interactions (19). We now expand upon this proof-of-principle to interrogate 96 samples in parallel, developing the methodology and analytical tools to use NGS to simultaneously monitor several hundred thousand gene–environment interactions using a method that should be readily adaptable to an automated workflow.
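The oversampling argument can be illustrated with a rough read-budget calculation. The sketch below is not taken from the paper: the total read count is a hypothetical figure chosen from the "several hundred million" range mentioned above, the function name is illustrative, and reads are assumed to be spread evenly across samples and barcodes, which real pools will not do exactly.

```python
def mean_reads_per_barcode(total_reads: int, n_samples: int, n_barcodes: int) -> float:
    """Average reads per barcode per sample, assuming a perfectly even split."""
    return total_reads / (n_samples * n_barcodes)


# Hypothetical run size (assumption, not a figure reported by the authors).
TOTAL_READS = 200_000_000
# Approximate number of barcoded strains per pool, as stated in the text.
N_BARCODES = 6200

if __name__ == "__main__":
    # Compare a single sample per run with 20-plex and 96-plex designs.
    for n_samples in (1, 20, 96):
        depth = mean_reads_per_barcode(TOTAL_READS, n_samples, N_BARCODES)
        print(f"{n_samples:>2} samples per run: about {depth:,.0f} reads per barcode")
```

Under these assumptions a single low-complexity sample receives far more reads per barcode than counting requires, whereas dividing the same run among 96 samples still leaves hundreds of reads per barcode.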
Here, we demonstrate successful multiplexing of samples obtained from 96 distinct pooled yeast growth assays, with each sample comprising 6200 uniquely barcoded yeast mutants. This 96-plex experiment represents a 150-fold increase in unique observations over our proof-of-principle assessment, and provides a substantial cost reduction per experiment over microarrays. Furthermore, while many aspects of microarray assay costs are fixed, the cost of multiplex barcode sequencing continues to decline as the number of reads per experiment increases. Indeed, this increase in sequencing rate has recently been shown to outpace the rate of Moore's law (20). To assess the data quality at this level of multiplexing, all 96 samples were also assessed by microarray and we then compared the ability of both platforms to detect specific compound–gene interactions. We expect that the principle of this 96-fold multiplexing application, with its ability to discriminate many sample types per slide or flow cell, can be applied, with modification, to other molecular counting methods such as RNA-seq (21), ChIP-seq (22), promoter assays (23), histone occupancy (24) and Tn-seq (25).

To systematically test highly multiplexed Bar-seq, we required a large pool of distinct sequences whose relative abundances could be varied and whose quantities could also be assessed by an orthogonal method. The Yeast Knock Out collection of 6200 Saccharomyces cerevisiae mutants, although designed for testing gene function, provides a suitable test bed for new sequencing methods (19). Each yeast deletion mutant contains three salient features: a dominant drug resistance marker replacing the deleted gene; two unique 20-base molecular barcodes; and universal primers that flank each barcode to allow amplification of all barcodes in a pooled manner using a single set of primers.
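Because every barcode sits next to a common primer, counting strains reduces to locating the primer in each read and tallying the 20 bases that follow. The sketch below illustrates that idea only and is not the authors' pipeline: the primer sequence, barcode sequences and strain names are made up, reads are assumed to be plain base-space strings, and only exact matches are counted.

```python
from collections import Counter

# Hypothetical common primer and barcode table (assumptions for illustration).
COMMON_PRIMER = "GATTACAGATTACA"
BARCODE_LEN = 20
KNOWN_BARCODES = {
    "ACGTACGTACGTACGTACGT": "strain_A",
    "TTTTGGGGCCCCAAAATTTT": "strain_B",
}


def extract_barcode(read, primer=COMMON_PRIMER, barcode_len=BARCODE_LEN):
    """Return the bases immediately after the primer, or None if unavailable."""
    start = read.find(primer)
    if start == -1:
        return None  # primer not present in this read
    begin = start + len(primer)
    barcode = read[begin:begin + barcode_len]
    # Reject reads that end before a full-length barcode has been read.
    return barcode if len(barcode) == barcode_len else None


def count_barcodes(reads, known_barcodes=KNOWN_BARCODES):
    """Tally reads per strain, ignoring reads whose barcode is not in the table."""
    counts = Counter()
    for read in reads:
        barcode = extract_barcode(read)
        if barcode in known_barcodes:
            counts[known_barcodes[barcode]] += 1
    return counts


example_reads = [
    COMMON_PRIMER + "ACGTACGTACGTACGTACGT",
    "CC" + COMMON_PRIMER + "ACGTACGTACGTACGTACGT" + "GG",
    COMMON_PRIMER + "TTTTGGGGCCCCAAAATTTT",
    COMMON_PRIMER + "ACGTACGT",          # truncated barcode, skipped
    "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",  # no primer, skipped
]
example_counts = count_barcodes(example_reads)
```

A production analysis would also need to tolerate sequencing errors, for example by allowing a small number of mismatches when matching a read to the barcode table.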
Pooled competitive growth assays are typically carried out on 6200 mutants, and their relative abundances inferred from the signal from a barcode microarray (5–16). The rapid pace of advances in sequencing depth has led us and others to explore diverse strategies for multiplexing samples for NGS (19,25–34). One essential element for multiplexing prior to sequencing is the incorporation (in this instance using modified primers during PCR) of a unique experimental indexing tag (see Supplementary Figure S1 for structure of PCR amplicon). Following PCR, the amplified DNA is purified and quantified, then pooled with amplicons derived from other samples with different indexing tags. The pooled PCR products are then purified from a single lane of a polyacrylamide gel, reducing costs and sample preparation time. Further, combining samples prior to purification reduces potential liquid transfer errors, providing greater uniformity, and also reduces the number of emulsion PCR reactions required prior to di-nucleotide sequencing on the SOLiD V3 instrument. In our 20- and 96-plex sequencing runs, two independent reads were obtained for each feature (Supplementary Figure S1): the first sequence read was primed from the P1 adapter sequence, capturing the sequence of the first common primer (U1) and the yeast barcode. The second sequencing read, pr (...truncated)