ChIP-chip versus ChIP-seq: Lessons for experimental design and data analysis

BMC Genomics, Feb 2011

Background: Chromatin immunoprecipitation (ChIP) followed by microarray hybridization (ChIP-chip) or high-throughput sequencing (ChIP-seq) allows genome-wide discovery of protein-DNA interactions such as transcription factor binding and histone modifications. Previous reports compared only a small number of profiles, and little has been done to compare histone modification profiles generated by the two technologies or to assess the impact of input DNA libraries in ChIP-seq analysis. Here, we performed a systematic analysis of a modENCODE dataset consisting of 31 pairs of ChIP-chip/ChIP-seq profiles of the coactivator CBP, RNA polymerase II (RNA PolII), and six histone modifications across four developmental stages of Drosophila melanogaster.

Results: Both technologies produce highly reproducible profiles within each platform. ChIP-seq generally produces profiles with a better signal-to-noise ratio and allows detection of more peaks and narrower peaks. The sets of peaks identified by the two technologies can be significantly different, but the extent to which they differ varies depending on the factor and the analysis algorithm. Importantly, we found that there is significant variation among multiple sequencing profiles of input DNA libraries and that this variation most likely arises from differences in both experimental conditions and sequencing depth. We further show that using an inappropriate input DNA profile can impact the average signal profiles around genomic features and peak calling results, highlighting the importance of having high-quality input DNA data for normalization in ChIP-seq analysis.

Conclusions: Our findings highlight the biases present in each of the platforms, show the variability that can arise from both technology and analysis methods, and emphasize the importance of obtaining high-quality and deeply sequenced input DNA libraries for ChIP-seq analysis.
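To make the role of the input DNA library concrete, the following is a minimal sketch of one common way to normalize a binned ChIP-seq profile against an input profile: the input counts are rescaled to the ChIP sequencing depth and a log2 ratio is taken per bin. This is an illustration only and is not taken from the paper; the function name `log2_enrichment`, the bin counts, and the pseudocount of 1 are all assumptions.

```python
import math


def log2_enrichment(chip_counts, input_counts, chip_depth, input_depth,
                    pseudocount=1.0):
    """Per-bin log2(ChIP / depth-scaled input).

    chip_counts, input_counts: read counts over the same genomic bins.
    chip_depth, input_depth: total mapped reads in each library.
    pseudocount: hypothetical smoothing value that avoids division by zero.
    """
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input must cover the same bins")
    if chip_depth <= 0 or input_depth <= 0:
        raise ValueError("sequencing depths must be positive")
    # Rescale input counts so both libraries are compared at the ChIP depth.
    scale = chip_depth / input_depth
    return [
        math.log2((c + pseudocount) / (i * scale + pseudocount))
        for c, i in zip(chip_counts, input_counts)
    ]


# Toy example: two bins, equal depth in both libraries.
profile = log2_enrichment([10, 40], [10, 10], chip_depth=1000, input_depth=1000)
```

In this sketch, a shallow input library gives small and noisy per-bin input counts, so the ratio in low-coverage bins is dominated by the pseudocount. This is one simple way to see why sequencing depth of the input can affect the normalized signal.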

Joshua WK Ho (0, 2, 3), Eric Bishop (1, 2), Peter V Karchenko (0, 2, 3, 4), Nicolas Nègre (5), Kevin P White (5), Peter J Park (0, 2, 3, 4)

0 Department of Medicine, Brigham and Women's Hospital, and Harvard Medical School, Boston, MA, USA
1 Program in Bioinformatics, Boston University, Boston, MA, USA
2 Center for Biomedical Informatics, Harvard Medical School, Boston, MA, USA
3 Department of Medicine, Brigham and Women's Hospital, and Harvard Medical School, Boston, MA, USA
4 Informatics Program, Children's Hospital, Boston, MA, USA
5 Institute for Genomics and Systems Biology, University of Chicago, Chicago, IL, USA

Background

Chromatin immunoprecipitation (ChIP) followed by genomic tiling microarray hybridization (ChIP-chip) or massively parallel sequencing (ChIP-seq) are two of the most widely used approaches for genome-wide identification and characterization of in vivo protein-DNA interactions. They can be used to analyze many important DNA-interacting proteins including RNA polymerases, transcription factors, transcriptional co-factors, and histone proteins [1]. Indeed these genome-wide ChIP analysis approaches have led to many important discoveries related to transcriptional regulation [2-4], epigenetic regulation through histone modification [5], nucleosome organization [6,7], and interindividual variation in protein-DNA interactions [8,9].

ChIP-chip first appeared in the literature about 10 years ago and was one of the earliest approaches to performing genome-wide mapping of protein-DNA interactions in organisms with small genomes, such as yeast [2,10]. Currently, various tiling microarray platforms of common model organisms are well supported by commercial vendors, and many bioinformatics tools have been developed for ChIP-chip analysis [11-14].
Fueled by the rapid development of second-generation high-throughput sequencing technologies in the past few years, ChIP-seq has emerged as an attractive alternative to ChIP-chip [1]. For instance, ChIP-seq generally produces profiles with higher spatial resolution, dynamic range, and genomic coverage, giving it higher sensitivity and specificity than ChIP-chip in protein binding site identification. Further, ChIP-seq can be used to analyze virtually any species with a sequenced genome, since it is not constrained by the availability of an organism-specific microarray. Many current ChIP-seq protocols can work with a smaller amount of initial material than ChIP-chip [15,16]. Moreover, ChIP-seq is already a more cost-effective way of analyzing mammalian genomes, and this advantage will likely become more apparent as the cost of high-throughput sequencing continues to drop. These factors have led to the rapid adoption of ChIP-seq technology.

However, despite the widespread use of both ChIP-chip and ChIP-seq, only a few small-scale studies have attempted to quantitatively compare these technologies using real data. Euskirchen et al. [17] compared the STAT1 bin (...truncated)


This is a preview of a remote PDF: http://www.biomedcentral.com/content/pdf/1471-2164-12-134.pdf

Joshua WK Ho, Eric Bishop, Peter V Karchenko, Nicolas Nègre, Kevin P White, Peter J Park. ChIP-chip versus ChIP-seq: Lessons for experimental design and data analysis. BMC Genomics, 2011, 12:134. DOI: 10.1186/1471-2164-12-134