WBSA: Web Service for Bisulfite Sequencing Data Analysis
Citation: Liang F, Tang B, Wang Y, Wang J, Yu C, et al. (
Fang Liang. 0
Bixia Tang. 0
Yanqing Wang 0
Jianfeng Wang 0
Caixia Yu 0
Xu Chen 0
Junwei Zhu 0
Jiangwei Yan 0
Wenming Zhao 0
Rujiao Li 0
Editor: Matteo Pellegrini, UCLA-DOE Institute for Genomics and Proteomics, United States of America
0 Beijing Institute of Genomics, Chinese Academy of Sciences , Beijing , China
Whole-Genome Bisulfite Sequencing (WGBS) and genome-wide Reduced Representation Bisulfite Sequencing (RRBS) are widely used to study DNA methylation. However, data analysis is complicated, lengthy, and hampered by a lack of seamless analytical pipelines. To address these issues, we developed a convenient, stable, and efficient web service called Web Service for Bisulfite Sequencing Data Analysis (WBSA) to analyze bisulfite sequencing data. WBSA focuses not only on CpG methylation, which is the most common biochemical modification in eukaryotic DNA, but also on non-CG methylation, which has been observed in plants, iPS cells, oocytes, neurons, and human stem cells. WBSA comprises three main modules: WGBS data analysis, RRBS data analysis, and differentially methylated region (DMR) identification. The WGBS and RRBS modules execute read mapping, methylation site identification, annotation, and advanced analysis, whereas the DMR module identifies DMRs and annotates their correlations with genes. WBSA can be used without charge, either online or as a local installation. Executables for the Portable Batch System (PBS) and standalone versions can be downloaded from the website together with the installation instructions. WBSA is available at no charge for academic users at http://wbsa.big.ac.cn.
Funding: This work was supported by the Natural Science Foundation of China (31000584 to RL; http://www.nsfc.gov.cn/Portal0/default152.htm) and the
Apparatus Function Innovation Program of the Chinese Academy of Sciences (201212 to WZ; http://www.cas.cn/). The funders had no role in study design, data
collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
Fang Liang and Bixia Tang contributed equally to this work.
DNA methylation plays an important role in cell differentiation,
X chromosome inactivation, and genomic imprinting through the
regulation of transcription, chromatin structure, and chromosome stability,
and it is also implicated in tumorigenesis [1,2,3,4,5,6]. DNA methylation research has
been accelerated by the development of next-generation
sequencing technology, and it is now the focus of research groups with
diverse interests [5,7,8,9,10,11,12]. There are four mainstream
sequencing-based methods for DNA methylation profiling: two
utilize enrichment of methylated DNA (Methylated DNA Binding
Domain sequencing, or MBD-seq [7] and Methylated DNA
Immunoprecipitation sequencing, or MeDIP-seq [8,9]), and the
other two utilize bisulfite conversion (MethylC-seq or WGBS [10]
and Reduced Representation Bisulfite Sequencing or RRBS [11]).
Reacting DNA with bisulfite converts cytosine residues to uracil
residues but does not alter 5-methylcytosine residues, which makes
it possible to distinguish methylated from unmethylated cytosine
residues. Because bisulfite sequencing detects single-base
changes, its resolution is higher than that of methods that rely on
enrichment of methylated DNA [12].
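The conversion chemistry described above can be illustrated with a minimal Python sketch. It is not part of WBSA; the function names `bisulfite_convert` and `call_methylation` and the toy sequence are hypothetical, and the sketch assumes a single forward strand, complete conversion, and no sequencing errors. Unmethylated cytosines are read as T after conversion and amplification, so comparing a read with the reference reveals the methylation state of each reference cytosine.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment and amplification of one strand.

    Unmethylated C is read as T (C -> U -> T after amplification);
    methylated C (5-methylcytosine) is protected and still read as C.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, converted_read):
    """Classify each reference cytosine from the base observed in the read."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, converted_read)):
        if ref_base != "C":
            continue  # only cytosines carry methylation information here
        if read_base == "C":
            calls[i] = "methylated"
        elif read_base == "T":
            calls[i] = "unmethylated"
    return calls


# Toy example (hypothetical data): only the C at index 1 is methylated.
reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions=[1])
print(read)                               # ACGTTTGA
print(call_methylation(reference, read))
# {1: 'methylated', 4: 'unmethylated', 5: 'unmethylated'}
```

Real data additionally require handling of the reverse strand, incomplete conversion, and sequencing errors, which this sketch deliberately ignores.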
WGBS and RRBS are widely used in biological research [5,12].
Aligning bisulfite sequencing reads differs from aligning
conventional sequencing reads because unmethylated C residues are read as
T residues. Therefore, alignment tools such as BSMAP [13], BS Seeker
[14], RMAP [15], and Bismark [16] were developed to
address this issue. Further, common alignment tools such as BWA
[17] and Bowtie [18,19] can also align bisulfite sequencing reads to a
reference sequence once both the reads and the reference sequences
have been converted [10]. Other tools are available for analyzing
bisulfite sequencing data, such as CyMATE [20], CpG
PatternFinder [21], GBSA [22], COHCAP [23], methylKit [24], and
BSmooth [25]. However, their applicability is limited because their
analytical pipelines either require aligned reads as input or
support only single-end alignments. Moreover, these tools either only identify
methylated cytosines or only analyze methylated CpG islands to
search for correlations between methylation and gene expression.
SAAP-RRBS [26] and RRBS-Analyser [27] integrate BSMAP as
the alignment tool and provide streamlined analysis and
annotation pipelines. However, these tools are designed
only for RRBS data and offer limited annotation. Certain tools
identify differentially methylated regions [24,25,27,28], but most
do not address non-CG methylation (Table 13).
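The conversion-based alignment strategy mentioned in this paragraph (aligning after both reads and reference have been converted) can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the implementation used by WBSA, BWA, Bowtie, or any other tool named here: the helper names `c_to_t`, `align_converted`, and `methylation_at_hit` are hypothetical, matching is exact-string only, and only the forward strand is considered (reverse-strand reads would need the analogous G-to-A conversion).

```python
def c_to_t(seq):
    """Fully convert a sequence in silico by replacing every C with T."""
    return seq.replace("C", "T")


def align_converted(read, reference):
    """Return all start positions where the converted read matches the
    converted reference exactly (no mismatches or gaps in this sketch)."""
    conv_read = c_to_t(read)
    conv_ref = c_to_t(reference)
    hits = []
    start = conv_ref.find(conv_read)
    while start != -1:
        hits.append(start)
        start = conv_ref.find(conv_read, start + 1)
    return hits


def methylation_at_hit(read, reference, start):
    """Recover methylation calls from the ORIGINAL read and reference.

    For every reference C covered by the read, True means the read still
    shows C (methylated) and False means it shows T (unmethylated).
    """
    return {
        start + i: read[i] == "C"
        for i in range(len(read))
        if reference[start + i] == "C"
    }


# Toy example (hypothetical data): the read comes from reference[2:10].
reference = "GGACGTCCGATT"
read = "ACGTTTGA"
hits = align_converted(read, reference)
print(hits)                                        # [2]
print(methylation_at_hit(read, reference, hits[0]))
# {3: True, 6: False, 7: False}
```

Converting both sequences removes the C/T mismatches introduced by bisulfite treatment, so a conventional aligner can place the read; the methylation state is then read off from the unconverted sequences at the aligned position.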
Here we describe WBSA, a novel, user-friendly
web service for analyzing bisulfite sequencing data. WBSA
focuses on the analysis of CpG as (...truncated)