De novo characterization of a whitefly transcriptome and analysis of its gene expression during development
BMC Genomics
Research article
Xiao-Wei Wang 0
Jun-Bo Luan 0
Jun-Min Li 0
Yan-Yuan Bao 0
Chuan-Xi Zhang 0
Shu-Sheng Liu 0
0 Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University , Hangzhou 310029 , China
Background: Whitefly (Bemisia tabaci) causes extensive crop damage throughout the world by feeding directly on plants and by vectoring hundreds of species of begomoviruses. Yet little is understood about its genes involved in development, insecticide resistance, host range plasticity and virus transmission. Results: To facilitate research on whitefly, we present a method for de novo assembly of the whitefly transcriptome using short-read sequencing technology (Illumina). In a single run, we produced more than 43 million sequencing reads. These reads were assembled into 168,900 unique sequences (mean size = 266 bp), more than 10-fold the number of whitefly sequences deposited in GenBank (as of March 2010). Based on similarity search with known proteins, these analyses identified 27,290 sequences at an E-value cut-off of 10^-5. Assembled sequences were annotated with gene descriptions, gene ontology and clusters of orthologous group terms. In addition, we investigated the transcriptome changes during whitefly development using a tag-based digital gene expression (DGE) system. We obtained a sequencing depth of over 2.5 million tags per sample and identified a large number of genes associated with specific developmental stages and insecticide resistance. Conclusion: Our data provide the most comprehensive sequence resource available for whitefly study and demonstrate that Illumina sequencing allows de novo transcriptome assembly and gene expression analysis in a species lacking genome information. We anticipate that next-generation sequencing technologies hold great potential for the study of the transcriptome in other non-model organisms.
Background
The whitefly Bemisia tabaci (Gennadius) is a genetically
diverse complex containing some of the most destructive
invasive pests of many ornamental and glasshouse crops
worldwide [1,2]. The species complex colonizes more
than 600 different species of plants, transmits many plant
viruses, feeds on phloem sap, and promotes the growth of
damaging fungi on honeydew excretions deposited on
plants [3-6]. Recent phylogenetic analysis, combined with
a pattern of reproductive isolation among genetic groups
within B. tabaci, indicates that the complex contains at
least 24 cryptic species, some of which have been referred
to as "biotypes" in the last 20 years [7,8]. As the
separation at the species level within the B. tabaci complex is
yet to be fully resolved, we have retained the commonly
used term "biotype" to link this study with existing
literature. The most predominant and damaging biotypes of B.
tabaci are the B and Q biotypes [9,10]. While the former
is known for its high fitness parameters, the Q biotype
whitefly has a unique ability to develop and maintain high
levels of resistance to major classes of insecticides owing
to biological and genetic factors [11,12].
Despite its global importance, genomic sequence
resources available for the whitefly are scarce, especially
for the Q biotype. As of March 30th, 2010, there are
about 9,110 EST and 762 nucleotide sequences available
on NCBI for the B biotype whitefly, and only 683
nucleotide sequences have been deposited for the Q biotype
whitefly. The previous EST sequencing efforts for the B
biotype whitefly have allowed the development of
small-scale microarrays for gene expression analysis in the
context of insecticide resistance and parasitoid-whitefly
interactions [13-15]. While these studies have highlighted
the utility of cDNA sequencing for candidate gene
discovery in the absence of a genome sequence, a
comprehensive description of the genes expressed in the
insecticide-resistant Q biotype whitefly remains unavailable.
Over the past several years, next-generation
sequencing technology has emerged as a cutting-edge
approach for high-throughput sequence determination,
dramatically improving the efficiency and
speed of gene discovery [16,17]. For example, the
Illumina sequencing technology is able to generate over one
billion bases of high-quality DNA sequence per run at
less than 1% of the cost of capillary-based methods [18].
Furthermore, next-generation sequencing has also
significantly accelerated gene-expression profiling and
improved its sensitivity, and is expected to boost
collaborative and comparative genomics studies [19,20].
Previously, Illumina sequencing of transcriptomes for
organisms with completed genomes confirmed that the
relatively short reads produced can be effectively
assembled and used for gene discovery and comparison of gene
expression profiles [21,22]. Despi (...truncated)