Global comparative analysis of ESTs from the southern cattle tick, Rhipicephalus (Boophilus) microplus
Minghua Wang (4), Felix D Guerrero (3), Geo Pertea (1), Vishvanath M Nene (1)
1 The J. Craig Venter Institute, 9712 Medical Center Drive, Rockville, Maryland 20850, USA
3 USDA-ARS, Knipling Bushland U.S. Livestock Insect Research Laboratory, 2700 Fredericksburg Rd., Kerrville, TX 78028, USA
4 Lorus Therapeutics Inc, 2 Meridian Road, Toronto, ON M9W 4Z7, Canada
Background: The southern cattle tick, Rhipicephalus (Boophilus) microplus, is an economically important parasite of cattle and can transmit several pathogenic microorganisms to its cattle host during the feeding process. Understanding the biology and genomics of R. microplus is critical to developing novel methods for controlling these ticks.
Results: We present a global comparative genomic analysis of a gene index of R. microplus comprised of 13,643 unique transcripts assembled from 42,512 expressed sequence tags (ESTs), a significant fraction of the complement of R. microplus genes. The source material for these ESTs consisted of polyA RNA from various tissues, lifestages, and strains of R. microplus, including larvae exposed to heat, cold, host odor, and acaricide. Functional annotation using RPS-Blast analysis identified conserved protein domains in the conceptually translated gene index and assigned GO terms to those database transcripts which had informative BlastX hits. Blast Score Ratio and SimiTri analysis compared the conceptual transcriptome of the R. microplus database to other eukaryotic proteomes and EST databases, including those from 3 ticks. The most abundant protein domains in BmiGI were also analyzed by SimiTri methodology.
Conclusion: These results indicate that a large fraction of BmiGI entries have no homologs in other sequenced genomes. Analysis with the PartiGene annotation pipeline showed 64% of the members of BmiGI could not be assigned GO annotation, thus minimal information is available about a significant fraction of the tick genome. This highlights the important insights in tick biology which are likely to result from a tick genome sequencing project. Global comparative analysis identified some tick genes with unexpected phylogenetic relationships which detailed analysis attributed to gene losses in some members of the animal kingdom. Some tick genes were identified which had close orthologues to mammalian genes. Members of this group would likely be poor choices as targets for development of novel tick control technology.
Background
Rhipicephalus (Boophilus) microplus, the tropical or southern cattle tick, is one of the most economically important tick vectors of pathogens that affect the global cattle population [1]. The tick transmits protozoan (Babesia bovis
and Babesia bigemina) and prokaryotic (Anaplasma
marginale) organisms that cause babesiosis and anaplasmosis,
which can result in severe agricultural losses in milk and
beef production and restriction in traffic of livestock. The
impact of R. microplus upon the US cattle industry was
such that the US Department of Agriculture (USDA) led a
campaign in the mid-20th century which eradicated the
tick from the US [2]. The tick remains prevalent in Mexico
and, since over a million cattle are imported annually into
the US from Mexico, an extensive USDA quarantine
program is in place to keep Boophilus ticks from reestablishing
in the US [3].
Acaricides play a critical role in maintaining the success of
the USDA quarantine program and in controlling tick
infestations in Mexico and other parts of the world.
However, reports of acaricide-resistant R. microplus populations
in Mexico [4,5] and R. microplus outbreaks in the US [6]
highlight the need for development of novel tick control
methodologies. Understanding the genome and the gene
expression profile of the tick should facilitate the
development of these control technologies. Several reports have
described projects centered on the acquisition and
analysis of tick expressed sequence tags (ESTs). Most of the
reports focused on the genes transcribed in the salivary
glands of ticks such as Rhipicephalus appendiculatus [7],
Amblyomma variegatum [8] and Ixodes scapularis [9].
Additionally, the isolation of 1,344 ESTs from ovaries, salivary
glands and hemocytes of R. microplus has been reported;
however, the sequences have not been submitted to
GenBank [10]. Genes expressed in salivary glands and ovaries
are attractive targets for study because these tissues are
involved in critical tick-host-pathogen interactions. In a
more general approach, we have developed an R. microplus
EST database, BmiGI [11], derived from various tissues,
lifestages and tick strains, to facilitate research using
molecular biological and genomic approaches to design
novel tick control technologies. It is hoped that analysis of
the database will lead to the discovery of genes that can
help overcome tick control problems due to acaricide
resistance and will identify gene-based vulnerabilities in the
processes involved in pathogen infection and transmission. In
BmiGI Version 1, 53 putative acaricide
resistance-associated sequences were identified. In the present study, we
have assembled an updated gene index [12] which
contains more than double the number of ESTs of Version 1.
We present the Gene Ontology (GO) annotation analysis
and RPS-Blast identification of conserved protein
domains from BmiGI Version 2. Using the comparative
genomics analytical tools Blast Score Ratio [13] and
SimiTri [14], which provide visual outputs to allow global
comparisons between genomes, we compared the
proteome resulting from the conceptual translation of the R.
microplus EST database with the proteomes from Homo
sapiens, Anopheles gambiae, Drosophila melanogaster,
Caenorhabditis elegans, and Saccharomyces cerevisiae. We
also performed more detailed SimiTri comparisons using
several of the most abundant protein domains in the
proteome of R. microplus.
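As a rough illustration of the kind of normalization behind a Blast Score Ratio comparison, the sketch below divides the score of a query against another proteome by the score of that query against itself, which gives a value between 0 and 1 in the usual case. It follows the general idea of the published method [13] as it is commonly described. The function name, the use of bit scores, and all example values are assumptions made for illustration and are not data or code from this study.

```python
def blast_score_ratio(hit_score: float, self_score: float) -> float:
    """Normalize a BLAST score by the query's score against itself."""
    if self_score <= 0:
        raise ValueError("self_score must be positive")
    return hit_score / self_score


# Hypothetical bit scores for one translated tick sequence (illustrative only).
self_score = 400.0           # query aligned against itself
score_vs_proteome_a = 300.0  # best hit in a first comparison proteome
score_vs_proteome_b = 100.0  # best hit in a second comparison proteome

ratio_a = blast_score_ratio(score_vs_proteome_a, self_score)
ratio_b = blast_score_ratio(score_vs_proteome_b, self_score)
print(ratio_a, ratio_b)
```

Using the two ratios for each sequence as x and y coordinates would give one point per sequence in a scatter plot, which is one way a whole-proteome comparison of this kind can be visualized.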
Results and discussion
BmiGI statistics and GO annotation
In the first version of BmiGI, ESTs were clustered and
assembled into tentative consensus (TC) sequences using
TIGR's autoannotation pipeline tools, and non-clustered,
non-overlapping sequences were defined as singleton
sequences. A total of 20,417 ESTs were analyzed and the
assembly yielded 8,270 unique members, including 5,760
TCs and 2,510 singleton ESTs [11]. In the second version
of BmiGI, the total number of new ESTs sequenced was
22,095. These new sequences were combined with the
ESTs in BmiGI Version 1 for clustering to generate
BmiGI Version 2, resulting in 13,643 unique members
(9,403 TCs and 4,240 singletons).
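The counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures reported in the text; the variable and function names are illustrative, and "yield" here simply means unique members gained per 100 ESTs sequenced.

```python
# Counts reported in the text for the two BmiGI assemblies.
V1_ESTS = 20_417
V1_TCS = 5_760
V1_SINGLETONS = 2_510
NEW_ESTS = 22_095
V2_TCS = 9_403
V2_SINGLETONS = 4_240


def yield_pct(members: int, ests: int) -> float:
    """Unique members obtained per 100 ESTs sequenced."""
    return 100.0 * members / ests


v1_members = V1_TCS + V1_SINGLETONS      # unique members in Version 1
v2_members = V2_TCS + V2_SINGLETONS      # unique members in Version 2
total_ests = V1_ESTS + NEW_ESTS          # ESTs clustered for Version 2
new_members = v2_members - v1_members    # members added by the second round

print(v1_members, v2_members, total_ests, new_members)
print(round(yield_pct(v1_members, V1_ESTS), 1))    # first round of sequencing
print(round(yield_pct(new_members, NEW_ESTS), 1))  # second round of sequencing
```

The totals agree with the 8,270 Version 1 members and the 13,643 unique transcripts from 42,512 ESTs quoted for Version 2, and the per-round yields come out at about 40.5% and 24.3%.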
The number of novel sequences obtained significantly
decreased as EST sequencing proceeded. The first 20,417
ESTs resulted in 8,270 unique members of BmiGI, a
return rate of 41%. The second set, comprising 22,095
ESTs, resulted in an additional 5,373 new members of
BmiGI, a return rate of 24%. By the final stages of the
second round of EST sequencing, a return rate of
approximately 5% was being observed and further EST
sequ (...truncated)