Construction and sequence sampling of deep-coverage, large-insert BAC libraries for three model lepidopteran species
Chengcang Wu (2,3), Dina Proestou (1), Dorothy Carter (1), Erica Nicholson (1), Filippe Santos (2), Shaying Zhao (0), Hong-Bin Zhang (2), Marian R Goldsmith (1)
0 The Institute for Genomic Research, 9712 Medical Center Dr, Rockville, MD 20850, USA
1 Department of Biological Sciences, University of Rhode Island, Kingston, RI 02881-0816, USA
2 Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843-2474, USA
3 Current address: Lucigen Corporation, 2120 West Greenview Dr, Middleton, WI 53562, USA
Background: Manduca sexta, Heliothis virescens, and Heliconius erato are three widely used insect model species for genomic and fundamental studies in Lepidoptera. Large-insert BAC libraries of these insects are critical resources for many molecular studies, including physical mapping and genome sequencing, but have not been available to date.
Results: We report the construction and characterization of six large-insert BAC libraries for the three species and a sequence sampling analysis of their genomes. The six BAC libraries were constructed with two restriction enzymes, two libraries for each species, with average clone insert sizes ranging from 152 to 175 kb. We estimated that the genome coverage of each library ranged from 6-9×, with the two combined libraries of each species being equivalent to 13.0-16.3 haploid genomes. The genome coverage, quality and utility of the libraries were further confirmed by library screening using 6-8 putative single-copy probes. To provide a first glimpse into these genomes, we sequenced and analyzed the BAC ends of ~200 clones randomly selected from the libraries of each species. The data revealed that the genomes are AT-rich, contain relatively small fractions of repeat elements, the majority of which are low-complexity repeats, and are more abundant in retro-elements than in DNA transposons. Among the species, the H. erato genome is somewhat more abundant in repeat elements and simple repeats than those of M. sexta and H. virescens. BLAST analysis of the BAC end sequences suggested that the evolution of the three genomes varies widely, with the genome of H. virescens being the most conserved as a typical lepidopteran, whereas the genomes of both H. erato and M. sexta appear to have evolved significantly, resulting in a higher level of species- or evolutionary lineage-specific sequences.
Conclusion: The high-quality and large-insert BAC libraries of the insects, together with the identified BACs containing genes of interest, provide valuable information, resources and tools for comprehensive understanding and studies of the insect genomes and for addressing many fundamental questions in Lepidoptera. The sample of the genomic sequences provides the first insight into the constitution and evolution of the insect genomes.
Background
Large-insert bacterial artificial chromosome (BAC)
libraries have been shown to be critical resources for many
aspects of molecular and genomic studies [1,2], such as
the positional cloning of genes [3] and quantitative trait
loci [4], comparative studies of synteny and gene
organization among different species [5], as well as for local or
whole genome physical and genetic mapping and
sequencing [6-11]. Arrayed, large-insert DNA libraries
have provided the opportunity for researchers to analyze
and share information and resources on specific clones
[1,2,12,13]. Hundreds of BAC libraries have been
constructed for microbe, plant and animal species
[1,2,6,7,12,13]. However, only a few large-insert BAC
libraries are available to date for insect species, especially
lepidopteran insects [10,11,14-17]. This scarcity could slow
progress in comprehensive molecular and genomic
research on these clades.
Moths and butterflies, members of the insect order
Lepidoptera, are the second most diverse group of animals,
with at least 150,000 named species [18]. They are
widespread members of the ecosystem, playing important
roles as pollinators and prey, and are among the most
destructive agricultural pests. Clearly, Lepidoptera are
under-represented in terms of genomic resources and
knowledge relative to their biological and economic
status. This research was designed mainly to construct
comprehensive BAC library resources for two species of moths,
the tobacco hornworm, Manduca sexta and the tobacco
budworm, Heliothis virescens, and one species of butterfly,
the Müllerian mimic, Heliconius erato. These species have
genome sizes ranging from 400 to 500 Mb/haploid
genome (395 Mb for H. erato [19], 404 Mb for H. virescens
[20], and 500 Mb for M. sexta [J. S. Johnston, pers.
communication]) and are widely-used models for studying
fundamental problems in neurobiology [21], olfaction
[22], development [23], and immune responses [24] (M.
sexta); host feeding preferences [25], evolution of
insecticide resistance [26], and sexual communication
systems [27] (H. virescens); and wing pattern mimicry (H.
erato) [28]. Moths and butterflies are estimated to have
diverged from each other at least 50-60 million years ago
[18]. The sphingid, M. sexta, is a member of the same
superfamily, Bombycoidea, as the domesticated
silkworm, Bombyx mori, the current genome model for
Lepidoptera [8,9], and the noctuid, H. virescens, is related to
other pest noctuids currently being used for genomic
studies including Spodoptera frugiperda [16,29] and Helicoverpa
armigera [30]. Here, we report the construction and
characterization of six large-insert BAC libraries for these
species and the first insight into the constitution and
evolution of their genomes. The libraries will enable a
large community of scientists to isolate and study the
genes controlling these processes, provide new tools for
lepidopteran systematics, and serve as critical resources
for comparative genomic studies and genome sequencing
of this important group of organisms.
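The genome sizes quoted above give a sense of the library scale involved. The following is a minimal sketch, not the authors' own calculation: it estimates how many clones a library would need to reach a given genome coverage, and the chance that a given single-copy locus is represented, using a Poisson approximation that assumes random, unbiased cloning. The 160 kb insert size and 6× coverage are illustrative values chosen from within the ranges reported for these libraries (152-175 kb and 6-9×), and the function names are hypothetical.

```python
import math

# Haploid genome sizes in Mb, as quoted in the text.
GENOME_MB = {"H. erato": 395, "H. virescens": 404, "M. sexta": 500}


def clones_for_coverage(genome_mb, insert_kb, coverage):
    """Number of clones whose summed insert length equals
    `coverage` haploid genome equivalents."""
    return math.ceil(coverage * genome_mb * 1000 / insert_kb)


def prob_locus_present(coverage):
    """Poisson approximation to the probability that a given
    single-copy locus occurs in at least one clone.
    Assumes clones sample the genome at random."""
    return 1.0 - math.exp(-coverage)


# Illustrative assumptions: 160 kb average insert, 6x coverage.
for species, size_mb in GENOME_MB.items():
    n = clones_for_coverage(size_mb, insert_kb=160, coverage=6)
    print(f"{species}: about {n} clones for 6x coverage")

print(f"P(locus present at 6x) = {prob_locus_present(6):.4f}")
```

Under these assumptions, even the lower bound of per-library coverage makes the absence of a given single-copy locus unlikely, which is consistent with the use of single-copy probes to check library quality.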
Results
Development of procedures for preparation of high-molecular-weight (HMW) DNA
One of the most important steps toward construction of
high-quality BAC libraries is preparation of high-quality
megabase DNA. Since no procedure was available for
preparation of HMW DNA from these insects, we first
developed a method for megabase DNA preparation by
testing different DNA isolation buffer systems and tissues
collected at different developmental stages of the insects.
The results showed that the day-10 pupae (males and
females) of M. sexta and day-4 pupae (males and females)
of H. virescens and H. erato were most suitable for
megabase DNA isolation using a buffer system containing 0.1
M NaCl, 10 mM Tris-HCl, 10 mM EDTA, pH 9.4, and
0.15% β-mercaptoethanol. The DNA isolated with this
method was not only large in size (> 1000 kb), but also
readily digestible and clonable, thus being well-suited for
BAC library construction.
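For readers preparing the isolation buffer described above, the dilution arithmetic can be sketched as follows. This is only an illustration under stated assumptions: the stock concentrations (5 M NaCl, 1 M Tris-HCl, 0.5 M EDTA) are common laboratory stocks and are not taken from the text, the 0.15% β-mercaptoethanol is assumed to be volume per volume, and adjustment to pH 9.4 is not modelled. The helper name is hypothetical.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed, from C1 * V1 = C2 * V2.
    Concentrations must share the same unit."""
    return final_conc * final_volume_ml / stock_conc


FINAL_VOLUME_ML = 1000.0

# (final concentration in M from the text, ASSUMED stock concentration in M)
COMPONENTS = {
    "NaCl": (0.1, 5.0),
    "Tris-HCl": (0.010, 1.0),
    "EDTA": (0.010, 0.5),
}

recipe = {
    name: stock_volume_ml(final, stock, FINAL_VOLUME_ML)
    for name, (final, stock) in COMPONENTS.items()
}

# 0.15% beta-mercaptoethanol, ASSUMED to be v/v from a neat stock.
recipe["beta-mercaptoethanol"] = 0.15 / 100 * FINAL_VOLUME_ML

for name, volume in recipe.items():
    print(f"{name}: {volume:.1f} ml per {FINAL_VOLUME_ML:.0f} ml")
```

The remaining volume would be made up with water after the pH is adjusted.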
Library construction
The major goal of this study was to develop BAC resources
that are widely usable for molecular and genomic studies
of the insects, including whole gen (...truncated)