Feasibility of physical map construction from fingerprinted bacterial artificial chromosome libraries of polyploid plant species
BMC Genomics
Ming-Cheng Luo 0
Yaqin Ma 0
Frank M You 0
Olin D Anderson
David Kopecký
Hana Šimková
Jan Šafář
Jaroslav Doležel
Bikram Gill
Patrick E McGuire 0
Jan Dvorak 0
0 Department of Plant Sciences, University of California, Davis, CA 95616, USA
Background: The presence of closely related genomes in polyploid species makes the assembly of total genomic sequence from shotgun sequence reads produced by the current sequencing platforms exceedingly difficult, if not impossible. Genomes of polyploid species could be sequenced following the ordered-clone sequencing approach employing contigs of bacterial artificial chromosome (BAC) clones and BAC-based physical maps. Although BAC contigs can currently be constructed for virtually any diploid organism with the SNaPshot high-information-content-fingerprinting (HICF) technology, it is currently unknown if this is also true for polyploid species. It is possible that BAC clones from orthologous regions of homoeologous chromosomes would share numerous restriction fragments and therefore be included in common contigs. Because of this and other concerns, physical mapping utilizing the SNaPshot HICF of BAC libraries of polyploid species has not been pursued and the possibility of doing so has not been assessed. The sole exception has been in common wheat, an allohexaploid in which it is possible to construct single-chromosome or single-chromosome-arm BAC libraries from DNA of flow-sorted chromosomes and bypass the obstacles created by polyploidy.
Results: The potential of the SNaPshot HICF technology for physical mapping of polyploid plants utilizing global BAC libraries was evaluated by assembling contigs of fingerprinted clones in an in silico merged BAC library composed of single-chromosome libraries of two wheat homoeologous chromosome arms, 3AS and 3DS, and complete chromosome 3B. Because the chromosome arm origin of each clone was known, it was possible to estimate the fidelity of contig assembly. On average, 97.78% or more of the clones, depending on the library, were from a single chromosome arm. A large portion of the remaining clones was shown to be library contamination from other chromosomes, a feature that is unavoidable during the construction of single-chromosome BAC libraries.
Conclusions: The negligibly low level of incorporation of clones from homoeologous chromosome arms into a contig during contig assembly suggested that it is feasible to construct contigs and physical maps using global BAC libraries of wheat and almost certainly also of other plant polyploid species with genome sizes comparable to that of wheat. Because of the high purity of the resulting assembled contigs, they can be directly used for genome sequencing. It is currently unknown but possible that equally good BAC contigs can also be constructed for polyploid species containing smaller, more gene-rich genomes.
Background
Plant and animal genomes are currently sequenced
either by a global shotgun sequencing approach [1] or
by sequencing of large-insert genomic clones and
assembling the global genome sequence from them
(ordered-clone approach) [2]. The former approach is
inherently faster and more economical since the entire
genome sequence is generated in a single operation. To
assemble a genome sequence, it is necessary to identify
overlaps of individual reads among vast numbers of
other reads. The presence of repeated sequences among
the reads makes this task challenging in some genomes.
This aspect of genome architecture is greatly
exacerbated in plants with large genomes by the precipitous
turnover of repeated sequences in the intergenic spaces.
For instance, in the tribe Triticeae of the grass family, in
which the sizes of genomes in diploid species range
from 3.3 to 8.1 Gbp (reviewed in [3]), sequences filling
the intergenic space are almost entirely replaced in
about 3 million years, which is a turnover rate orders of
magnitude faster than in primate genomes [4]. Because
of large genome size and fast turnover rate of repeated
sequences, the Triticeae genomes contain large numbers
of very similar nucleotide sequences, which has
precluded the use of the shotgun genome sequencing
approach for diploid Triticeae species.
A special challenge presented to genome sequencing
in plants is polyploidy. A large percentage of seed plants
are polyploid [5]. Probably all plants are ancient
polyploids (paleopolyploids), but since paleopolyploidy does
not usually complicate genome sequencing, it is not
considered in this study. Plant polyploids
are categorized as either autopolyploids with identical
genomes or allopolyploids with related genomes that
were contributed by different diploid species. A vast
majority of plant polyploids are allopolyploids. The need
to allocate sequence reads to respective genomes makes
it exceedingly difficult to assemble global genome
sequences of polyploid species from whole-genome
shotgun sequence reads. For that reason, no polyploid
plant genome has yet been sequenced by this approach.
The alternative approach, based on sequencing
large-insert clones, potentially avoids the factors limiting the
shotgun sequencing approach. The advent of the
high-information-content-fingerprinting (HICF) of bacterial
artificial chromosome (BAC) clones greatly increased
fingerprinting throughput and fidelity [6-8]. With the
five-color SNaPshot HICF technology [8],
computer-driven fingerprint editing [9], contig assembly with the
FPC program [10,11], and contig anchoring on
high-resolution genetic maps with the highly multiplexed
Illumina GoldenGate assays [12], it is now theoretically
possible to construct physical maps for most diploid
plants and animals, including ancient polyploids, such as
maize and soybean [13,14].
The SNaPshot HICF fingerprinting technology is
based on restriction digestion of the DNA of each BAC
clone by multiple restriction endonucleases and sizing a
portion of the fragments with capillary electrophoresis.
Contigs are then assembled on the basis of shared
portions of the restriction profiles of the BAC clones. It has
been tacitly assumed that BAC clones from
homoeologous chromosome regions in an allopolyploid will have
too many restriction fragments in common and will therefore be
included in single contigs during contig assembly.
Consequently, physical mapping based on the SNaPshot
HICF technology has not been pursued to any
significant extent in recently evolved allopolyploids, with the
sole exception of hexaploid wheat, Triticum aestivum.
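The concern described above can be made concrete with a small sketch of how two BAC fingerprints might be compared. The code below is an illustration only, not the FPC implementation: the function names, the fragment sizes, the 0.4 bp tolerance, and the 425 bp sizing window are all hypothetical values chosen for the example. It counts fragments shared by two clones within a size tolerance and then computes a simple binomial probability that at least that many matches would occur by chance, which is similar in spirit to the cutoff statistic used in fingerprint-based contig assembly.

```python
from math import comb


def shared_fragments(fp_a, fp_b, tolerance):
    """Greedy one-to-one count of fragments whose sizes agree within tolerance."""
    a, b = sorted(fp_a), sorted(fp_b)
    i = j = matches = 0
    while i < len(a) and j < len(b):
        diff = a[i] - b[j]
        if abs(diff) <= tolerance:
            # Sizes agree: pair these two fragments and move past both.
            matches += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # a[i] is too small to match anything left in b
        else:
            j += 1  # b[j] is too small to match anything left in a
    return matches


def coincidence_probability(n_low, n_high, matches, tolerance, size_range):
    """Binomial probability of seeing at least `matches` shared fragments by chance.

    n_low and n_high are the fragment counts of the clone with fewer and
    with more fragments. This is a simplified score for illustration.
    """
    # Chance that one fragment falls within tolerance of any of n_high fragments.
    p = 1.0 - (1.0 - 2.0 * tolerance / size_range) ** n_high
    return sum(
        comb(n_low, m) * p**m * (1.0 - p) ** (n_low - m)
        for m in range(matches, n_low + 1)
    )


# Hypothetical fragment sizes in bp (invented for illustration only).
clone_a = [75.2, 102.3, 150.8, 201.1, 260.4, 333.0, 410.6, 488.9]
clone_b = [102.4, 150.7, 201.3, 260.4, 299.5, 410.5, 455.0, 488.8]  # overlaps clone_a
clone_c = [60.0, 95.1, 140.2, 188.8, 245.9, 301.3, 377.7, 450.5]   # unrelated clone

TOLERANCE = 0.4     # assumed sizing tolerance in bp
SIZE_RANGE = 425.0  # assumed width of the sizing window in bp

for name, other in (("clone_b", clone_b), ("clone_c", clone_c)):
    m = shared_fragments(clone_a, other, TOLERANCE)
    n_low, n_high = sorted((len(clone_a), len(other)))
    score = coincidence_probability(n_low, n_high, m, TOLERANCE, SIZE_RANGE)
    print(f"clone_a vs {name}: {m} shared fragments, chance probability {score:.3g}")
```

In this toy setting, two clones are joined only when the chance probability falls below a stringent cutoff. The question for an allopolyploid is whether clones from homoeologous regions share enough fragments to pass such a cutoff.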
Polyploid wheat species of economic importance are
either allotetraploid (T. turgidum, genome formula
AABB) or allohexaploid (T. aestivum, genome formula
AABBDD). The A, B, and D genomes were contributed
by three different diploid species which radiated from a
common ancestor between 2.5 and 4.5 million years
ago, depending on which of several es (...truncated)