Genome Physical Mapping of Polyploids: A BIBAC Physical Map of Cultivated Tetraploid Cotton, Gossypium hirsutum L.
PLoS ONE 7(3): e33644. doi:10.1371/journal.pone.0033644
Meiping Zhang 0 1
Yang Zhang 0 1
James J. Huang 0 1
Xiaojun Zhang 0 1
Mi-Kyung Lee 0 1
David M. Stelly 0 1
Hong-Bin Zhang 0 1
Christian Schönbach, Kyushu Institute of Technology, Japan
0 Current address: Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, China
1 1 Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America, 2 College of Life Science, Jilin Agricultural University, Changchun, Jilin, China
Polyploids account for approximately 70% of flowering plants, including many field, horticultural and forage crops. Cotton is a world-leading fiber crop and an important oilseed crop, and a model species for the study of plant polyploidization, cellulose biosynthesis and cell wall biogenesis. This study addresses the concerns of physical mapping of polyploids with BACs and/or BIBACs by constructing a physical map of the tetraploid cotton, Gossypium hirsutum L. The physical map consists of 3,450 BIBAC contigs with an N50 contig size of 863 kb, collectively spanning 2,244 Mb. We sorted the map contigs according to their subgenome of origin, showing that we assembled physical maps for the A- and D-subgenomes of the tetraploid cotton separately. We also identified the BIBACs in the minimal tiling path of the map, which consists of 15,277 clones. Moreover, we marked the physical map with nearly 10,000 BIBAC end sequences (BESs), or approximately one BES per 250 kb. This physical map provides a line of evidence and a strategy for physical mapping of polyploids, and a platform for advanced research on the tetraploid cotton genome, particularly fine mapping and cloning of cotton agronomic genes and QTLs, and sequencing and assembling the cotton genome using next-generation sequencing technology.
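As a worked illustration of two figures quoted in the abstract, the following minimal Python sketch shows how an N50 contig size is conventionally computed and how a marker density follows from the map span. The contig sizes in `example_sizes_kb` are hypothetical and chosen only to make the calculation easy to follow; they are not data from this map, and `n50` is an illustrative function name rather than part of any assembly software.

```python
def n50(contig_sizes_kb):
    """Return the N50: the size of the contig at which the running total,
    taken from the largest contig downward, first reaches half of the
    total assembled length."""
    sizes = sorted(contig_sizes_kb, reverse=True)
    half_total = sum(sizes) / 2
    running = 0
    for size in sizes:
        running += size
        if running >= half_total:
            return size
    raise ValueError("contig_sizes_kb must not be empty")


# Hypothetical contig sizes in kb (illustrative only, not data from the map).
example_sizes_kb = [1200, 900, 863, 500, 300, 200, 100]
example_n50 = n50(example_sizes_kb)

# Marker density from the abstract's round numbers:
# a 2,244-Mb map span and an assumed round count of 10,000 BESs.
map_span_kb = 2244 * 1000
bes_count = 10000
kb_per_bes = map_span_kb / bes_count

print(f"Example N50: {example_n50} kb")
print(f"Approximate spacing: one BES per {kb_per_bes:.0f} kb")
```

With the round figures of 2,244 Mb and 10,000 BESs, the sketch gives about 224 kb per BES. Because the map carries somewhat fewer than 10,000 BESs, the actual spacing is larger, in line with the rounded figure quoted in the abstract.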
Funding: This study was supported by an internal fund of Zhang laboratory (203232-86360) and the research grants of Texas AgriLife Research (124475-85360
and 124475-70360). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
These authors contributed equally to this work.
Polyploidy is a significant evolutionary process in higher
organisms. It has long been recognized as a prominent speciation
process in plants as well as some fishes [1,2]. The genomes of most
angiosperms are thought to have incurred one or more
polyploidization events during evolution [3]. Studies have
demonstrated that genome doubling has also been significant in
the evolutionary history of all vertebrates and in many other
eukaryotes [4–7]. It is estimated that about 70% of the extant
angiosperms are polyploids, including many world-leading field,
forage, horticultural and environmental crops such as cotton,
wheat, potatoes, canola, sugarcane, oats, peanut, tobacco, rose,
alfalfa, coffee and banana. Nevertheless, genomics research on
polyploids generally lags behind that on diploid species because their
polyploid nature can significantly complicate genome
research, especially genome physical mapping with large-insert
bacterial artificial chromosome (BAC) and/or
transformation-competent binary BAC (BIBAC) clones. BAC- and/or
BIBAC-based genome physical maps have been demonstrated to be an
essential centerpiece for many areas of advanced studies such as
gene and quantitative trait locus (QTL) fine mapping and cloning,
genome sequencing, functional genomics, and comparative
genomics. Therefore, genome-wide physical maps have been
developed from BACs and/or BIBACs for a number of diploid
species [8–23]. However, no physical map has been developed and
no genome sequenced to date for a polyploid species, though the
feasibility of constructing a physical map of a polyploid plant
species by BAC fingerprint analysis was tested using an in silico
merged BAC library of two wheat homoeologous arms, 3AS and
3DS [24]. This study has addressed the concerns of genome
physical mapping of polyploids with BACs and/or BIBACs using
Upland cotton, Gossypium hirsutum L.
Upland cotton is an allotetraploid, consisting of A- and
D-subgenomes, and has a genome size of approximately 2,400 Mb/1C
[25]. It originated around 1–2 million years ago via
allopolyploidization between a diploid species containing an A
genome such as G. herbaceum (A1) or G. arboreum (A2) and a diploid
species containing a D genome such as G. raimondii (D5) or G.
gossypioides (D6). The A- and D-subgenomes are
homoeologous [26], their diploid progenitors having split from a
common ancestor some 5–7 million years ago [27–31].
Cotton is a world-leading fiber and oilseed crop; the textile
and bioenergy industries supplied by cotton fibers and oilseeds
perhaps contribute thousands of billions of dollars to the world's
economy. Upland cotton is economically the most important
among the four cultivated cotton species, G. hirsutum (AD1), G.
barbadense (AD2), G. herbaceum (A1) and G. arboreum (A2), providing
over 90% of the world's cotton fibers and oilseeds. Furthermore,
since the cotton polyploid complex consists of extant
allotetraploids (including Upland cotton) and diploid relatives (for review,
see [32]), it has long been used as a model species for studies of
plant polyploidization, speciation and evolution. Finally and
importantly, cotton fibers are a model system for studies of
cellulose biosynthesis, which is crucial to bioenergy production, and of
plant cell wall biogenesis, which produces the largest portion of biomass
on earth. This is because each cotton fiber originates from a
single cell and approximately 90% of its makeup is
cellulose, the largest component of plant cell walls.
Therefore, cotton genomics research is of economic and scientific
significance in numerous respects.
Cotton genome research has been pursued extensively over the
past 20 years. A large number of DNA markers have been developed, several
genetic maps constructed, hundreds of QTLs important to fiber
yield and quality mapped, a large collection of expressed sequence
tags (ESTs) generated and several large-insert BAC and BIBAC
libraries developed for cotton (for review, see [32]). Recently, a
draft physical map has been developed from BACs [33] and
whole-genome draft sequences generated for a wild diploid relative
of the Upland cotton D-subgenome, G. raimondii (D5)
(http://www.ncbi.nlm.nih.gov/sra/SRA024364?report=full). Nevertheless, the
D genome of the wild species was too divergent to be claimed as the
diploid donor of the Upland cotton D-subgenome [34] and could
not be used to study the molecular basis of the economically
important cotton fiber yield and quality. That the genome of the
wild speci (...truncated)