Genome wide characterization of simple sequence repeats in watermelon genome and their application in comparative mapping and genetic diversity analysis
Zhu et al. BMC Genomics (2016) 17:557
DOI 10.1186/s12864-016-2870-4
RESEARCH ARTICLE
Open Access
Huayu Zhu1†, Pengyao Song1†, Dal-Hoe Koo2, Luqin Guo1, Yanman Li1, Shouru Sun1, Yiqun Weng2,3*
and Luming Yang1*
Abstract
Background: Microsatellite markers are among the most informative and versatile DNA-based markers used in plant
genetic research, but their development has traditionally been difficult and costly. Whole-genome sequencing
with next-generation sequencing (NGS) technologies provides large amounts of sequence data from which
numerous microsatellite markers can be developed at the whole-genome scale. SSR markers offer great advantages in cross-species
comparisons and allow investigation of karyotype and genome evolution through highly efficient computational
approaches such as in silico PCR. Here we describe the genome-wide development and characterization of SSR
markers in the watermelon (Citrullus lanatus) genome, which were then used in comparative analysis with two other
important crop species in the Cucurbitaceae family: cucumber (Cucumis sativus L.) and melon (Cucumis melo L.). We
further applied these markers to evaluate the genetic diversity and population structure of watermelon
germplasm collections.
Results: A total of 39,523 microsatellite loci were identified from the watermelon draft genome with an overall
density of 111 SSRs/Mbp, and 32,869 SSR primers were designed with suitable flanking sequences. Dinucleotide
SSRs were the most common type, representing 34.09 % of the total SSR loci, and AT-rich motifs were the most
abundant in all nucleotide repeat types. In silico PCR analysis identified 832 and 925 SSR markers, each producing
a single amplicon, in the cucumber and melon draft genomes, respectively. Comparative analysis with these cross-species SSR markers revealed complicated mosaic patterns of syntenic blocks among the genomes of the three species.
In addition, genetic diversity analysis of 134 watermelon accessions with 32 highly informative SSR loci placed these
lines into two groups, with all accessions of C. lanatus var. citroides and three accessions of C. colocynthis clustered in
one group and all accessions of C. lanatus var. lanatus and the remaining accessions of C. colocynthis clustered in
another group. Furthermore, structure analysis was consistent with the dendrogram, indicating that the 134 watermelon
accessions were classified into two populations.
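As a rough illustration of how microsatellite loci like those counted above can be located, and how a density in SSRs/Mbp is computed, the following Python sketch scans a sequence for perfect tandem repeats. The minimum repeat counts in MIN_REPEATS, the function names find_ssrs and ssr_density, and the toy sequence are assumptions made for this example; they are not the search criteria or the software used in the study.

```python
import re

# Assumed minimum number of repeat units per motif length (di- to hexanucleotide).
# These thresholds are hypothetical and only serve the example.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. ATAT is AT x 2),
            # so each locus is reported once, under its shortest motif.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // unit_len))
    return sorted(hits)


def ssr_density(n_loci, searched_bp):
    """SSR loci per megabase pair of searched sequence."""
    return n_loci / (searched_bp / 1e6)


if __name__ == "__main__":
    toy = "CCGG" + "AT" * 8 + "GCGTC" + "AAG" * 5 + "CT"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

Read against the reported totals, 39,523 loci at 111 SSRs/Mbp correspond to roughly 356 Mbp of searched sequence.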
* Correspondence: ;
†Equal contributors
2Horticulture Department, University of Wisconsin, Madison, WI 53706, USA
1College of Horticulture, Henan Agricultural University, 63 Nongye Road, Zhengzhou 450002, China
Full list of author information is available at the end of the article
© 2016 The Author(s). Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to
the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Conclusion: The large number of genome-wide SSR markers developed herein from the watermelon genome provides
a valuable resource for genetic map construction, QTL exploration, map-based gene cloning and marker-assisted
selection in watermelon, which has a very narrow genetic base and extremely low polymorphism among cultivated
lines. Furthermore, the cross-species transferable SSR markers identified herein should also be of practical use in many
applications in species of the Cucurbitaceae family whose whole genome sequences are not yet available.
Keywords: SSR, Watermelon, Comparative genomics, Synteny, Cucurbits, Genetic diversity
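The single-amplicon criterion used to select cross-species markers can be pictured with a minimal in silico PCR sketch in Python. It assumes exact primer matching and a hypothetical maximum product length (max_len); real in silico PCR tools usually tolerate mismatches and apply further rules, and the settings used in the study are not given in this section. The function names, primers and sequences below are invented for illustration.

```python
def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(text, sub):
    """Yield every start position of sub in text, including overlapping ones."""
    i = text.find(sub)
    while i != -1:
        yield i
        i = text.find(sub, i + 1)


def in_silico_pcr(genome, fwd, rev, max_len=1000):
    """Return (name, start, end, strand) for each predicted amplicon.

    genome maps sequence names to sequences. Primers must match exactly
    (an assumption of this sketch), and products longer than max_len are ignored.
    """
    amplicons = []
    for name, seq in genome.items():
        # "+": forward primer on the given strand, reverse primer on the other.
        # "-": the same product in the opposite orientation.
        orientations = (
            ("+", fwd, revcomp(rev)),
            ("-", rev, revcomp(fwd)),
        )
        for strand, left, right in orientations:
            for s in find_all(seq, left):
                for e in find_all(seq, right):
                    end = e + len(right)
                    if s + len(left) <= e and end - s <= max_len:
                        amplicons.append((name, s, end, strand))
    return amplicons


def is_single_amplicon(genome, fwd, rev, max_len=1000):
    """True if the primer pair yields exactly one predicted product."""
    return len(in_silico_pcr(genome, fwd, rev, max_len)) == 1


if __name__ == "__main__":
    toy_genome = {
        "chr1": "TT" + "ACGTTG" + "AT" * 6 + "GGAATC" + "AA",
        "chr2": "GGGGCCCCAAAATTTT",
    }
    print(in_silico_pcr(toy_genome, "ACGTTG", "GATTCC"))
```

A primer pair that passes is_single_amplicon in a second genome marks one candidate orthologous position there, which is the kind of anchor point that comparative mapping relies on.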
Background
Watermelon (Citrullus lanatus) is an important horticultural crop and one of the most consumed fresh
fruits globally. It belongs to the genus Citrullus, which
contains four diploid species: Citrullus lanatus
(Thunb.) Matsum. & Nakai, C. colocynthis (L.) Schrad.,
C. ecirrhosus Cogn. and C. rehmii De Winter [1, 2].
Among these four species, Citrullus lanatus includes
the cultivated watermelon (C. lanatus var. lanatus),
which thrives in West Africa and has been cultivated
widely worldwide (also called ‘egusi’ melon), and the
preserving melon (C. lanatus var. citroides), which is
grown in Southern Africa (also called ‘tsamma’ melon)
[3, 4]. C. colocynthis (‘bitter apple’) is a perennial
species grown in sandy areas throughout northern
Africa, south-western Asia, and the Mediterranean [2, 5].
Long-term domestication and selection for desirable
horticultural qualities have left the cultivated watermelon
with a narrow genetic base and susceptible to a large
number of diseases and pests [6]. Evaluating the phylogenetic relationships among the different species in the genus
Citrullus will help to improve disease resistance in
watermelon cultivars [1]. Watermelon has a small genome of
425 Mb, and the genomes of the elite Chinese watermelon
line 97103 [7] and the American heirloom watermelon
cultivar Charleston Gray have been sequenced and
released in the Cucurbit Genomics Database (www.icugi.org).
The availability of these genomic resources for watermelon
has greatly promoted fundamental research, including the development of molecular markers and genetic
map construction [8, 9], gene/QTL mapping [10, 11],
molecular breeding, and comparative genomics [12].
Microsatellites, or simple sequence repeats (SSRs), have been
among the most commonly used markers in many genetic
applications since the early 1990s, including mapping,
fingerprinting, genetic diversity and population structure
analysis [13–16]. Because of their reproducibility, multiallelism, co-dominance, relative abundance, good genome
coverage and the variety of available genotyping platforms,
microsatellites are likely to remain in use for some
years to come. Furthermore, they are comparatively cheap
to genotype and provide more population genetic information per marker than bi-allelic markers such as single
nucleotide polymorphisms (SNPs) [17, 18]. A single set of
microsatellite markers can be used to genotype several related species, whereas SNP markers in general lack cross-species utility and are therefore only suitable for population and paternity studies in a single species [19–21].
Microsatellite loci can be detected both in genomic sequences and in expressed sequence tags (ESTs), yielding
genomic SSRs and EST-SSRs, respectively. EST-SSRs are useful
for genetic analysis, but their relati (...truncated)