Frequent germplasm exchanges drive the high genetic diversity of Chinese-cultivated common apricot germplasm
Zhang et al. Horticulture Research (2021)8:215
https://doi.org/10.1038/s41438-021-00650-8
ARTICLE
Horticulture Research
www.nature.com/hortres
Open Access
Qiuping Zhang1, Diyang Zhang2, Kang Yu3, Jingjing Ji3, Ning Liu1, Yuping Zhang1, Ming Xu1, Yu-Jun Zhang1,
Xiaoxue Ma1, Shuo Liu1, Wei-Hong Sun2, Xia Yu2, Wenqi Hu2, Si-Ren Lan2, Zhong-Jian Liu2,4,5 ✉ and Weisheng Liu1 ✉
Abstract
The genetic diversity of germplasm is critical for exploring genetic and phenotypic resources and has important
implications for crop-breeding sustainability and improvement. However, little is known about the factors that shape
and maintain genetic diversity. Here, we assembled a high-quality chromosome-level reference of the Chinese
common apricot ‘Yinxiangbai’, and we resequenced 180 apricot accessions that cover four major ecogeographical
groups in China and other accessions from occidental countries. We concluded that Chinese-cultivated common
apricot germplasms possessed much higher genetic diversity than those cultivated in Western countries. We also
detected seven migration events among different apricot groups, where 27% of the genome was identified as being
introgressed. Remarkably, we demonstrated that these introgressed regions drove the current high level of germplasm
diversity in Chinese-cultivated common apricots by introducing different genes related to distinct phenotypes from
different cultivated groups. Our results suggest that introgressed regions may provide an important
reservoir of genetic resources that can be used to sustain modern breeding programs.
Introduction
Crop germplasms, particularly those from the centers of
origin, provide critical resources for exploring and conserving genetic and phenotypic diversity for breeding
applications1,2. The diversity of crop germplasm has been
suggested to have important implications for breeding
sustainability and crop improvement, as it determines the
sustained ability of plant breeders to develop new high-quality varieties3. Hence, it is essential to characterize the
factors driving and maintaining germplasm diversity in
crops, a consideration that has largely been ignored in
previous plant-breeding research.
Correspondence: Zhong-Jian Liu or Weisheng Liu
1 Liaoning Institute of Pomology, Yingkou 115009, China
2 Key Laboratory of National Forestry and Grassland Administration for Orchid
Conservation and Utilization at College of Landscape Architecture, Fujian
Agriculture and Forestry University, Fuzhou 350002, China
Full list of author information is available at the end of the article
These authors contributed equally: Qiuping Zhang, Diyang Zhang, Kang Yu,
and Jingjing Ji
The common apricot (Prunus armeniaca L.) belongs to
the subgenus Prunophora of the genus Prunus in the
Rosaceae family and has been widely grown in temperate
zones, primarily in its cultivated form. Documented evidence suggests that the common apricot originated in
China and Central Asia and dispersed outward from there4–6.
The common apricot germplasm in China has been
proposed to be the oldest, the most diverse, and currently the most
underexplored resource7. According to ancient
Chinese literature, apricot was first cultivated
approximately 3000–4000 years ago8, which
represents the earliest apricot domestication in the world.
The genetic structure of Chinese apricots
revealed by molecular markers supports the existence of
five major ecological groups, namely North
China (NC), Northwest China (NWC), Northeast China
(NEC), Southeast China (SEC), and Xinjiang (XJ), with
frequent germplasm exchanges occurring among these
groups9. A long history of cultivation, in combination with
varied ecological groups and frequent germplasm-migration
events, makes the Chinese apricot an attractive
model for investigating the factors responsible for germplasm diversity in crops.
© The Author(s) 2021
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction
in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if
changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If
material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain
permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
Here, we report a chromosome-level genome assembly
of P. armeniaca “Yinxiangbai” and the whole-genome
resequencing of 150 apricot accessions from China,
covering four of the five major ecological groups, together with 30
occidental accessions. In this study, we addressed the
population genomics of apricots with an emphasis on
germplasm-exchange events, and we further elucidated
the role of this process in shaping the germplasm diversity
of the common apricot in China.
Results
Genome assembly and annotation
Prunus armeniaca (“Yinxiangbai”), a native diploid
cultivar from Lintong, Shanxi Province, North China, was
selected for whole-genome sequencing. We generated a
total of 45.73 Gb of short-read data from a 350-bp insert-size
library, together with 47.52 Gb (PacBio) and 82.37 Gb (Nanopore) of
long reads (Supplementary Table 1). A 17-mer analysis
revealed that P. armeniaca “Yinxiangbai” possessed a
genome size of 264.4 Mb with a heterozygosity rate of
0.99% (Supplementary Fig. 1). Integrating the short
and long reads yielded a final assembly of 251.19 Mb
with a contig N50 of 4.04 Mb (Supplementary Table 2);
both values exceeded those of the previously published assembly (221.9 Mb with a contig N50 of
1.02 Mb)10. The assembled contigs were further anchored
to eight linkage groups using linkage maps (Supplementary Fig. 2). Further application of the Hi-C data yielded a
total length of 251.19 Mb (Supplementary Table 3), a
scaffold N50 of 30.98 Mb, and a contig-anchoring rate of
97.04%, thus representing the highest-quality reference
genome ever reported for the Prunus genus (Supplementary Fig. 3; Supplementary Table 3). The BUSCO
(Benchmarking Universal Single-Copy Orthologs)
assessment11 indicated a completeness of up to 96.20%
for the assembled P. armeniaca
“Yinxiangbai” genome (Supplementary Table 4).
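For readers unfamiliar with the assembly statistics quoted above, the following is a minimal sketch of how an N50 value and a k-mer-based genome size estimate are commonly computed. It is not the authors' pipeline: the function names `n50` and `kmer_genome_size` are illustrative, the toy numbers are hypothetical, and the genome size formula (total k-mer count divided by the depth of the main k-mer peak) is a common simplification that may differ from the method actually used for the 17-mer analysis.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def kmer_genome_size(total_kmers, peak_depth):
    """Estimate genome size as total k-mer count / main peak depth.

    Assumption: a single dominant peak in the k-mer depth
    distribution; real analyses usually also correct for
    sequencing errors and heterozygosity.
    """
    if peak_depth <= 0:
        raise ValueError("peak depth must be positive")
    return total_kmers / peak_depth


# Hypothetical toy contig lengths, not the 'Yinxiangbai' contigs.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70: 80 + 70 reaches half of the 300 total

# Hypothetical k-mer totals, for illustration only.
print(kmer_genome_size(1_000_000, 40))  # 25000.0
```

The same `n50` function applies to contigs or scaffolds; only the input list of lengths changes, which is why a scaffold N50 can be far larger than a contig N50 for the same assembly.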
Gene model prediction identified 29,230 protein-coding
genes (Supplementary Table 5), and this was comparable
to that of other Prun (...truncated)