Genomic selection in sugar beet breeding populations
Würschum et al. BMC Genetics 2013, 14:85
http://www.biomedcentral.com/1471-2156/14/85
RESEARCH ARTICLE
Open Access
Tobias Würschum1*, Jochen C Reif1,3, Thomas Kraft2, Geert Janssen2,4 and Yusheng Zhao1,3
Abstract
Background: Genomic selection exploits dense genome-wide marker data to predict breeding values. In this study
we used a large sugar beet population of 924 lines representing different germplasm types present in breeding
populations: unselected segregating families and diverse lines from more advanced stages of selection. All lines
have been intensively phenotyped in multi-location field trials for six agronomically important traits and genotyped
with 677 SNP markers.
Results: We used ridge regression best linear unbiased prediction in combination with fivefold cross-validation and
obtained high prediction accuracies for all except one trait. In addition, we investigated whether a calibration
developed based on a training population composed of diverse lines is suited to predict the phenotypic
performance within families. Our results show that the prediction accuracy is lower than that obtained within the
diverse set of lines, but comparable to that obtained by cross-validation within the respective families.
Conclusions: The results presented in this study suggest that a training population derived from intensively
phenotyped and genotyped diverse lines from a breeding program does hold potential to build up robust
calibration models for genomic selection. Taken together, our results indicate that genomic selection is a valuable
tool and can thus complement the genomics toolbox in sugar beet breeding.
Background
Genomic selection has been suggested as a novel approach
to increase selection gain in crop and livestock breeding
programs [1-3]. Whereas QTL mapping strategies are
based on the assumption that individual chromosomal regions can be identified that contribute to the trait and
whose effects are estimated, genomic selection uses
genome-wide marker data to estimate genomic breeding
values of individuals. For plant breeding, genomic selection has been evaluated using empirical data from different crops, including maize (e.g., [4-11]), barley (e.g., [12-14]),
wheat (e.g., [5,15-17]), and sugar beet [18].
* Correspondence:
1 State Plant Breeding Institute, University of Hohenheim, 70593 Stuttgart, Germany
Full list of author information is available at the end of the article
Ridge regression best linear unbiased prediction (RR-BLUP) [1,19] has been shown to provide high prediction
accuracies across a range of crops and traits [14]. RR-BLUP assumes that each marker contributes to the trait
and that all marker effects have the same variance, which is in accordance with
the infinitesimal model of quantitative genetics and
explains why RR-BLUP provides good results for complex
traits [20]. Genomic selection is based on linkage disequilibrium between markers and QTL affecting the trait. In
addition, Habier et al. [21] showed that the accuracy of
genomic selection depends on the exploitation of genetic
relationships between individuals. RR-BLUP was most efficient in exploiting these genetic relationships since all
available markers are used in the model. Plants within
breeding programs will always show a certain degree of relatedness and, in addition, most important agronomic
traits are complex traits. This suggests that RR-BLUP
should be well suited for genomic selection in applied
plant breeding. Another major advantage of RR-BLUP is
that it is computationally less demanding than other
approaches.
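The RR-BLUP idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `rrblup_fit` and `rrblup_predict` are hypothetical, the marker coding is assumed to be numeric (for example -1/0/1), and the penalty `lam` is assumed to be supplied by the user, whereas in a full RR-BLUP analysis it would correspond to the ratio of residual variance to per-marker variance estimated from the data.

```python
import numpy as np

def rrblup_fit(Z, y, lam):
    """Estimate marker effects by ridge regression (illustrative sketch).

    Z   : (n_lines, n_markers) numeric marker matrix (coding is an assumption)
    y   : (n_lines,) phenotypic values
    lam : ridge penalty, assumed given; every marker is shrunk by the same
          amount, reflecting the equal-variance assumption for marker effects
    """
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    col_means = Z.mean(axis=0)
    Zc = Z - col_means              # center markers so the intercept is mean(y)
    mu = y.mean()
    n = Zc.shape[0]
    # Dual form u = Zc' (Zc Zc' + lam I)^-1 (y - mu); it gives the same effects
    # as the primal ridge solution but only solves an n x n system.
    K = Zc @ Zc.T + lam * np.eye(n)
    u = Zc.T @ np.linalg.solve(K, y - mu)
    return mu, col_means, u

def rrblup_predict(Z_new, mu, col_means, u):
    """Predict values for genotyped lines from the estimated marker effects."""
    return mu + (np.asarray(Z_new, dtype=float) - col_means) @ u
```

Because all markers enter the model at once and only a single linear system is solved, the computational cost stays low, which is consistent with the advantage noted above.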
© 2013 Würschum et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the
Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use,
distribution, and reproduction in any medium, provided the original work is properly cited.
In genomic selection, marker effects are first estimated
based on a set of individuals which have been phenotyped
and genotyped. This set is often referred to as the training
population. In a second step, the breeding values of individuals that have been genotyped but not phenotyped are
predicted. It has been shown that the prediction accuracy
decreases when the genetic relatedness between the
individuals in the training population and those in the prediction set decreases [21] and that high accuracies require
that genotypes from the populations in which prediction
will be done are represented in the training population
[22]. In applied plant breeding programs, different germplasm types are available: large biparental families from
early generations which have not been selected yet and
which are tested less intensively, and diverse lines from late
generations that remained after several rounds of selection
[23]. The latter are tested most intensively in field trials
and are often also genotyped to characterize them at the
molecular level. A key question for an efficient and cost-effective implementation of genomic selection in breeding
programs is therefore whether a calibration model developed based on a training population consisting of a diverse
set of lines can be used for prediction of the phenotypic
performance within segregating families.
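The two-step scheme and the question raised above (calibrate on diverse lines, predict within a family) can be sketched as follows. Everything here is an assumption made for illustration: the helper names `ridge_effects` and `predictive_ability` are hypothetical, the data are simulated toy values rather than sugar beet data, the -1/1 marker coding and the penalty `lam` are arbitrary choices, and the plain correlation between predicted and observed values is used because this section does not state how prediction accuracy was computed.

```python
import numpy as np

def ridge_effects(Z, y, lam):
    """Ridge estimates of marker effects on centered markers (stand-in for RR-BLUP)."""
    mean_z = Z.mean(axis=0)
    Zc = Z - mean_z
    u = Zc.T @ np.linalg.solve(Zc @ Zc.T + lam * np.eye(len(y)), y - y.mean())
    return y.mean(), mean_z, u

def predictive_ability(Z_train, y_train, Z_test, y_test, lam):
    """Step 1: estimate effects in the training set.
    Step 2: predict the test set and correlate predictions with observations."""
    mu, mean_z, u = ridge_effects(Z_train, y_train, lam)
    y_hat = mu + (Z_test - mean_z) @ u
    return float(np.corrcoef(y_hat, y_test)[0, 1])

# Simulated toy data only (not from the study): a "diverse" training panel
# and a separate "family" used purely as the prediction set.
rng = np.random.default_rng(1)
n_markers = 200
true_u = rng.normal(scale=0.1, size=n_markers)
Z_diverse = rng.integers(0, 2, size=(300, n_markers)) * 2.0 - 1.0
Z_family = rng.integers(0, 2, size=(60, n_markers)) * 2.0 - 1.0
y_diverse = Z_diverse @ true_u + rng.normal(size=300)
y_family = Z_family @ true_u + rng.normal(size=60)

r = predictive_ability(Z_diverse, y_diverse, Z_family, y_family, lam=50.0)
```

The simulation draws both sets from the same marker distribution, so it does not reproduce the drop in accuracy expected when relatedness between training and prediction sets decreases; it only shows the mechanics of training in one set and predicting in another.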
In this study we employed a large sugar beet population
consisting of a panel of diverse lines and four segregating
families to evaluate the potential of genomic selection for
different yield- as well as quality-related traits in sugar
beet and to investigate the prediction accuracy of genomic
selection within families using a training population composed of a diverse set of lines.
Results
The population under study is composed of a total of 924
lines which can be divided into two subpopulations: 248
lines are derived from four biparental families that are
connected by one common parent and 676 lines form a
diversity set with different degrees of relatedness (Figure 1).
All six traits showed significant genotypic variance estimates (P < 0.01) and medium to high heritabilities in the
entire population (0.38 to 0.71) and in the diversity set
(0.51 to 0.70) while across the four families the heritabilities ranged from 0.24 to 0.76 (Additional file 1: Table S1).
In single families the heritabilities ranged from 0.02 to
0.60. The box-and-whisker plots indicate significant differences among the four families for all traits (Figure 2). Consequently, the data set presents a good basis to evaluate
the prospects of genomic selection in applied sugar beet
breeding.
We used fivefold cross-validation to assess the accuracy of genomic predictions for the six traits in the entire
population and in the diversity set (Figure 3). We found
that the cross-validated prediction accuracy was high for
all traits except for α-amino nitrogen content in the
entire population which showed only a moderate pr (...truncated)