Construction of Core Collections Suitable for Association Mapping to Optimize Use of Mediterranean Olive (Olea europaea L.) Genetic Resources

PLOS ONE, 2013

Phenotypic characterisation of germplasm collections is a decisive step towards association mapping analyses, but it is particularly expensive and tedious for woody perennial plant species. Characterisation could be more efficient if focused on a reasonably sized subset of accessions, a so-called core collection (CC), reflecting the geographic origin and variability of the germplasm. The questions that arise concern the sample size to use and the genetic parameters that should be optimized in a core collection to make it suitable for association mapping. Here we investigated these questions in olive (Olea europaea L.), a perennial fruit species. Testing different sampling methods and sizes in a worldwide olive germplasm bank (OWGB Marrakech, Morocco) containing 502 unique genotypes characterized by nuclear and plastid loci led us to propose a two-step sampling method. The Shannon-Weaver diversity index was found to be the best criterion to maximize in the first step using the Core Hunter program. A primary core collection of 50 entries (CC50) was defined that captured more than 80% of the diversity. The latter was subsequently used as a kernel with the Mstrat program to capture the remaining diversity. Two hundred core collections of 94 entries (CC94) were thus built for flexibility in the choice of varieties to be studied. Most entries of both core collections (CC50 and CC94) were found to be unrelated, as indicated by low kinship coefficients, whereas a genetic structure spanning the eastern and western/central Mediterranean regions was noted. Linkage disequilibrium was observed in CC94 and was mainly explained by a genetic structure effect, as also noted for OWGB Marrakech. Since they reflect the geographic origin and diversity of olive germplasm and are of reasonable size, both core collections will be of major interest to develop long-term association studies and thus enhance genomic selection in olive species.

Citation: El Bakkali A, et al. (2013) Construction of Core Collections Suitable for Association Mapping to Optimize Use of Mediterranean Olive (Olea europaea L.) Genetic Resources. PLoS ONE 8(5): e61265. doi:10.1371/journal.pone.0061265

Authors: Ahmed El Bakkali, Hicham Haouane, Abdelmajid Moukhli, Evelyne Costes, Patrick Van Damme, Bouchaib Khadari

Editor: Randall P. Niedz, United States Department of Agriculture, United States of America

Affiliations: 1 INRA, UMR Amélioration Génétique et Adaptation des Plantes (AGAP), Montpellier, France; 2 Montpellier SupAgro, UMR AGAP, Montpellier, France; 3 INRA Meknès, UR Amélioration des Plantes et Conservation des Ressources Phytogénétiques, Meknès, Morocco; 4 Department of Plant Production, Ghent University, Ghent, Belgium; 5 INRA Marrakech, UR Amélioration des Plantes, Marrakech, Morocco; 6 Institute of Tropics and Subtropics, Czech University of Life Sciences Prague, Prague, Czech Republic; 7 Conservatoire Botanique National Méditerranéen, UMR AGAP, Montpellier, France

Recent advances in genomic tools, including genome sequencing [1] and high-density single nucleotide polymorphism (SNP) genotyping [2], and statistical methods have enabled the development of new approaches for mapping of complex traits. The identification of causal genes underlying specific traits is a major goal in plant breeding, subsequently offering opportunities to develop genomic selection tools [3–4].
Association mapping (also known as linkage disequilibrium (LD)-based association mapping) [5] has been proposed to associate single DNA sequence changes with traits of interest using collections of unrelated individuals, as an alternative or complement to quantitative trait locus (QTL)-mapping (also known as family-based linkage mapping) [6]. Association mapping has been largely documented and successfully used to identify the genetic basis of many complex diseases in humans [7], and is now emerging in plants [8–9]. It has the advantage of being rapid and cost effective as many alleles may be assessed simultaneously, resulting in higher resolution mapping by the use of most recombination events that occur over time, while avoiding the need to expensively and tediously develop crossing populations, particularly for perennial and forest tree species [10]. The number of markers needed to map specific associations depends on the extent and distribution of LD within the species and among linkage groups [5]. Many studies have thus proposed an estimate of LD in different plant species as a preliminary step for association analysis [11–14]. Association mapping results obtained in a number of annual species, e.g. Arabidopsis thaliana [15–16], Oryza sativa [17–18], Triticum aestivum [19] and Zea mays [20–21], indicate that the approach is promising to identify markers correlated with desirable traits such as flowering time [15–16,20], seed morphology [19,22] and disease resistance [15,23–24]. However, for woody and perennial species, studies have been performed on a limited number of species, such as Pinus taeda L. [25], Eucalyptus spp. [26] and Prunus persica [27].
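To make the notion of LD between two loci concrete, the following minimal Python sketch computes the common r² statistic for a pair of biallelic loci from phased haplotypes. It is an illustration only: the function name `ld_r_squared` and the 0/1 allele coding are assumptions, and this excerpt does not state which LD measure or software the authors used.

```python
def ld_r_squared(haplotypes):
    """Return r^2 between two biallelic loci.

    `haplotypes` is a list of (a, b) pairs, one per phased haplotype,
    with alleles coded 0 or 1 at locus A and locus B (an assumed encoding).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")

    # Frequency of allele 1 at each locus, and of the 1-1 haplotype.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    # D measures the departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom
```

A value near 1 means the alleles at the two loci almost always travel together, so fewer markers are needed to tag a region; a value near 0 means the loci are close to independent.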
Beyond the importance of ex situ conservation of genetic resources to avoid genetic erosion and provide plant breeders with easy access to study ranges of variation in phenotypic traits, germplasm collections could serve as a reservoir of outstanding genes to enhance agronomic traits so as to meet the needs of diverse agricultural systems. However, field evaluation and use of large germplasm collections for association mapping purposes are mostly constrained by problems of accession redundancy, economic cost and time, especially for clonally propagated perennial species where clones have to be maintained and evaluated for several years at different sites. Genetic resource assessments could thus be more rational if focused on a subset of accessions, a so-called core collection (CC; also known as core subset), which captures as much of the variability present in the whole collection as possible in a sample of minimal size [28]. Determining the best sample size to use and the genetic criteria to be optimized for association mapping in a core collection is an open issue requiring further investigation, especially for perennial species. Over the last decade, several core subsets have been proposed for both annual species, e.g. Arabidopsis thaliana [29], Oryza sativa [30], Triticum aestivum [31] and Zea mays [32], and perennial species, e.g. Annona cherimola [33], Malus domestica [34], Prunus armeniaca [35] and Vitis vinifera [36], using different ecogeographical, agro-morphological, biochemical or molecular data. Despite the many approaches used to design core collections that optimize the genetic distance between accessions and/or the allelic diversity [37–44], most core collections have been constructed based on the so-called maximizing method (M-method) [37] through the MSTRAT program [40] by (...truncated)
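As a rough illustration of what it means to select a core subset that maximizes diversity, the sketch below computes a per-locus Shannon index on pooled alleles, the share of alleles a subset captures, and a simple greedy selection. Everything here is an assumption made for illustration: the data layout, the toy genotypes, the function names, the use of natural logarithms, and the greedy search itself, which is a stand-in and not the search strategy of Core Hunter or MSTRAT.

```python
import math
from collections import Counter

# Hypothetical toy data: accession -> one tuple of alleles per locus (diploid).
TOY_GENOTYPES = {
    "A": (("a1", "a1"), ("b1", "b2")),
    "B": (("a1", "a1"), ("b1", "b2")),  # redundant with A
    "C": (("a2", "a3"), ("b3", "b3")),
    "D": (("a1", "a2"), ("b1", "b1")),
}

def pooled_alleles(genotypes, names, locus):
    """All alleles observed at one locus across the named accessions."""
    return [allele for name in names for allele in genotypes[name][locus]]

def shannon_index(alleles):
    """Shannon diversity H = -sum(p * ln p) over allele frequencies."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def mean_shannon(genotypes, names):
    """Shannon index averaged over loci for a set of accessions."""
    n_loci = len(next(iter(genotypes.values())))
    per_locus = (shannon_index(pooled_alleles(genotypes, names, k))
                 for k in range(n_loci))
    return sum(per_locus) / n_loci

def allele_coverage(genotypes, core):
    """Fraction of the whole collection's distinct alleles present in `core`."""
    n_loci = len(next(iter(genotypes.values())))
    total = captured = 0
    for k in range(n_loci):
        total += len(set(pooled_alleles(genotypes, list(genotypes), k)))
        captured += len(set(pooled_alleles(genotypes, core, k)))
    return captured / total

def greedy_core(genotypes, size):
    """Greedy stand-in: repeatedly add the accession that most raises
    the mean Shannon index (ties broken by alphabetical order)."""
    core = []
    remaining = sorted(genotypes)
    while len(core) < size and remaining:
        best = max(remaining,
                   key=lambda name: mean_shannon(genotypes, core + [name]))
        core.append(best)
        remaining.remove(best)
    return core
```

Calling `greedy_core(TOY_GENOTYPES, 2)` and then `allele_coverage(TOY_GENOTYPES, core)` on the result shows how a redundant accession such as `"B"` contributes nothing once `"A"` is in the core, which is the kind of redundancy a core collection is meant to remove.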


This is a preview of a remote PDF: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0061265&type=printable
Article home page: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0061265

Ahmed El Bakkali, Hicham Haouane, Abdelmajid Moukhli, Evelyne Costes, Patrick Van Damme, Bouchaib Khadari. Construction of Core Collections Suitable for Association Mapping to Optimize Use of Mediterranean Olive (Olea europaea L.) Genetic Resources, PLOS ONE, 2013, Volume 8, Issue 5, DOI: 10.1371/journal.pone.0061265