SNP Discovery by Illumina-Based Transcriptome Sequencing of the Olive and the Genetic Characterization of Turkish Olive Genotypes Revealed by AFLP, SSR and SNP Markers

PLoS ONE 8(9): e73674. doi:10.1371/journal.pone.0073674

Hilal Betul Kaya, Oznur Cetin, Hulya Kaya, Mustafa Sahin, Filiz Sefer, Abdullah Kahraman, Bahattin Tanyolac

Editor: Qiong Wu, Harbin Institute of Technology, China

1 Department of Bioengineering, Ege University, Izmir, Turkey; 2 Olive Research Station, Izmir, Turkey; 3 Department of Field Crops, Harran University, S. Urfa, Turkey

Background: The olive tree (Olea europaea L.) is a diploid (2n = 2x = 46) outcrossing species mainly grown in the Mediterranean area, where it is the most important oil-producing crop. Because of its economic, cultural and ecological importance, various DNA markers have been used in the olive to characterize accessions and to resolve homonyms, synonyms and unknown accessions. However, neither a comprehensive characterization nor a full sequence of its transcriptome is available, which makes efficient large-scale single nucleotide polymorphism (SNP) discovery in olive important. The objectives of this study were (1) to discover olive SNPs using next-generation sequencing and to identify SNP primers for cultivar identification and (2) to characterize 96 olive genotypes originating from different regions of Turkey.

Methodology/Principal Findings: cDNA generated from five distinct olive genotypes was sequenced on an Illumina Genome Analyzer IIx, producing 126,542,413 reads. Following quality and size trimming, the high-quality reads were assembled into 22,052 contigs with an average length of 1,321 bases and 45 singletons. The SNPs were filtered and 2,987 high-quality putative SNP primers were identified. The assembled sequences and singletons were subjected to BLAST similarity searches and annotated with Gene Ontology identifiers.
To identify the 96 olive genotypes, these SNP primers were applied to the genotypes in combination with amplified fragment length polymorphism (AFLP) and simple sequence repeat (SSR) markers.

Conclusions/Significance: This study reports the highest number of SNP markers discovered to date from olive genotypes using transcriptome sequencing. The developed SNP markers will provide a useful resource for molecular genetic studies in the olive, such as genetic diversity and characterization studies, high-density quantitative trait locus (QTL) analysis, association mapping and map-based gene cloning. The high levels of genetic variation among Turkish olive genotypes revealed by SNPs, AFLPs and SSRs allowed us to characterize the Turkish olive genotypes.

Funding: This manuscript was funded by the Turkish Technical and Research Council under project number 108G096. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

The olive tree (Olea europaea L. subsp. europaea var. europaea, Oleaceae) is one of the most ancient and important long-lived fruit species of the Mediterranean [1]. It is a diploid (2n = 2x = 46) outcrossing species mainly grown in the Mediterranean basin with a very wide genetic patrimony [2], represented by more than 1200 cultivars [3]. Olive oil and table olives are very important components of the Mediterranean diet [4]. Several studies have emphasized the beneficial effects of table olives [4] and olive oil on human health [5]. The leading olive-producing countries of the world are Spain, Italy, Greece and Morocco. According to statistics provided by the Food and Agriculture Organization (FAO), Turkey ranks as the fifth largest olive producer in the world, with production of approximately 1.415 million tons of fruit in 2010 [6].
The sequencing and analysis of transcriptomes is considered an efficient approach for gene expression profiling, detection of alternative splicing, SNP discovery, and the mapping and quantification of transcriptomes in plants, especially in species without a reference genome sequence [7,8]. Sanger sequencing of expressed sequence tags (ESTs) used to be the most common way to obtain EST information for SNP discovery. Over the past 10 years, the sequencing of ESTs using traditional techniques was used in several important species [9]. However, Sanger sequencing requires expensive and time-consuming steps, including cDNA library construction and the cloning of DNA fragments [10]. Alternatively, transcriptome analysis based on next-generation sequencing (NGS) is more attractive for generating a transcriptome sequence dataset for marker development and gene discovery because of its lower cost per base pair of DNA, its short time requirement and the absence of a subcloning step [11]. Next-generation transcriptome sequencing has been used to create transcriptome databases in various plants without a sequenced genome, including chickpea [12], wheat [13], Eucalyptus pilularis [14], carrot [15], mangroves [16], strawberry [17] and chestnut [18]. Additionally, the discovery of SNP markers using NGS technologies permits the identification of thousands of markers from entire genomes or from cDNA [19], which can be used for genetic diversity analyses [20], association mapping [21,22], linkage mapping [23] and marker-assisted selection [24]. Various NGS platforms, such as the Roche 454 Genome Sequencer, the Illumina Genome Analyzer and the Life Technologies SOLiD System, can produce massive sequence outputs, making high-throughput DNA marker discovery feasible and cost-effective [25,26]. Each platform has its own advantages and limitations, and the platforms differ in sensitivity, accuracy, reproducibility and throughput.
Among these platforms, Illumina sequencing technology, which generates large numbers of reads (75-150 bp) at low cost with very high sequencing coverage, has been especially useful for de novo transcriptome studies [25-27].

A large number of accessions are currently available in olive-producing countries, raising several problems for germplasm management and preservation [28]. The evaluation and identification of olive genetic resources, and in particular the estimation of genetic variation in the existing germplasm, is therefore crucial given the high occurrence of mislabeling, synonyms and homonyms in the olive. Genetic identification is the first key step in breeding programs, and molecular markers are valuable tools for identifying and characterizing diverse genotypes [29]. Currently, the large array of available DNA marker types provides useful information in theoretical and applied research fields for olive breeding, such as the determination of genetic diversity, genetic relationships [30] and population structures among cultivated species and their wild relatives [31, (...truncated)