Genome-wide BAC-end sequencing of Cucumis melo using two BAC libraries

BMC Genomics, Nov 2010

Background: Although melon (Cucumis melo L.) is an economically important fruit crop, no genome-wide sequence information is openly available at the current time. We therefore sequenced BAC-ends representing a total of 33,024 clones, half of them from a previously described melon BAC library generated with restriction endonucleases and the remainder from a new random-shear BAC library.

Results: We generated a total of 47,140 high-quality BAC-end sequences (BES), 91.7% of which were paired-BES. Both libraries were assembled independently and then cross-assembled to obtain a final set of 33,372 non-redundant, high-quality sequences. These were grouped into 6,411 contigs (4.5 Mb) and 26,961 non-assembled BES (14.4 Mb), representing ~4.2% of the melon genome. The sequences were used to screen genomic databases, identifying 7,198 simple sequence repeats (corresponding to one microsatellite every 2.6 kb) and 2,484 additional repeats of which 95.9% represented transposable elements. The sequences were also used to screen expressed sequence tag (EST) databases, revealing 11,372 BES that were homologous to ESTs. This suggests that ~30% of the melon genome consists of coding DNA. We observed regions of microsynteny between melon paired-BES and six other dicotyledonous plant genomes.

Conclusion: The analysis of nearly 50,000 BES from two complementary genomic libraries covered ~4.2% of the melon genome, providing insight into properties such as microsatellite and transposable element distribution, and the percentage of coding DNA. The observed synteny between melon paired-BES and six other plant genomes showed that useful comparative genomic data can be derived through large scale BAC-end sequencing by anchoring a small proportion of the melon genome to other sequenced genomes.
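The headline figures in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch in Python (chosen here purely for illustration; the article itself contains no code) that recomputes the sampled fraction of the genome, the microsatellite density and the paired-BES percentage from the quoted totals. It assumes the 454 Mb genome-size estimate and the count of 43,224 paired BES given in the article's Background section; all variable and function names are illustrative.

```python
# Figures quoted in the abstract.
CONTIG_MB = 4.5        # total length of the 6,411 contigs
SINGLETON_MB = 14.4    # total length of the 26,961 non-assembled BES
SSR_COUNT = 7198       # simple sequence repeats identified
TOTAL_BES = 47140      # high-quality BAC-end sequences

# Figures taken from the article's Background section (assumed inputs here).
GENOME_MB = 454.0      # estimated melon genome size
PAIRED_BES = 43224     # paired BES mapped onto other genomes


def summarize():
    """Recompute the derived statistics from the raw totals."""
    sampled_mb = CONTIG_MB + SINGLETON_MB
    return {
        "sampled_mb": sampled_mb,
        "coverage_pct": 100.0 * sampled_mb / GENOME_MB,
        "kb_per_ssr": sampled_mb * 1000.0 / SSR_COUNT,
        "paired_pct": 100.0 * PAIRED_BES / TOTAL_BES,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

Rounded to one decimal place, these values agree with the ~4.2% coverage, one microsatellite per 2.6 kb and 91.7% paired-BES stated above.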



Víctor M González (3), Luis Rodríguez-Moreno (0, 2), Emilio Centeno (3), Andrej Benjak (1), Jordi Garcia-Mas (1), Pere Puigdomènech (3), Miguel A Aranda (0, 2)

0, 2: Departamento de Biologia del Estres y Patologia Vegetal, Centro de Edafologia y Biologia Aplicada del Segura (CEBAS) - CSIC, Apdo. correos 164, 30100 Espinardo (Murcia), Spain
1: IRTA, Center for Research in Agricultural Genomics CRAG (CSIC-IRTA-UAB), Carretera de Cabrils Km 2, 08348 (Barcelona), Spain
3: Molecular Genetics Department, Center for Research in Agricultural Genomics CRAG (CSIC-IRTA-UAB), Jordi Girona, 18-26, 08034 Barcelona, Spain

Background

Melon (Cucumis melo L.) is an important horticultural crop grown in temperate, subtropical and tropical regions worldwide. More than 25 million tonnes of fruit were produced in 2007, 64.5% in Asia, 14.6% in Europe, 13.1% in America and 7.8% in Africa [1]. Melon belongs to the Cucurbitaceae family, which comprises 90 genera and ~750 species, including other fruit crops such as watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai), cucumber (Cucumis sativus L.), squash and pumpkin (Cucurbita spp.). Genetically, melon is a diploid species (2x = 2n = 24) with an estimated genome size of 454 Mb [2]. Transgenic melons, first produced in 1990, can now be generated in a range of recalcitrant cultivars [3,4]. Melon fruits are morphologically and biochemically diverse, which makes them particularly suitable for research into the flavor and texture changes that occur during ripening [5].

Despite its economic importance, there are few genomic resources for melon. As of January 2010, 126,940 high-quality expressed sequence tags (ESTs) and 23,762 unigenes were available in public databases [6,7], which is low compared to the 298,123 ESTs available for tomato (Solanum lycopersicum L.) and the 1,249,110 ESTs available for rice (Oryza sativa L.) [8].
More recent efforts to increase the availability of genetic and genomic resources for melon [9] have included the construction of bacterial artificial chromosome (BAC) libraries [10,11], the development of oligo-based microarrays [12,13], the production of TILLING and EcoTILLING platforms [14,15] and the development of a collection of near isogenic lines (NILs) [16]. However, the integration of genetic and physical maps is a necessary first step towards sequencing the melon genome, identifying relevant genes and using these to discover how economically important aspects of fruit development are controlled [17,18].

Over the last 15 years, several melon genetic maps have been constructed, primarily using randomly amplified polymorphic DNAs (RAPDs), restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs) and simple sequence repeats (SSRs) [19-24]. These maps have helped to pinpoint the loci of some important agronomic traits [25-27], but they are sparsely populated and the different markers make them difficult to compare. To address this issue, a genetic map has recently been constructed by merging several of those previous genetic maps [6]. In addition, a melon physical map representing 0.9 melon genomic equivalents has recently been constructed using both a BAC library and a genetic map previously developed in our laboratory [28]. The physical and genetic maps have been integrated by anchoring 175 genetic markers to the physical map, allowing contigs representing 12% of the melon genome to be anchored to known genetic loci.

It is important to obtain an accurate, representative sample of the genome ahead of full genome sequencing and annotation, and the end-sequencing of large numbers of BAC clones is an efficient strategy to achieve this goal.
BAC-end sequences (BES) generate accurate but inexpensive genome samples that give a first impression of properties such as GC content, the distribution of microsatellites and transposable elements, and the amount of coding DNA [29-31]. However, most BAC libraries are constructed by digesting DNA with one or more restriction endonucleases, which introduces a partial bias in coverage because the target sites are distributed in a non-random manner [32].

We therefore sequenced BAC-ends representing 33,024 clones, half from a previously described BAC library generated using restriction endonucleases, and the remainder from a freshly prepared random-shear BAC library to eliminate the possibility of bias. We obtained 47,140 high-quality BES, which were analyzed for GC content, microsatellites, repeat elements and coding regions. A total of 43,224 paired BES were mapped independently onto six sequenced genomes from other dicotyledonous plant species to identify regions of microsynteny.

Results and discussion

BAC libraries

Two BAC libraries from the double-haploid melon line PIT92 were used for BAC-end sequencing (Table 1).

Table 1 Genomic C. melo BAC libraries
1 Non-empty BAC clones containing melon genomic DNA
2 Based on an estimated haploid genom (...truncated)
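To make the per-sequence analyses mentioned in the Background more concrete (GC content and microsatellite detection on individual BES), the following Python sketch shows one simple way such a scan could be written. It is not the pipeline used in the article, whose tools and thresholds are not given in this preview. The motif length range (2 to 6 bases), the minimum of four tandem repeats, the exclusion of single-base runs and the function names are all assumptions made for illustration.

```python
import re


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def find_ssrs(seq: str, min_repeats: int = 4):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    Assumed criteria: motifs of 2-6 bases repeated at least
    `min_repeats` times; runs of a single base are ignored.
    """
    # The lazy quantifier makes the regex try the shortest motif first,
    # so ATATATAT is reported as (AT)4 rather than (ATAT)2.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits


if __name__ == "__main__":
    read = "CCGT" + "AG" * 6 + "TTCA"  # toy sequence, not real melon data
    print(gc_content(read))
    print(find_ssrs(read))
```

Dividing the total length of the scanned sequences by the number of hits gives a density estimate of the same form as the "one microsatellite every 2.6 kb" figure, although the result depends strongly on the repeat criteria chosen.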


This is a preview of a remote PDF: http://www.biomedcentral.com/content/pdf/1471-2164-11-618.pdf
Article home page: http://www.biomedcentral.com/1471-2164/11/618

Víctor M González, Luis Rodríguez-Moreno, Emilio Centeno, Andrej Benjak, Jordi Garcia-Mas, Pere Puigdomènech, Miguel A Aranda. Genome-wide BAC-end sequencing of Cucumis melo using two BAC libraries. BMC Genomics, 2010, 11:618. DOI: 10.1186/1471-2164-11-618