A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus
M Carmen Marques (1,2), Hugo Alonso-Cantabrana (0,2), Javier Forment (2), Raquel Arribas (2), Santiago Alamar (2,3), Vicente Conejero (2), Miguel A Perez-Amador (2)
0 Current address: Synergia Bionostra S.L., Ronda de Poniente 4, 28760 Tres Cantos, Madrid, Spain
1 Current address: Centro de Investigacion Principe Felipe, Avenida Autopista del Saler 16, 46012 Valencia, Spain
2 Instituto de Biologia Molecular y Celular de Plantas (IBMCP), Universidad Politecnica de Valencia and Consejo Superior de Investigaciones Cientificas (CSIC), Avenida de los Naranjos s/n, 46022 Valencia, Spain
3 Current address: Instituto de Agroquimica y Tecnologia de Alimentos (IATA), Consejo Superior de Investigaciones Cientificas (CSIC), Burjassot, Valencia, Spain
Background: Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation.
Results: We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis.
Conclusion: The new EST collection denotes an important step towards the identification of all genes in the citrus genome. Furthermore, public availability of the cDNA clones generated in this study, and not only their sequence, enables testing of the biological function of the genes represented in the collection. Expression of the citrus SEP3 homologue, CitrSEP, in Arabidopsis results in early flowering, along with other phenotypes resembling the over-expression of the Arabidopsis SEPALLATA genes. Our findings suggest that the members of the SEP gene family play similar roles in these quite distant plant species.
Background
Citrus is one of the most widespread fruit crops with great
economic and health value [1]. But citrus is also one of the
most difficult plants to improve through traditional
breeding approaches due to undesirable reproductive
traits and characteristics. These include degrees of sexual
sterility and incompatibility, nucellar embryony (asexual
seed production), extended juvenility, and large plant
size, which affect cultural practice in the orchard. To
overcome these drawbacks, new genomic approaches are
being developed, including generation of linkage maps,
markers, and EST collections, making possible physical
and genetic mapping in citrus. Furthermore, an
International Citrus Genomics Consortium (ICGC) has been
initiated to generate the full-genome sequence of sweet
orange (Citrus sinensis), as well as to sequence other citrus
species and varieties [1]. Prior to the establishment of the
ICGC, EST collections [2,3] provided a first glimpse
of the citrus genome. Over the years, several different
groups have contributed to the generation of a total of
over 230,000 citrus sequences currently deposited at the
dbEST division of GenBank. Among these, the Spanish
Citrus Genomic Project (CFGP; http://bioinfo.ibmcp.upv.es/genomics/cfgpDB) has made a
significant contribution by producing 25 standard cDNA libraries
and an EST collection of 22,635 high-quality reads [4], and by
generating sequence data for over 54,000 ESTs from a
normalized full-length cDNA library and 9 additional
standard libraries [5]. EST sequencing, along with other gene
discovery methods, represents an important initial step
towards functional characterization of the genes in the
genome.
Many methods for the construction of cDNA libraries
have been developed in recent years. Conventional cDNA
library construction approaches, however, suffer from
several major shortcomings. First, the majority of cDNA
clones are not full-length, especially for mRNAs longer
than 2 kb. This loss of 5'-terminal sequences is typically
due to premature termination of reverse transcription or
blunt-end polishing of cDNA ends prior to subcloning. As
a result, cDNA 5' ends are significantly underrepresented
in cDNA libraries. Second, adaptor-mediated cloning
is still a common approach for cDNA library
construction, and it can yield up to 20% undesirable ligation
byproducts (chimeras) and inserts of non-mRNA origin
(e.g., genomic DNA, mitochondrial DNA, ribosomal
RNA, or adaptor dimers) [6]. In recent years, the
annotation of genes has been greatly improved by the integration
of full-length cDNAs produced by the community [7-10].
The importance of isolating full-length cDNA clones lies
in the "value-added" features lacking in common ESTs.
Full-length cDNAs define the limits of the transcriptional
units and the coding region, and thus identify the
immediate upstream basal promoter and enable sequence
characterization of 5' and 3' untranslated regions (UTR).
Furthermore, they provide a record of transcript diversity
due to modifications of the primary pre-mRNA transcript,
such as alternate promoter usage, alternative splicing,
alternate polyadenylation, and RNA editing. In addition,
cDNA libraries rich in full-length clones are a
valuable tool for high-throughput gene function analysis [11].
A number of methods have been developed for cDNA
library preparations enriched in full-length sequences
[12-18], with most of them based on the mRNA cap structure
[12,14,15]. These methods require high quantities of
starting material (20-100 μg of RNA) and complicated,
multi-step manipulations of the cap structure of mRNA
and cDNA intermediates, which often result i (...truncated)