A hidden reservoir of integrative elements is the major source of recently acquired foreign genes and ORFans in archaeal and bacterial genomes

Genome Biology, Jun 2009

Background: Archaeal and bacterial genomes contain a number of genes of foreign origin that arose from recent horizontal gene transfer, but the role of integrative elements (IEs), such as viruses, plasmids, and transposable elements, in this process has not been extensively quantified. Moreover, it is not known whether IEs play an important role in the origin of ORFans (open reading frames without matches in current sequence databases), whose proportion remains stable despite the growing number of completely sequenced genomes.

Results: We have performed a large-scale survey of potential recently acquired IEs in 119 archaeal and bacterial genomes. We developed an accurate in silico Markov model-based strategy to identify clusters of genes that show atypical sequence composition (clusters of atypical genes, or CAGs) and are thus likely to be recently integrated foreign elements, including IEs. Our method identified a high number of new CAGs. Probabilistic analysis of gene content indicates that 56% of these new CAGs are likely IEs, whereas only 7% likely originated via horizontal gene transfer from distant cellular sources. Thirty-four percent of CAGs remain unassigned, which may reflect the still-poor sampling of IEs associated with bacterial and archaeal diversity. Moreover, our study contributes to the issue of the origin of ORFans, because 39% of these are found inside CAGs, many of which likely represent recently acquired IEs.

Conclusions: Our results strongly indicate that archaeal and bacterial genomes contain an impressive proportion of recently acquired foreign genes (including ORFans) coming from a still largely unexplored reservoir of IEs.

Genome Biology 2009, Volume 10, Issue 6, Article R65

Diego Cortez, Patrick Forterre, Simonetta Gribaldo

Address: Institut Pasteur, Département de Microbiologie, Unité de Biologie Moléculaire du Gène chez les Extrêmophiles, Rue du Dr Roux, 75724 Paris cedex 15, France

Background

Integrative elements (IEs), such as viruses and plasmids together with their associated hitchhiking elements (transposons, integrons, and so on), mediate the movement of DNA within and between genomes, and play a key role in processes such as the emergence of infectious diseases, antibiotic resistance, and the biotransformation of xenobiotics [1-3]. Traces of IE activity have been highlighted in many prokaryotic genomes, which carry different repertoires of inserted prophages, plasmids, transposons and/or genomic islands [4-7]. These few characterized IEs are most likely only a reflection of a more diverse and still unknown IE universe that shapes bacterial and archaeal genomes [8].

The importance of IEs in the origin of ORFans (open reading frames (ORFs) without matches in current sequence databases) [9] is still controversial. Indeed, the source of ORFans remains a major mystery of the post-genomic era since, contrary to previous expectations, their proportion remains stable despite the increasing number of complete genome sequences available [10]. It has been suggested that ORFans are misannotated genes, rapidly evolving sequences, newly formed genes, or genes recently transferred from not yet sequenced cellular or viral genomes [10,11]. The possibility that ORFans originate from the integration of elements of viral origin is appealing, since viral genomes themselves always contain a high proportion of ORFans [12,13].
Consistent with this hypothesis, Daubin and Ochman [14] noticed that ORFans from γ-Proteobacteria share several features with viral ORFans (for example, small size, AT-rich) and suggested that 'ORFans in the genomes of free-living microorganisms apparently derive from bacteriophages and occasionally become established by assuming roles in key cellular functions.' However, Yin and Fisher [10] recently reported that, on average, only 2.8% of all cellular ORFans have homologues in current viral sequence databases, raising doubts about the hypothesis of a viral origin of ORFans, and proposed that 'lateral transfer from viruses alone is unlikely to explain the origin of the majority of ORFans in the majority of prokaryotes and consequently, other, not necessarily exclusive, mechanisms are likely to better explain the origin of the increasing number of ORFans.' More recently, the same authors found that only 18% of viral ORFans (ORFs present in only one viral genome) have homologues in archaeal or bacterial genomes, and concluded that 'phage ORFans play a lesser role in horizontal gene transfer to prokaryotes' [12].

Several in silico methods based on composition have been conceived in the past few years to identify foreign genes that were recently acquired by cellular genomes, such as atypical G+C content, atypical codon usage, Markov model (MM)-based approaches, and Bayesian model (BM)- (...truncated)
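
The composition-based approaches named above share one idea: a gene whose sequence statistics deviate from those of its host genome may have been acquired recently. The following is a toy sketch of that idea only, not the authors' strategy. It assumes an order-2 Markov model trained on all genes of a genome, a z-score cutoff of -1.5 on the mean per-base log-likelihood, and a minimum cluster size of two adjacent genes; the function names (`train_markov`, `mean_loglik`, `flag_atypical`, `cluster_runs`) and all parameter values are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def train_markov(seqs, order=2, pseudo=1.0):
    """Return a log-probability function for an order-k Markov model.

    Assumes sequences over the alphabet A/C/G/T (illustrative simplification).
    """
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1.0

    def logp(context, base):
        row = counts.get(context, {})
        total = sum(row.values()) + 4 * pseudo  # pseudocount for each of 4 bases
        return math.log((row.get(base, 0.0) + pseudo) / total)

    return logp


def mean_loglik(seq, logp, order=2):
    """Average log-likelihood per base of seq under the model."""
    n = len(seq) - order
    if n <= 0:
        return 0.0
    return sum(logp(seq[i - order:i], seq[i]) for i in range(order, len(seq))) / n


def flag_atypical(genes, order=2, z_cut=-1.5):
    """Flag genes whose score is far below the genome-wide mean (assumed cutoff)."""
    logp = train_markov(genes, order)
    scores = [mean_loglik(g, logp, order) for g in genes]
    mu = sum(scores) / len(scores)
    sd = math.sqrt(sum((s - mu) ** 2 for s in scores) / len(scores)) or 1.0
    return [(s - mu) / sd < z_cut for s in scores]


def cluster_runs(flags, min_size=2):
    """Group consecutive flagged gene indices into candidate clusters."""
    clusters, run = [], []
    for i, flagged in enumerate(flags):
        if flagged:
            run.append(i)
        else:
            if len(run) >= min_size:
                clusters.append(run)
            run = []
    if len(run) >= min_size:
        clusters.append(run)
    return clusters
```

Given a list of gene sequences in chromosomal order, `cluster_runs(flag_atypical(genes))` returns lists of adjacent gene indices that look compositionally atypical. Training on the whole genome, including the foreign genes themselves, keeps the sketch short but blurs the host signal when foreign DNA is abundant; a real pipeline would need to address that, along with gene length and strand effects.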


This is a preview of a remote PDF: http://genomebiology.com/content/pdf/gb-2009-10-6-r65.pdf

Diego Cortez, Patrick Forterre, Simonetta Gribaldo. A hidden reservoir of integrative elements is the major source of recently acquired foreign genes and ORFans in archaeal and bacterial genomes. Genome Biology, 2009, 10(6): R65. DOI: 10.1186/gb-2009-10-6-r65