The genome of Rhizobium leguminosarum has recognizable core and accessory components
Genome Biology
2006, Volume 7, Issue 4, Article R34
J Peter W Young 2
Lisa C Crossman 0
Andrew WB Johnston 1
Nicholas R Thomson 0
Zara F Ghazoui 2
Katherine H Hull 2
Margaret Wexler 1
Andrew RJ Curson 1
Jonathan D Todd 1
Philip S Poole 3
Tim H Mauchline 3
Alison K East 3
Michael A Quail 0
Carol Churcher 0
Claire Arrowsmith 0
Inna Cherevach 0
Tracey Chillingworth 0
Kay Clarke 0
Ann Cronin 0
Paul Davis 0
Audrey Fraser 0
Zahra Hance 0
Heidi Hauser 0
Kay Jagels 0
Sharon Moule 0
Karen Mungall 0
Halina Norbertczak 0
Ester Rabbinowitsch 0
Mandy Sanders 0
Mark Simmonds 0
Sally Whitehead 0
Julian Parkhill 0
0 The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus , Cambridge , UK
1 School of Biological Sciences, University of East Anglia , Norwich , UK
2 Department of Biology, University of York , York , UK
3 School of Biological Sciences, University of Reading , Reading , UK
Background: Rhizobium leguminosarum is an α-proteobacterial N2-fixing symbiont of legumes that has been the subject of more than a thousand publications. Genes for the symbiotic interaction with plants are well studied, but the adaptations that allow survival and growth in the soil environment are poorly understood. We have sequenced the genome of R. leguminosarum biovar viciae strain 3841.
Results: The 7.75 Mb genome comprises a circular chromosome and six circular plasmids, with 61% G+C overall. All three rRNA operons and 52 tRNA genes are on the chromosome; essential protein-encoding genes are largely chromosomal, but most functional classes occur on plasmids as well. Of the 7,263 protein-encoding genes, 2,056 had orthologs in each of three related genomes (Agrobacterium tumefaciens, Sinorhizobium meliloti, and Mesorhizobium loti), and these genes were overrepresented in the chromosome and had above average G+C. Most supported the rRNA-based phylogeny, confirming A. tumefaciens to be the closest among these relatives, but 347 genes were incompatible with this phylogeny; these were scattered throughout the genome but were over-represented on the plasmids. An unexpectedly large number of genes were shared by all three rhizobia but were missing from A. tumefaciens.
Conclusion: Overall, the genome can be considered to have two main components: a 'core', which is higher in G+C, is mostly chromosomal, is shared with related organisms, and has a consistent phylogeny; and an 'accessory' component, which is sporadic in distribution, lower in G+C, and located on the plasmids and chromosomal islands. The accessory genome has a different nucleotide composition from the core despite a long history of coexistence.
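The G+C comparison between core and accessory components can be illustrated with a minimal sketch. It assumes sequences are plain strings of nucleotides; the function name `gc_content` and both toy sequences are hypothetical, invented for illustration only, and are not data from strain 3841 or the method used in this study.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases A, C, G, T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return gc / total


# Toy sequences, invented for illustration (not real genes).
toy_core = "GCGCGGCCATGCGCCGTAGC"
toy_accessory = "ATGCATTAGCATATGCAATT"

core_gc = gc_content(toy_core)
accessory_gc = gc_content(toy_accessory)
print(f"core G+C: {core_gc:.2f}, accessory G+C: {accessory_gc:.2f}")
```

Ambiguity codes such as N are excluded from the denominator here so that unresolved bases do not lower the estimate; other conventions are equally defensible.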
Background
The symbiosis between legumes and N2-fixing bacteria
(rhizobia) is of huge agronomic benefit, allowing many crops
to be grown without N fertilizer. It is a sophisticated example
of coupled development between bacteria and higher plants,
culminating in the organogenesis of root nodules [1]. There
have been many genetic analyses of rhizobia, notably of
Sinorhizobium meliloti (the symbiont of alfalfa),
Bradyrhizobium japonicum (soybean), and Rhizobium leguminosarum,
which has biovars that nodulate peas and broad beans (biovar
viciae), clovers (biovar trifolii), or kidney beans (biovar
phaseoli).
The Rhizobiales, an α-proteobacterial order that also includes
mammalian pathogens Bartonella and Brucella and
phytopathogenic Agrobacterium, have diverse genomic
architectures. The single chromosome of Bartonella is small (1.6-1.9
Mb [2]), but the larger (approximately 3.3 Mb) Brucella
genomes comprise two circles [3-5]. Genomes of the
plant-associated bacteria are larger still; that of A. tumefaciens is
about 5.6 Mb, with one circular and one linear chromosome,
plus two native plasmids [6,7]. To date, three rhizobial
genomes have been sequenced. S. meliloti 1021 has a 3.5 Mb
chromosome plus two megaplasmids, namely pSymA (1.35
Mb) and pSymB (1.68 Mb), with the former having genes for
nodulation (nod) and symbiotic N2 fixation (nif and fix) [8].
In contrast, the symbiosis genes of Mesorhizobium loti
MAFF303099 (which nodulates Lotus) and of B. japonicum
USDA110 are on chromosomal 'symbiosis islands', with the
chromosome of the latter (9.1 Mb) being among the largest
yet known in bacteria [9,10].
Rhizobium leguminosarum has yet another genomic
architecture: one circular chromosome and several large plasmids,
the plasmid portfolio varying markedly among isolates in
terms of sizes, numbers, and incompatibility groups [11-14].
The subject of the present study, R. leguminosarum biovar
viciae (Rlv) strain 3841 (a spontaneous
streptomycin-resistant mutant of field isolate 300 [15,16]), has six large
plasmids; pRL10 is the pSym (symbiosis plasmid) and pRL7 and
pRL8 are transferable by conjugation [17].
The distinction between 'chromosome' and 'plasmid' has
become blurred in recent years with the discovery that many
bacteria have more than one replicon with over a million base
pairs. For example, the second replicon of Brucella melitensis
16M is called a chromosome (1.18 Mb) [3], whereas the
equivalent in S. meliloti 1021 is referred to as a megaplasmid
(pSymB; 1.68 Mb) [8]. They both replicate using the repABC
system as is typical of plasmids, and both carry the only
copies of certain essential genes, although the B. melitensis
chromosome II has many more of these as well as a complete
ribosomal RNA operon. What combination of size, replication
system, rRNA genes, and essentiality should qualify a
replicon to be called a chromosome is probably more a matter of
semantics than of biology.
A more important distinction, in our view, is between 'core'
and 'accessory' genomes. This distinction predates the
genomics era; indeed, it has been discussed for more than a
quarter of a century. Davey and Reanney [18] contrasted
'universal' and 'peripheral' genes, or 'conserved' and
'experimental' DNA. Campbell [19] wrote of 'euchromosomal' and
'accessory' DNA and explained how gene transfer was
important in shaping the latter. He pointed out that genes carried by
plasmids or transposons were 'available to all cells of the
species, though not actually present in them' and 'should
typically be genes that are needed occasionally rather than
continually under natural conditions'. Furthermore, the need
to function in different genetic backgrounds meant that
'evolution must limit the development of specific interactions
between their products and those of universal genes'. This
would tend to sharpen the separation between the
euchromosomal and accessory gene pools, although transfer between
them would remain possible.
The expectation is that particular accessory genes will often
be absent from closely related strains or species, and as
comparative data became available such genes were indeed found
in large numbers [20]. They often had a nucleotide
composition different from the bulk of the genome, and this property
ha (...truncated)