Patterns and Implications of Gene Gain and Loss in the Evolution of Prochlorococcus
Citation: Kettler CG, Martiny AC, Huang K, Zucker J, Coleman ML, et al. (
Gregory C. Kettler
Adam C. Martiny
Katherine Huang
Jeremy Zucker
Maureen L. Coleman
Sebastien Rodrigue
Feng Chen
Alla Lapidus
Steven Ferriera
Justin Johnson
Claudia Steglich
George M. Church
Paul Richardson
Sallie W. Chisholm
1 Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
2 Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
3 Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
4 Joint Genome Institute, United States Department of Energy, Walnut Creek, California, United States of America
5 J. Craig Venter Institute, Rockville, Maryland, United States of America
6 Department of Biology II/Experimental Bioinformatics, University Freiburg, Freiburg, Germany
Current address: Department of Earth System Science and Department of Ecology and Evolutionary Biology, University of California, Irvine, California, United States of America
Editor: David S. Guttman, University of Toronto, Canada
Prochlorococcus is a marine cyanobacterium that numerically dominates the mid-latitude oceans and is the smallest known oxygenic phototroph. Numerous isolates from diverse areas of the world's oceans have been studied and shown to be physiologically and genetically distinct. All isolates described thus far can be assigned to either a tightly clustered high-light (HL)-adapted clade, or a more divergent low-light (LL)-adapted group. The 16S rRNA sequences of the entire Prochlorococcus group differ by at most 3%, and the four initially published genomes revealed patterns of genetic differentiation that help explain physiological differences among the isolates. Here we describe the genomes of eight newly sequenced isolates and combine them with the first four genomes for a comprehensive analysis of the core (shared by all isolates) and flexible genes of the Prochlorococcus group, and the patterns of loss and gain of the flexible genes over the course of evolution. There are 1,273 genes that represent the core shared by all 12 genomes. They are apparently sufficient, according to metabolic reconstruction, to encode a functional cell. We describe a phylogeny for all 12 isolates by subjecting their complete proteomes to three different phylogenetic analyses. For each non-core gene, we used a maximum parsimony method to estimate which ancestor likely first acquired or lost each gene. Many of the genetic differences among isolates, especially for genes involved in outer membrane synthesis and nutrient transport, are found within the same clade. Nevertheless, we identified some genes defining HL and LL ecotypes, and clades within these broad ecotypes, helping to demonstrate the basis of HL and LL adaptations in Prochlorococcus. Furthermore, our estimates of gene gain events allow us to identify highly variable genomic islands that are not apparent through simple pairwise comparisons. 
These results emphasize the functional roles, especially those connected to outer membrane synthesis and transport, that dominate the flexible genome and set it apart from the core. Besides identifying islands and demonstrating their role throughout the history of Prochlorococcus, reconstruction of past gene gains and losses shows that much of the variability exists at the "leaves of the tree," between the most closely related strains. Finally, the identification of core and flexible genes from this 12-genome comparison is largely consistent with the relative frequency of Prochlorococcus genes found in global ocean metagenomic databases, further closing the gap between our understanding of these organisms in the lab and the wild.
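The core/flexible split used throughout this analysis can be illustrated with a small sketch. It assumes orthologous genes have already been clustered into groups and that each genome is summarized as the set of ortholog-group IDs it contains. The function name, strain labels, and group IDs below are hypothetical and serve only as an illustration, not as the pipeline used in the study.

```python
from typing import Dict, Set, Tuple


def partition_core_flexible(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split ortholog-group IDs into core and flexible sets.

    core     = groups present in every genome
    flexible = groups present in at least one genome but not in all
    """
    if not genomes:
        return set(), set()
    gene_sets = list(genomes.values())
    pan_genome = set().union(*gene_sets)     # every group seen in any genome
    core = set.intersection(*gene_sets)      # groups shared by all genomes
    return core, pan_genome - core


# Toy input: strain labels and ortholog-group IDs are made up for illustration.
toy_genomes = {
    "strain_A": {"og1", "og2", "og3"},
    "strain_B": {"og1", "og2", "og4", "og5"},
    "strain_C": {"og1", "og2", "og5"},
}

core, flexible = partition_core_flexible(toy_genomes)
print(sorted(core))      # ['og1', 'og2']
print(sorted(flexible))  # ['og3', 'og4', 'og5']
```

Applied to the 12 genomes, a partition of this kind is what yields a count such as the 1,273 core genes reported here, although the result depends on how orthologs are defined upstream.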
The oceans play a key role in global nutrient cycling and
climate regulation. The unicellular cyanobacterium
Prochlorococcus is an important contributor to these processes, as it
accounts for a significant fraction of primary productivity in
low- to mid-latitude oceans [1]. Prochlorococcus and its close
relative, Synechococcus [2], are distinguished by their
photosynthetic machinery: Prochlorococcus uses chlorophyll-binding
proteins instead of phycobilisomes for light harvesting and
divinyl instead of monovinyl chlorophyll pigments. Although
Prochlorococcus and Synechococcus coexist throughout much of
the world's oceans, Synechococcus extends into more polar
regions and is more abundant in nutrient-rich waters, while
Prochlorococcus dominates relatively warm, oligotrophic
regions and can be found at greater depths [3]. The
Prochlorococcus group consists of two major ecotypes,
high-light (HL)-adapted and low-light (LL)-adapted, that are
genetically and physiologically distinct [4] and are distributed
differently in the water column [5,6]. Given their relatively
simple metabolism, well-characterized marine environment,
and global abundance, these marine cyanobacteria represent
an excellent system for understanding how genetic
differences translate to physiological and ecological variation in
natural populations.

Author Summary
Prochlorococcus, the most abundant photosynthetic microbe living
in the vast, nutrient-poor areas of the ocean, is a major contributor
to the global carbon cycle. Prochlorococcus is composed of closely
related, physiologically distinct lineages whose differences enable
the group as a whole to proliferate over a broad range of
environmental conditions. We compare the genomes of 12 strains
of Prochlorococcus representing its major lineages in order to
identify genetic differences affecting the ecology of different
lineages and their evolutionary origin. First, we identify the core
genome: the 1,273 genes shared among all strains. This core set of
genes encodes the essentials of a functional cell, enabling it to make
living matter out of sunlight and carbon dioxide. We then create a
genomic tree that maps the gain and loss of non-core genes in
individual strains, showing that a striking number of genes are
gained or lost even among the most closely related strains. We find
that lost and gained genes commonly cluster in highly variable
regions called genomic islands. The level of diversity among the
non-core genes, and the number of new genes added with each
new genome sequenced, suggest far more diversity to be
discovered.

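Mapping gene gain and loss onto a tree, as this study does with a maximum parsimony method, can be sketched for a single gene as follows. The sketch uses unweighted Fitch parsimony on a rooted binary tree. That choice is an assumption made for illustration, and the method actually used in the study may weight gains and losses differently. The tree shape, leaf labels, function name, and the `root_state` tie-breaking parameter are all hypothetical.

```python
from typing import Dict, List, Tuple, Union

# A rooted binary tree: a leaf is a strain label, an internal node is a pair.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_gain_loss(tree: Tree, present: Dict[str, bool],
                    root_state: bool = False) -> List[Tuple[str, str]]:
    """Infer gain/loss events for one gene with unweighted Fitch parsimony.

    Returns (event, clade) pairs, where clade is the '+'-joined leaf labels
    below the branch on which the event is placed. `root_state` is only used
    to break a tie when the root itself is ambiguous (an assumption).
    """
    events: List[Tuple[str, str]] = []

    def leaves(node: Tree) -> List[str]:
        return [node] if isinstance(node, str) else leaves(node[0]) + leaves(node[1])

    def up(node: Tree) -> set:
        # Bottom-up pass: intersection of child state sets, else their union.
        if isinstance(node, str):
            return {present[node]}
        a, b = up(node[0]), up(node[1])
        return (a & b) or (a | b)

    def down(node: Tree, parent_state: bool) -> None:
        # Top-down pass: keep the parent's state whenever it is allowed.
        states = up(node)  # recomputed for brevity; fine for small trees
        state = parent_state if parent_state in states else next(iter(states))
        if state != parent_state:
            events.append(("gain" if state else "loss", "+".join(sorted(leaves(node)))))
        if not isinstance(node, str):
            down(node[0], state)
            down(node[1], state)

    root_states = up(tree)
    root = root_state if root_state in root_states else next(iter(root_states))
    if not isinstance(tree, str):
        down(tree[0], root)
        down(tree[1], root)
    return events


# Hypothetical five-strain tree and presence/absence patterns for two genes.
toy_tree: Tree = (("A", "B"), ("C", ("D", "E")))
gene_x = {"A": False, "B": False, "C": False, "D": True, "E": True}
gene_y = {"A": True, "B": True, "C": True, "D": True, "E": False}

print(fitch_gain_loss(toy_tree, gene_x))  # [('gain', 'D+E')]
print(fitch_gain_loss(toy_tree, gene_y))  # [('loss', 'E')]
```

In this toy case, one gene is gained once on the branch leading to the D+E clade and another is lost on a single leaf. Repeating the inference for every non-core gene and tallying events per branch gives the kind of per-branch gain and loss counts that reveal how much variation sits between the most closely related strains.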
The first marine cyanobacterial genome sequences
suggested progressive genome decay from Synechococcus to LL
Prochlorococcus to HL Prochlorococcus, characterized by a
reduction in genome size (from 2.4 to 1.7 Mb) and a drop
in G+C content from ~59% to ~30% [7–9]. Notably, genes
involved in light acclimation and nutrient assimilation
appeared to have been sequentially lost, consistent with the
niche differentiati (...truncated)