Extreme Reconfiguration of Plastid Genomes in the Angiosperm Family Geraniaceae: Rearrangements, Repeats, and Codon Usage

Molecular Biology and Evolution, Jan 2011

Geraniaceae plastid genomes (plastomes) have experienced a remarkable number of genomic changes. The plastomes of Erodium texanum, Geranium palmatum, and Monsonia speciosa were sequenced and compared with other rosids and the previously published Pelargonium hortorum plastome. Geraniaceae plastomes were found to be highly variable in size, gene content and order, repetitive DNA, and codon usage. Several unique plastome rearrangements include the disruption of two highly conserved operons (S10 and rps2-atpA), and the inverted repeat (IR) region in M. speciosa does not contain all genes in the ribosomal RNA operon. The sequence of M. speciosa is unusually small (128,787 bp); among angiosperm plastomes sequenced to date, only those of nonphotosynthetic species and those that have lost one IR copy are smaller. In contrast, the plastome of P. hortorum is the largest, at 217,942 bp. These genomes have experienced numerous gene and intron losses and partial and complete gene duplications. Some of the losses are shared throughout the family (e.g., trnT-GGU and the introns of rps16 and rpl16); however, other losses are homoplasious (e.g., trnG-UCC intron in G. palmatum and M. speciosa). IR length is also highly variable. The IR in P. hortorum was previously shown to be greatly expanded to 76 kb, and the IR is lost in E. texanum and reduced in G. palmatum (11 kb) and M. speciosa (7 kb). Geraniaceae plastomes contain a high frequency of large repeats (>100 bp) relative to other rosids. Within each plastome, repeats are often located at rearrangement end points and many repeats shared among the four Geraniaceae flank rearrangement end points. GC content is elevated in the genomes and also in coding regions relative to other rosids. Codon usage per amino acid and GC content at third position sites are significantly different for Geraniaceae protein-coding sequences relative to other rosids. 
Our findings suggest that relaxed selection and/or mutational biases lead to increased GC content, and this in turn altered codon usage. We propose that increases in genomic rearrangements, repetitive DNA, nucleotide substitutions, and GC content may be caused by relaxed selection resulting from improper DNA repair.
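The two codon-level statistics named in the abstract, codon usage per amino acid and GC content at third codon positions, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`gc3`, `codon_usage_per_amino_acid`) and the toy sequence are hypothetical, and the sketch assumes an in-frame coding sequence read with the standard amino acid assignments. As a simplification, Met, Trp, and stop codons are included in the third-position count.

```python
from collections import Counter, defaultdict

# Standard amino acid assignments, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons(cds):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def gc3(cds):
    """Fraction of unambiguous codons whose third position is G or C."""
    thirds = [c[2] for c in codons(cds) if c in CODON_TABLE]
    if not thirds:
        return 0.0
    return sum(base in "GC" for base in thirds) / len(thirds)


def codon_usage_per_amino_acid(cds):
    """For each observed codon, its share among all codons for the same amino acid."""
    counts = Counter(c for c in codons(cds) if c in CODON_TABLE)
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {codon: n / totals[CODON_TABLE[codon]] for codon, n in counts.items()}


# Hypothetical toy sequence: Met, four different Ala codons, stop.
toy = "ATGGCTGCCGCAGCGTAA"
print(gc3(toy))                                 # 0.5
print(codon_usage_per_amino_acid(toy)["GCT"])   # 0.25
```

Comparing these per-gene values between Geraniaceae and other rosids would still require a statistical test, which the sketch does not attempt.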

Mary M. Guisinger,*,1,2 Jennifer V. Kuehl,3 Jeffrey L. Boore,3,4,5 and Robert K. Jansen1

1 Section of Integrative Biology, University of Texas, Austin
2 Department of Plant Microbial Biology, University of California, Berkeley
3 DOE Joint Genome Institute and Lawrence Berkeley National Laboratory, Walnut Creek, California
4 Genome Project Solutions, Hercules, California
5 Department of Integrative Biology, University of California, Berkeley

*Corresponding author.
Associate editor: Charles Delwiche

Key words: plastid genomics, molecular evolution, Geraniaceae, Erodium, Geranium, Monsonia.

Introduction

Comparisons among the approximately 130 land plant plastid genomes (plastomes) available on GenBank show that genome size, gene content, gene order, and rates of sequence evolution are generally conserved. Most have a quadripartite structure with two copies of a large inverted repeat (IR) separating two unequally sized single-copy regions, termed the large and small single-copy regions. Land plant plastomes generally range in size from 108 to 165 kb and usually contain 110–130 distinct genes (reviewed in Raubeson and Jansen 2005; Bock 2007). The majority of these genes (about 80) code for proteins and are mostly involved in photosynthesis or gene expression, with the remainder being transfer RNA (tRNA) (about 30) or ribosomal RNA (rRNA) (4) genes. GC content is also highly conserved in the plastomes of land plants and is typically in the range of 30–40%, with GC content being lower in noncoding intergenic regions than in coding regions (reviewed in Bock 2007).
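The contrast between coding and noncoding GC content described above can be made concrete with a short sketch. It is illustrative only: the function names and the toy genome are hypothetical, and the annotation is assumed to arrive as 0-based, half-open (start, end) intervals on a single strand, which is simpler than a real GenBank feature table.

```python
def gc_fraction(seq):
    """GC fraction over unambiguous bases (A, C, G, T); 0.0 if there are none."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def split_by_annotation(genome, coding_intervals):
    """Return (coding, noncoding) sequence, given 0-based half-open coding intervals."""
    is_coding = [False] * len(genome)
    for start, end in coding_intervals:
        # Clip each interval to the genome so out-of-range coordinates are ignored.
        for i in range(max(start, 0), min(end, len(genome))):
            is_coding[i] = True
    coding = "".join(b for b, flag in zip(genome, is_coding) if flag)
    noncoding = "".join(b for b, flag in zip(genome, is_coding) if not flag)
    return coding, noncoding


# Hypothetical toy genome with one annotated coding interval.
genome = "ATATGCGCGCATAT"
coding, noncoding = split_by_annotation(genome, [(4, 10)])
print(gc_fraction(coding), gc_fraction(noncoding))  # 1.0 0.0
```

Marking positions rather than slicing intervals means overlapping genes are counted once, which matters when genes are duplicated or overlap.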
The strong AT bias is reflected in codon usage, where an A or T is preferred in the third position of synonymous codons (Shimada and Sugiura 1991). Raubeson et al. (2007) examined GC content and codon bias in two early diverging land plants, Nuphar and Ranunculus, and found that GC content could not explain codon usage patterns. Raubeson et al. (2007) suggested that an error checking bias of the plastid DNA polymerase and/or efficiency for DNA denaturation during replication or transcription likely affect GC content in plastomes. On the other hand, strong evidence from nematode nuclear genomes shows that GC content influences both codon usage and amino acid composition and that GC content is probably driven by directional mutation pressure (Mitreva et al. 2006).

© The Author 2010. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. Mol. Biol. Evol. 28(1):583–600. 2011. doi:10.1093/molbev/msq229. Advance Access publication August 30, 2010.

[...] plastid-encoded NADH dehydrogenase (ndh) genes was suggested for Erodium chrysanthum (Guisinger et al. 2008). The unusual features exhibited by Geraniaceae organellar genomes make this an ideal family to study plastome evolution. Aside from data gathered from restriction site mapping studies (Palmer et al. 1987; Price et al. 1990) and from the complete plastome sequence of P. hortorum (Chumley et al. 2006), relatively little is known about the extent of genomic change throughout the Geraniaceae.
The goals of the current study were to 1) characterize plastomes from the other major lineages in the family, 2) compare and contrast genome size and gene content in Geraniaceae plastomes relative to each other and to other representative rosids, 3) examine the extent of repetitive DNA in Geraniaceae plastomes with an emphasis on the role that repeats might play in genome rearrangement, and 4) characterize codon and tRNA use in Geraniaceae plastomes relative to other rosids. The last goal is particularly relevant given the loss of the tRNA gene trnT-GGU in P. hortorum (Palmer et al. 1987; Chumley et al. 2006).

Materials and Methods

Taxon Sampling and Sample Preparation

Based on previous phylogenies of the family (Price and Palmer 1993; Parkinson et al. 2005; Guisinger et al. 2008), taxa were chosen from each additional major lineage in Geraniaceae. The family comprises approximately 800 species and 5 genera, namely, Erodium, the monotypic genus California (formerly in Erodium), Geranium, Monsonia (circumscribed with Sarcocaulon; Albers 1996), and Pelargonium. The sequence of P. hortorum was previously published (Chumley et al. 2006). Plant material from Erodium texanum, Geranium palmatum, and Monsonia speciosa was used, and protocols for plastid isolations have been previously described (Jansen (...truncated)


This is a preview of a remote PDF: https://academic.oup.com/mbe/article-pdf/28/1/583/13649058/msq229.pdf
Article home page: https://academic.oup.com/mbe/article/28/1/583/984367

Guisinger, Mary M., Kuehl, Jennifer V., Boore, Jeffrey L., Jansen, Robert K.. Extreme Reconfiguration of Plastid Genomes in the Angiosperm Family Geraniaceae: Rearrangements, Repeats, and Codon Usage, Molecular Biology and Evolution, 2011, pp. 583-600, Volume 28, Issue 1, DOI: 10.1093/molbev/msq229