Dual Modes of Natural Selection on Upstream Open Reading Frames

Molecular Biology and Evolution, Aug 2007

Upstream open reading frames (uORFs) are common features of eukaryotic genes, occurring in 10%–25% of 5′ leader sequences. Upstream ORFs that have been subjected to experimental analysis have been generally found to decrease translational efficiency of the downstream coding sequence. Previous investigations of uORFs in mammals and yeast have detected uORFs conserved over long evolutionary distances, prompting speculation about the nature and cause of the natural selection underlying such conservation. We have analyzed uORFs in the basidiomycetous fungal pathogen Cryptococcus neoformans to discern the properties of this purifying selection. We find that uORFs in the Cryptococcus species complex are conserved at twice the expected rate, and we report 122 uORFs that are conserved among all four sequenced Cryptococcus strains. A significantly greater proportion of uORF losses occur via direct mutation to the uORF start codon than expected. This observation suggests that mutational disruption of a uORF that leaves the start codon intact may be selectively disadvantageous, perhaps because of the risk of premature translation initiation. Accounting for this constrained mode of loss and comparing the relative conservation of uORFs between the 5′ leader and control sequences enables us to calculate that at least a third of uORFs may be conserved for their effects on translational efficiency. The remaining fraction may be conserved either by chance or as a result of selective pressure to prevent premature translation initiation from the uORF start codon. We find that the majority of conserved uORFs do not exhibit codon usage bias or conservation at the amino acid level, and therefore they do not likely encode bioactive peptides. Our analysis suggests that uORFs are an important and underappreciated mechanism of post-transcriptional gene regulation in eukaryotes.

https://mbe.oxfordjournals.org/content/24/8/1744.full.pdf

Daniel E. Neafsey and James E. Galagan
Microbial Analysis Group, Broad Institute of MIT and Harvard, Cambridge, Massachusetts

Introduction

Microarrays have given the biological community abundant genome-wide data on rates of DNA transcription. The relative ease with which microarray data can now be acquired should not obscure the fact that transcription is not synonymous with expression. Indeed, there is growing evidence of significant variation in mRNA transcript half-life (Wang et al. 2002) and translational efficiency among genes (Serikawa et al. 2003; MacKay et al. 2004). To make fullest use of transcriptional data, then, it is imperative to understand what factors may intercede at the translation stage to decouple levels of transcription and expression.

Short open reading frames in the 5′ leader sequence of genes called upstream open reading frames (uORFs) are known to affect the translational efficiency of many eukaryotic genes (Morris and Geballe 2000; Meijer and Thomas 2002; Vilela and McCarthy 2003). Upstream ORFs are common genomic features, with estimates of uORF incidence in mammalian genes ranging as high as 25% (Crowe, Wang, and Rothnagel 2006) and 10%–22% of fungal genes (Galagan et al. 2005). Although some uORFs may augment expression by obscuring other cis-acting inhibitory elements (Geballe and Sachs 2000), most experimentally tested eukaryotic uORFs are translational repressors. Upstream ORFs have been shown to affect translational efficiency negatively through a variety of means, including ribosome-blocking by the encoded peptide, ribosome stalling at the uORF termination codon, induction of the nonsense-mediated decay (NMD) pathway, and failure of the ribosome to re-initiate at the genic translation start site after disengaging from the uORF (Gaba et al. 2001).
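To make the definition of a uORF concrete, the following minimal Python sketch scans a 5′ leader sequence for AUG triplets that are followed by an in-frame stop codon before the leader ends. The function name `find_uorfs`, the example sequence, and the choice to skip uAUGs with no in-frame stop inside the leader are assumptions made for illustration only; this is not the annotation procedure used in the study.

```python
# Illustrative sketch only: a naive uORF scan over a 5' leader sequence.
# All names and the example sequence are hypothetical.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(leader):
    """Return (start, end) index pairs of ORFs lying wholly within the leader.

    `start` is the 0-based index of the A in the upstream ATG and `end` is
    the index just past the stop codon, so leader[start:end] is the uORF.
    """
    # Accept RNA or DNA input in either case.
    seq = leader.upper().replace("U", "T")
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the start codon until a stop is found.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
        # An upstream ATG with no in-frame stop inside the leader is
        # skipped here (assumption: only fully contained uORFs count).
    return uorfs


if __name__ == "__main__":
    example_leader = "CCATGAAATAGGG"  # hypothetical leader
    for start, end in find_uorfs(example_leader):
        print(start, end, example_leader[start:end])
```

The sketch reports overlapping uORFs independently, one per upstream start codon, and it ignores sequence context around the AUG, which a real analysis would probably need to consider.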
Upstream ORFs that have been experimentally tested through cell-free translation assays or other means have been found to decrease the rate of translation up to 20-fold (Hinnebusch 2005), although some uORFs appear to have little impact, or a variable impact, on translation rates (e.g., Wang and Rothnagel 2004). In accordance with the scanning model of translation initiation (Kozak 1994), it has been suggested that some uORFs may be conserved to prevent deleterious premature translation initiation from upstream AUG (uAUG) triplets (Iacono, Mignone, and Pesole 2005; Lynch, Scofield, and Hong 2005; Lynch 2006). Premature translation initiation leading to genic read-through would, at best, add extraneous peptides to the N-terminus of the encoded protein if the uAUG were in the same reading frame as the genic ORF, and, at worst, it would create a frameshift-induced nonsense mutation and entirely eliminate translation of the genic ORF. In this latter case, even if the uORF decreases the translation rate of the adjacent genic sequence, the phenotypic effect may be less severe than premature translation initiation, which results in the ribosomes reading through the genic translation start site. This hypothesis is supported by the observation that uAUGs are significantly underrepres (...truncated)



Daniel E. Neafsey, James E. Galagan. Dual Modes of Natural Selection on Upstream Open Reading Frames, Molecular Biology and Evolution, 2007, pp. 1744-1751, 24/8, DOI: 10.1093/molbev/msm093