Massive functional mapping of a 5′-UTR by saturation mutagenesis, phenotypic sorting and deep sequencing
Erik Holmqvist
Johan Reimegård
E. Gerhart H. Wagner
Department of Cell and Molecular Biology, Biomedical Center, Uppsala University, SciLifeLab Uppsala, Box 596, S-75124 Uppsala, Sweden
We present here a method that enables functional screening of large numbers of mutations in a single experiment through the combination of random mutagenesis, phenotypic cell sorting and high-throughput sequencing. As a test case, we studied post-transcriptional gene regulation of the bacterial csgD messenger RNA, which is regulated by a small RNA (sRNA). A 109 bp sequence within the csgD 5′-UTR, containing all elements for expression and sRNA-dependent control, was mutagenized close to saturation. We monitored expression from a translational gfp fusion and collected fractions of cells with distinct expression levels by fluorescence-activated cell sorting. Deep sequencing of mutant plasmids from cells in different activity-sorted fractions identified functionally important positions in the messenger RNA that affect intrinsic (translational activity per se) and extrinsic (sRNA-based) gene regulation. The results obtained corroborate previously published data. In addition to pinpointing nucleotide positions that change expression levels, our approach also reveals mutations that are silent in terms of gene expression and/or regulation. This method provides a simple and informative tool for studies of regulatory sequences in RNA, in particular for addressing RNA structure-function relationships (e.g. sRNA-mediated control, riboswitch elements). With slight protocol modifications, it also permits mapping of functional DNA elements and functionally important regions in proteins.
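As an illustration of the sort-and-sequence readout described above, the following minimal sketch scores each position by how strongly substitutions at that position are enriched in one sorted fraction relative to the unsorted mutant library. It is not the analysis pipeline used in this work: the function names, the toy reference and reads, and the use of a pseudocount are all assumptions made for the example, and reads are assumed to be already trimmed to the mutagenized region.

```python
import math
from collections import Counter


def substitutions_per_position(reads, reference):
    """Count, for each 1-based position, the reads carrying a substitution there."""
    counts = Counter()
    for read in reads:
        if len(read) != len(reference):
            continue  # assumption: ignore reads with length changes (indels)
        for pos, (ref_base, base) in enumerate(zip(reference, read), start=1):
            if base != ref_base:
                counts[pos] += 1
    return counts


def log2_enrichment(fraction_reads, library_reads, reference, pseudocount=1.0):
    """log2 ratio of per-position mutation frequency, sorted fraction vs. library.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for positions with no observed substitutions.
    """
    frac = substitutions_per_position(fraction_reads, reference)
    lib = substitutions_per_position(library_reads, reference)
    scores = {}
    for pos in range(1, len(reference) + 1):
        frac_freq = (frac[pos] + pseudocount) / (len(fraction_reads) + pseudocount)
        lib_freq = (lib[pos] + pseudocount) / (len(library_reads) + pseudocount)
        scores[pos] = math.log2(frac_freq / lib_freq)
    return scores


# Hypothetical toy data: a 4-nt "region" instead of the 109 bp sequence.
reference = "ACGT"
library = ["ACGT", "TCGT", "AGGT", "ACCT", "ACGA"]   # unsorted mutant pool
low_gfp = ["TCGT", "TCGT", "TCGT", "ACGA"]           # one sorted fraction

scores = log2_enrichment(low_gfp, library, reference)
for pos in sorted(scores):
    print(pos, round(scores[pos], 2))
```

In this toy example, substitutions at position 1 are over-represented in the sorted fraction, so that position receives the highest score, whereas positions with scores near or below zero would be candidates for mutations that are silent or depleted in that fraction.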
INTRODUCTION
Forward and reverse genetics methods are valuable tools
to link phenotypes to DNA sequences. Forward genetics
identifies nucleotide changes that cause a phenotypic
change, and reverse genetics identifies the phenotype
associated with a particular mutation. In reverse
genetics, site-directed mutagenesis can be used, for
example, to pinpoint nucleotides in DNA/RNA sequences
at which regulators bind and to assess RNA structure-function
relationships [e.g. (1,2)]. Random mutagenesis by
polymerase chain reaction (PCR) under error-prone
conditions is a powerful method for creating large pools
of mutants (3). For instance, error-prone PCR followed
by fluorescence-activated cell sorting (FACS) analysis has
been used to generate variants of the green fluorescent
protein (GFP) with increased intensity and more efficient
folding (4). Related to the work presented here,
error-prone PCR followed by phenotypic screening has
identified base changes that affect expression and stability
of the small RNA (sRNA) MicA, as well as
MicA-dependent post-transcriptional regulation (5). Even though such
approaches have turned out to be successful in identifying
functionally important nucleotides, they are tedious and
suffer from low throughput, as each mutant needs to be
phenotypically assayed one by one.
To increase throughput in reverse genetics, several recent
articles have described methods for scoring effects on
gene expression from large numbers of sequence variants.
The RNA-ID method was designed to study cis-regulatory
RNA sequences in yeast; short random sequences were
inserted into a messenger RNA (mRNA), and effects
on translation efficiency were monitored by FACS of
fluorescent protein expression (6). Kudla et al. (7) used a library
of 154 synthetic variants of gfp to study gene expression
changes arising from synonymous mutations. Another
article reported on a multiple mutation-, FACS- and
high-throughput-sequencing method used for mapping
protein binding and its energetics in transcriptional
regulation (8). Two additional publications also describe similar
methods for transcriptional regulation (9,10) and use
mRNA abundance measurements as readout for gene
expression. However, changes in DNA sequences that
involve insertions (6) may introduce unwanted effects
arising from different sequence lengths of the analyzed
variants. Additionally, mRNA abundance does not always
accurately report a mutation's effect on gene expression,
as mRNA and protein levels are not always correlated, and
protein expression is often predominantly regulated at the
post-transcriptional level (9,11). For instance, studies of
several bacterial mRNAs showed that sRNA-mediated
regulation can give altered protein levels without
significantly affecting mRNA levels (12,13).
For functional screening and mapping of large numbers
of mutations in single experiments, we present here a
method that combines saturation mutagenesis, phenotypic
cell sorting and high-throughput sequencing. This method
is particularly powerful for studies of post-transcriptional
regulation, but it is easily adaptable for studies of
transcriptional regulation as well. Our method does not rely
on insertion of sequences but generates nucleotide
substitutions, eliminating the risk of unwanted effects throu (...truncated)