Mutational analyses of dinucleotide and tetranucleotide microsatellites in Escherichia coli: influence of sequence on expansion mutagenesis
Kristin A. Eckert and Guang Yan
The Jake Gittlen Cancer Research Institute, The Pennsylvania State University College of Medicine, PO Box 850, Hershey, PA 17033, USA
Mutagenesis at [GT/CA]10, [TC/AG]11 and [TTCC/AAGG]9 microsatellite sequences inserted in the herpes simplex virus thymidine kinase (HSV-tk) gene was analyzed in isogenic mutL+ and mutL- Escherichia coli. In both strains, significantly more expansion than deletion mutations were observed at the [TTCC/AAGG]9 motif relative to either dinucleotide motif. As the HSV-tk coding sequence contains an endogenous [G/C]7 mononucleotide repeat and ~1000 bp of unique sequence, we were able to compare mutagenesis among various sequence motifs. We observed that the relative risk of mutation in E.coli is: [TTCC/AAGG]9 > [GT/CA]10 ~ [TC/AG]11 > unique ~ [G/C]7. The mutation frequency varied 1400-fold in mutL+ cells between the tetranucleotide motif and the mononucleotide motif, but only 50-fold in mutL- cells. Loss of mismatch repair destabilized the [G/C]7 sequence the most and the tetranucleotide motif the least. These results demonstrate that the quantitative risk of mutation at various microsatellites depends greatly on the DNA sequence composition. We suggest alternative models for the production of expansion mutations during lagging strand replication of the [TTCC/AAGG]9 microsatellite.
Microsatellite sequences of 1–4 or 5 nt per repeat unit are
ubiquitous throughout the human genome (1–4). These repetitive
sequences can be found flanking coding sequences and within
introns, as transcribed but untranslated genomic regions
(1,5,6). Unfortunately, our knowledge as to the exact number,
sequence composition and genomic location of microsatellites
is biased by the large proportion of cDNA sequences that
constitute the current genomic sequence databases (2,5,6), and
a full appreciation of this class of repetitive sequence must
await completion of the various genome projects. Nevertheless,
evidence exists supporting a role for [GT/CA]n and [TC/AG]n
sequences in the regulation of gene expression (7–9) and in
modulating chromatin structure (10). Moreover, within the
past decade, a direct involvement of microsatellite sequences
in human disease has been demonstrated (11).
Microsatellite sequences influence the local geometry of
DNA due to the potential for adopting non-B DNA conformations
(10). Repeats of alternating purine-pyrimidine bases (e.g., GT/CA)
can form Z-DNA, and repeats of polypurine and
polypyrimidine tracts (e.g., TC/AG) can form triplex DNA. In
addition, particular trinucleotide sequences (e.g., CGG/GCC)
have the potential to form stable hairpin structures (12). The
effect of non-B DNA forms on DNA metabolism, including
replicative, repair and recombination processes, has not been
studied rigorously. The ability of long microsatellite sequences
that are capable of forming both triplex and hairpin structures
to arrest DNA synthesis in vitro (13,14) forms the basis of
current models for preferential genetic expansion of these
alleles in vivo (15,16).
A large base of knowledge exists in several model systems,
including Escherichia coli, yeast and human cells, regarding
the genetic stability of mono-, di- and trinucleotide
microsatellite sequences (16–21). The favored mechanism to explain
alterations in microsatellite allele size is slipped strand
mispairing between repeat units during replicative or repair
DNA synthesis (22,23). Consistent with this model, loss of
DNA polymerase proofreading activity or post-replication
mismatch repair (MMR) greatly enhances the rate of [A]n and
[GT/CA]n microsatellite tract alterations (16–18,21,24).
However, little is known about the genetic factors controlling
tetranucleotide sequence stability. In yeast, the mutation rate
for a [CAGT/GTCA]n allele was similar to that of a [GT/CA]n
allele, and was increased in an MMR-deficient strain (25). In
human cell lines, direct measurements of microsatellite alleles
have yielded estimated mutation rates for [GATA/CTAT]n
sequences that are significantly higher than rates for [GT/CA]n
sequences (26), and the mutation rate for an [AAAG/TTTC]n
microsatellite is one of the highest measured for microsatellites
in human cells (27). Nevertheless, mathematical modeling of
mutation rates at various microsatellites in the genome
databases has failed to show a significant difference in mutability
between di- and tetranucleotide sequences (28,29).
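The slipped strand mispairing model described above changes a tract by whole repeat units. The following minimal Python sketch illustrates only that bookkeeping; the function name `slipped_tract` and the restriction to a single strand are assumptions made for illustration and are not part of the experimental method.

```python
def slipped_tract(unit: str, n_units: int, delta_units: int) -> str:
    """Return a repeat tract after gaining (delta_units > 0) or losing
    (delta_units < 0) whole repeat units.

    Hypothetical helper: it models only the net length change expected
    from slippage, not the replication intermediate itself.
    """
    new_n = n_units + delta_units
    if new_n < 0:
        raise ValueError("cannot delete more units than the tract contains")
    return unit * new_n


# Tract sizes taken from the alleles named in this study.
ALLELES = [("GT", 10), ("TC", 11), ("TTCC", 9)]

for unit, n in ALLELES:
    original = unit * n
    expanded = slipped_tract(unit, n, +1)   # one-unit expansion
    deleted = slipped_tract(unit, n, -1)    # one-unit deletion
    print(unit, n, len(original), len(expanded), len(deleted))
```

A one-unit event changes a dinucleotide tract by 2 nt and a tetranucleotide tract by 4 nt, so the same number of slipped units produces a larger length change at the tetranucleotide allele.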
In this study, we compared the stability of the di- and
tetranucleotide sequences [TC/AG]n and [TTCC/AAGG]n in
MMR-proficient and -deficient E.coli strains. Our strategy
quantitated the mutability of the microsatellite sequences
relative to coding sequences within the same genetic target, the
herpes simplex virus type 1 thymidine kinase (HSV-tk) gene.
We observed a significantly greater incidence of expansion
mutations at [TC/AG]n and [TTCC/AAGG]n alleles, relative to a
[GT/CA]n allele. The frequency of mutation at the tetranucleotide
locus was up to 40-fold higher than the mutation frequencies at
both dinucleotide loci, and MMR affected tetranucleotide
stability to only a minor extent.
MATERIALS AND METHODS
Escherichia coli strains
Strain FT334 is a derivative of HB101 (30) with the following
genotype: tdk, upp, thi-1, hsd20, supE44, lacY1, proA2, ara-14,
galK2, xyl-5, mtl-1, leuB6, rpsL20, recA13. Strain PP102 is
isogenic to strain FT334, except for the following alleles:
recA306 srl::Tn10, mutL::Tn5 (P.Prince and R.Monnat,
University of Washington, personal communication).
Oligonucleotides used to construct the microsatellite
sequences were synthesized by Biosynthesis, Inc (Lewisville,
TX) or the Macromolecular Core Facility, Penn State College
of Medicine (Hershey, PA). All restriction endonucleases were
supplied by Gibco BRL Life Technologies (Gaithersburg, MD)
and used according to the manufacturers' instructions.
5-Fluoro-2'-deoxyuridine (FUdR) and chloramphenicol were purchased
from Sigma Chemical Co. (St Louis, MO).
Construction of artificial-microsatellite-containing vectors
All artificial microsatellite sequences were inserted in-frame
between bases 111 and 112 of the HSV-tk gene, in the
sequence context [GT (insert) TCTC]. In the unidirectional
vectors described, the first sequence listed serves as the
template for the leading strand of replication, and the second
sequence serves as the template of the lagging strand.
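The insert notation above can be made concrete with a short Python sketch. It assumes that the two motifs in a name such as [TTCC/AAGG]9 describe complementary strands of the same tract, and it uses the flanking context [GT (insert) TCTC] stated in the text. The class name `Microsatellite` and its methods are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Microsatellite:
    leading_unit: str  # first motif listed: leading-strand template
    lagging_unit: str  # second motif listed: lagging-strand template
    n_units: int

    def tract(self) -> str:
        """Repeat tract written as the first-listed motif."""
        return self.leading_unit * self.n_units

    def in_context(self, left: str = "GT", right: str = "TCTC") -> str:
        """Tract placed in the flanking context [GT (insert) TCTC]."""
        return left + self.tract() + right

    def strands_consistent(self) -> bool:
        """Check that the second motif, repeated, occurs in the reverse
        complement of the tract (allowing for a shift in repeat register)."""
        rc = reverse_complement(self.tract())
        return self.lagging_unit * (self.n_units - 1) in rc


tetra = Microsatellite("TTCC", "AAGG", 9)
print(tetra.in_context())
print(tetra.strands_consistent())
```

The same check can be run for the [GT/CA]10 and [TC/AG]11 inserts by changing the three constructor arguments.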
Construction of [GT/CA]10 and [TC/AG]11 microsatellite-containing
plasmids has been described (20), and the same
method was used to construct the [TTCC/AAGG]3 vector. The
[TTCC/AAGG]9 and [TC/AG]18 microsatellite inserts were
synthesized by an in vitro DNA polymerase reaction. A 111
base oligonucleotide, corresponding to the HSV-tk sense
strand (nucleotides 73–147) and containing the microsatellite
sequence to be inserted, was primed by hybridization of a
15mer oligonucleotide at a 1:1 molar ratio. This substrate was
used as a DNA template for native T7 DNA polymerase in a (...truncated)