The complex methylome of the human gastric pathogen Helicobacter pylori
Juliane Krebes (1,2), Richard D. Morgan (0), Boyke Bunk (1,4), Cathrin Spröer (1,4), Khai Luong (3), Raphael Parusel (2), Brian P. Anton (0), Christoph König (3), Christine Josenhans (1,2), Jörg Overmann (1,4), Richard J. Roberts (0), Jonas Korlach (3), Sebastian Suerbaum (1,2)
0 New England Biolabs, 240 County Road, Ipswich, MA 01938, USA
1 German Center for Infection Research, Hannover-Braunschweig Site, Carl-Neuberg-Straße 1, 30625 Hannover, Germany
2 Institute of Medical Microbiology and Hospital Epidemiology, Hannover Medical School, Carl-Neuberg-Straße 1, 30625 Hannover, Germany
3 Pacific Biosciences, 1380 Willow Road, Menlo Park, CA 94025, USA
4 Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Inhoffenstraße 7B, 38124 Braunschweig, Germany
The genome of Helicobacter pylori is remarkable
for its large number of restriction-modification
(R-M) systems, and strain-specific diversity in R-M
systems has been suggested to limit natural
transformation, the major driving force of genetic
diversification in H. pylori. We have determined the
comprehensive methylomes of two H. pylori
strains at single-base resolution, using Single
Molecule Real-Time (SMRT) sequencing. For
strains 26695 and J99-R3, 17 and 22 methylated
sequence motifs were identified, respectively. For
most motifs, almost all sites occurring in the
genome were detected as methylated. Twelve
novel methylation patterns corresponding to nine
recognition sequences were detected (26695, 3;
J99-R3, 6). Functional inactivation, correction of
frameshifts as well as cloning and expression of
candidate methyltransferases (MTases) permitted
not only the functional characterization of multiple,
yet undescribed, MTases, but also revealed novel
features of both Type I and Type II R-M systems,
including frameshift-mediated changes of sequence
specificity and the interaction of one MTase with
two alternative specificity subunits resulting in
different methylation patterns. The methylomes of
these well-characterized H. pylori strains will
provide a valuable resource for future studies
investigating the role of H. pylori R-M systems in
limiting transformation as well as in gene regulation
and host interaction.
INTRODUCTION
The Gram-negative human pathogen Helicobacter pylori
chronically infects more than half of the world's population.
Helicobacter pylori infection induces inflammation of the
gastric mucosa, which can give rise to sequelae, such as
peptic ulcer disease and gastric cancer (1). Helicobacter
pylori is the bacterial pathogen with the highest genetic
diversity and variability (2–4), which is believed to
contribute to lifelong persistence by enabling adaptation to its
host (2,3). In addition to a high mutation rate (5),
recombination between different H. pylori strains during mixed
infections with multiple strains within one stomach is the
major driving force of allelic diversification (6–8). The
naturally competent H. pylori differs from other bacteria by
integrating unusually short fragments of DNA into its
chromosome after natural transformation (9). The
reasons for the small sizes of imports are largely
unknown, but differences in the genomic content of
active restriction-modification (R-M) systems have been
suggested to limit recombination between H. pylori
strains (10–12).
R-M systems are widely distributed among bacteria
and are found in >90% of the analyzed genomes (13).
Bacterial R-M systems were initially described as a
defence mechanism against bacteriophage infection
(14,15). They comprise two enzymatic activities: (i) a
methyltransferase (MTase) activity that catalyzes the
addition of a methyl group from the donor S-adenosyl
methionine (SAM) to adenine or cytosine, and (ii) a
restriction endonuclease (REase) activity that cleaves
internal phosphodiester bonds of the DNA backbone.
Both enzyme activities of the same system (cognate
enzymes) recognize the same specific nucleotide sequence
(recognition site), and methylation of the recognition site
prevents restriction. The three major groups of R-M
systems are classified as Type I, II and III, according to
their subunit composition, cofactor requirements,
structure of their recognition sequence and mode of action
[for detailed reviews see (16,17)]. Type I systems are the
most complex and form a heteropentamer (HsdR2M2S)
that exerts three functions: restriction (HsdR),
modification (HsdM) and specificity (HsdS). This complex works
both as REase and MTase, but HsdM2S alone is sufficient
for methylation. Sequence specificity of HsdR and HsdM
is achieved by HsdS, which is typically composed of two
target recognition domains (TRDs) mediating sequence
recognition on both DNA strands. The simplest systems
are the Type IIP R-M systems, which consist of two
separate polypeptides (REase, MTase) that act
independently of each other. Type III systems are also encoded by
two genes (mod and res). While the Mod subunit alone
achieves DNA modification, both subunits are required
for restriction. In contrast to typical Type II MTases,
which usually methylate 4–8 bp palindromic sites on
both DNA strands, Mod catalyzes hemi-methylation of
the DNA at 4–6 bp asymmetric recognition sites. More
recently, a fourth class of R-M systems has been added.
Type IV systems are encoded by one or two genes that
represent methyl-dependent REases (18).
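
The distinction drawn above between palindromic Type II sites and asymmetric Type III sites can be made concrete with a short sketch. The Python code below is illustrative only and is not part of the authors' analysis; the function names and the example sequences are hypothetical, and the code assumes unambiguous A/C/G/T motifs (degenerate IUPAC bases are not handled). A site is treated as palindromic when it equals its own reverse complement, in which case a single occurrence covers both DNA strands; an asymmetric site must be searched separately on each strand.

```python
# Illustrative sketch; names and example sequences are hypothetical,
# and only unambiguous A/C/G/T motifs are supported.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """A site is palindromic if it reads the same on both strands."""
    return site.upper() == reverse_complement(site)


def count_sites(genome: str, site: str) -> int:
    """Count occurrences of a recognition site on either strand.

    For a palindromic site the forward and reverse-complement patterns
    are identical, so each double-stranded site is counted once.
    Overlapping occurrences are counted.
    """
    genome = genome.upper()
    patterns = {site.upper(), reverse_complement(site)}
    total = 0
    for pattern in patterns:
        start = genome.find(pattern)
        while start != -1:
            total += 1
            start = genome.find(pattern, start + 1)
    return total


if __name__ == "__main__":
    print(is_palindromic("GATC"))             # symmetric 4-bp site
    print(is_palindromic("GCAG"))             # asymmetric 4-bp site
    print(count_sites("GCAGTTCTGC", "GCAG"))  # one hit per strand
```

Comparing such per-motif site counts with the number of sites detected as modified is one simple way to express what fraction of a motif's genomic occurrences is methylated.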
Adenine and cytosine are the only bases known to be
enzymatically methylated. In bacteria, three types of
methylation, N6-methyladenine (m6A), N4-methylcytosine
(m4C) and 5-methylcytosine (m5C) have been detected.
While Type I and Type III R-M systems only methylate
adenine, all three types of methylation have been reported
to be catalyzed by Type II MTases (17).
Helicobacter pylori genomes encode an unusually high
number of R-M systems (13,19–21). The first two H. pylori
strains whose genomes were sequenced are 26695 (19) and
J99 (21), and their markedly different complements of R-M
systems have been analyzed in some detail. The two strains
have been proposed to encode members of all four types
of R-M systems. While several studies have addressed the
activity of Type II MTases (22–24), only one Type III
MTase of H. pylori 26695 has been functionally
characterized so far (25). Apart from that, Type I and
Type III R-M systems of H. pylori were mostly
uncharacterized and their specificities unknown. An
overview of known and predicted R-M genes for
many H. pylori strains can be found in the REBASE
database [http://rebase.neb.com/rebase/rebase.html, (13)].
It has recently been reported that DNA methylation can
be reliably detected at single-base resolution by Single
Molecule Real-Time (SMRT) sequencing technology,
which enables the genome-wide detection of m6A, m4C
and m5C methylation (26,27). In this next-generation
sequencing technology, genome sequencing is achieved
by monitoring the action of an engineered phi29-based
DNA polymerase, which catalyzes the incorporation of
fluorescently labeled nucleotides. Besides the primary
sequence, (...truncated)