Comprehensive analysis of DNA polymerase III α subunits and their homologs in bacterial genomes
Published online 7 October 2013
Nucleic Acids Research, 2014, Vol. 42, No. 3 1393–1413
doi:10.1093/nar/gkt900
SURVEY AND SUMMARY
Kęstutis Timinskas, Monika Balvočiūtė, Albertas Timinskas and Česlovas Venclovas*
Institute of Biotechnology, Vilnius University, Graičiūno 8, Vilnius LT-02241, Lithuania
Received July 31, 2013; Revised September 12, 2013; Accepted September 13, 2013
*To whom correspondence should be addressed. Tel: +370 5 269 1881; Fax: +370 5 260 2116; Email:
Present address:
Monika Balvočiūtė, Institut für Mathematik und Informatik, Ernst Moritz Arndt Universität Greifswald, Germany.
The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors.
© The Author(s) 2013. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which
permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
ABSTRACT
The analysis of 2000 bacterial genomes revealed
that they all, without a single exception, encode one
or more DNA polymerase III α-subunit (PolIIIα)
homologs. Classified into the C-family of DNA polymerases, they come in two major forms, PolC and DnaE,
related by an ancient duplication. While PolC represents an evolutionarily compact group, DnaE can be
further subdivided into at least three groups (DnaE1–3). We performed an extensive analysis of various
sequence, structure and surface properties of all
four polymerase groups. Our analysis suggests a
specific evolutionary pathway leading to PolC and
DnaE from the last common ancestor and reveals
important differences between extant polymerase
groups. Among them, DnaE1 and PolC show the
highest conservation of the analyzed properties.
DnaE3 polymerases apparently represent an
‘impaired’ version of DnaE1. Nonessential DnaE2
polymerases, typical for oxygen-using bacteria
with large GC-rich genomes, have a number of
features in common with DnaE3 polymerases. The
analysis of polymerase distribution in genomes
revealed three major combinations: DnaE1 either
alone or accompanied by one or more DnaE2s,
PolC+DnaE3 and PolC+DnaE1. The first two
combinations are present in Escherichia coli and
Bacillus subtilis, respectively. The third one
(PolC+DnaE1), found in Clostridia, represents a
novel, so far experimentally uncharacterized, set.
INTRODUCTION
DNA polymerase III is a tripartite protein machine
responsible for replication of the bacterial genome (1–5).
It consists of a DNA polymerase, its processivity factor
(the β-clamp) and a clamp loader complex. The actual DNA
synthesis is performed by the polymerase III α-subunit
(PolIIIα), classified into the C-family of DNA polymerases (6). Surprisingly, bacterial PolIIIα subunits are both
structurally and evolutionarily distinct from eukaryotic and
archaeal replicative DNA polymerases (7,8), which belong to
the B-family. Instead, the PolIIIα catalytic domain is distantly related to the X-family of DNA polymerases (7,8),
exemplified by eukaryotic Polβ, a polymerase acting in
DNA excision repair (9,10). It should be noted that this
unexpected relationship could not be detected by protein
sequence comparison and only became apparent in the
context of 3D structures (7,8). Although polymerases of
the C and X families are not globally similar, a strong case for
their common evolutionary origin could be made based on
the observation that their corresponding ‘palm’ domains share a common fold and bind DNA in the same
manner (11). In contrast, the ‘palm’ domains of DNA polymerases belonging to the A, B and Y families have an entirely
different fold. Taken together, these findings lend additional support to the hypothesis that bacterial replicative
polymerases (C-family) on one hand and archaeal/eukaryotic replicative polymerases (B-family) on the other hand
have evolved as components of two independent DNA
replication systems (12). Another interesting observation
is that C-family polymerases are essentially confined to the
bacterial kingdom. Only a handful of PolIIIα homologs
have been detected in bacteriophages, which predominantly use B-family (and to a lesser extent A-family) DNA
polymerases (13,14). One of the explanations for the
scarcity of PolIIIα homologs even in bacteria-infecting
viruses is that the C-family is evolutionarily ‘young’
compared with the B-family (13). Owing to their relatively
late emergence, C-family DNA polymerases might have
failed to make a significant imprint in the B-family–
dominated viral landscape (13), and a few instances of
If the number of distinct PolIIIα subunits and their
roles in a bacterial cell are considered, there are also
notable differences. The widely studied E. coli encodes a
sole DnaE-type PolIIIα subunit, which performs DNA
synthesis of both leading and lagging strands (1,19,20).
However, this is not a universal situation in the bacterial
world. For example, low-GC Gram-positive bacteria were
found to have both PolC and DnaE (17). Experiments
with B. subtilis and some other Gram-positive bacteria
showed that both types of PolIIIα subunits are essential
(21–23). Initially, it was thought that PolC and DnaE are
leading and lagging strand polymerases, respectively (21).
However, more recently, in vitro experiments with the
reconstituted B. subtilis replisome (24) revealed a different
picture of their division of labor. It turned out that DnaE
makes an initial extension of the RNA primer on both
strands and then PolC takes over for rapid synthesis of
long stretches of DNA (24). In this regard, B. subtilis
DnaE is reminiscent of eukaryotic Pol α, which extends
the RNA primer and then makes way for a processive
replicase (25). Some bacteria have a second copy of
DnaE, usually referred to as DnaE2. So far, genetic
studies targeting dnaE2 have, without a single exception,
identified it as a nonessential gene (26–32), indicating
that DnaE2 is not required for chromosomal DNA replication. Instead, DnaE2 has been associated with DNA
damage-inducible error-prone translesion DNA synthesis
(TLS) (26–28,31,32). In genomes, dnaE2 is typically found
as part of a LexA-regulated contiguous or split multigene
cassette, which includes two other genes, imuA/imuA’ and
imuB (27,33,34). The two genes encode catalytically
Figure 1. Structural organization of DnaE and PolC forms of C-family DNA polymerases. Crystal structures of T. aquaticus DnaE (left, PDB ID:
3E0D) and G. kaustophilus PolC (right, PDB ID: 3F2B) complexes with DNA, displayed in the same orientation. Protein structures are shown as
solvent-accessible surfaces with different structural modules shown in different colors. The missing NTD and the exonuclease domain (Exo) in the PolC
structure are represented correspondingly by a pair of elli (...truncated)