Evolutionary relationships among group II intron-encoded proteins and identification of a conserved domain that may be related to maturase function
Nucleic Acids Research, 1993, Vol. 21, No. 22
4991-4997
Georg Mohr, Philip S. Perlman1 and Alan M. Lambowitz*
Departments of Molecular Genetics, Biochemistry, and Medical Biochemistry, and the Biotechnology
Center, The Ohio State University, 484 West 12th Avenue, Columbus, OH 43210 and 1 Department
of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX 75235-9038, USA
Received July 16, 1993; Revised and Accepted August 23, 1993
ABSTRACT
Many group II introns encode reverse transcriptase-like
proteins that potentially function in intron mobility and
RNA splicing. We compared 34 intron-encoded open
reading frames and four related open reading frames
that are not encoded in introns. Many of these open
reading frames have a reverse transcriptase-like
domain, followed by an additional conserved domain
X, and a Zn2+-finger-like region. Some open reading
frames have lost conserved sequence blocks or key
amino acids characteristic of functional reverse
transcriptases, and some lack the Zn2+-finger-like
region. The open reading frames encoded by the
chloroplast tRNALys genes and the related Epifagus
virginiana matK open reading frame lack a
Zn2+-finger-like region and have only remnants of a
reverse transcriptase-like domain, but retain a readily
identifiable domain X. Several findings lead us to
speculate that domain X may function in binding of the
intron RNA during reverse transcription and RNA
splicing. Overall, our findings are consistent with the
hypothesis that all of the known group II intron open
reading frames evolved from an ancestral open reading
frame, which contained reverse transcriptase, X, and
Zn2+-finger-like domains, and that the reverse
transcriptase and Zn2+-finger-like domains were lost
in some cases. The retention of domain X in most
proteins may reflect an essential function in RNA
splicing, which is independent of the reverse
transcriptase activity of these proteins.
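The domain-loss hypothesis summarized above can be pictured with a minimal sketch, given here only as an illustration. It assumes each open reading frame is described by the set of domains that remain readily identifiable. The names `ANCESTRAL`, `Orf`, and `lost_domains`, and the two example entries, are hypothetical and are not taken from the paper's data.

```python
from dataclasses import dataclass

# Domain order in the hypothesized ancestral ORF, as stated in the abstract.
ANCESTRAL = ("RT", "X", "Zn")


@dataclass(frozen=True)
class Orf:
    """An ORF described by the domains that are still readily identifiable."""
    name: str
    domains: frozenset


def lost_domains(orf):
    """Return the ancestral domains absent from this ORF, in ancestral order."""
    return [d for d in ANCESTRAL if d not in orf.domains]


# Illustrative entries only (assumptions, not rows from the paper's tables).
# A domain surviving only as remnants is treated here as no longer identifiable.
full_length = Orf("ORF with all three domains", frozenset({"RT", "X", "Zn"}))
trnk_like = Orf("chloroplast trnK-type ORF", frozenset({"X"}))

print(lost_domains(full_length))  # []
print(lost_domains(trnk_like))    # ['RT', 'Zn']
```

Treating a domain as simply present or absent is a simplification: the abstract describes intermediate cases in which only some conserved sequence blocks or key residues have been lost.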
INTRODUCTION
Group II introns are of interest both because of their RNA-catalyzed splicing mechanism, which resembles that of nuclear
pre-mRNA introns, and because they behave as mobile elements
(1,2). Group II introns have been found in fungal and plant
mitochondria and in chloroplasts (1), and recently, in the
proteobacterium Azotobacter vinelandii and the cyanobacterium
Calothrix, which are related to the probable ancestors of
mitochondria and chloroplasts, respectively (3). All group II
introns have a conserved secondary structure, which consists of
six double helical domains radiating from a central wheel, with
the two different structural classes, IIA and IIB, distinguished
by specific features (1). The conserved RNA structure catalyzes
splicing via formation of a lariat intermediate similar to that
formed during the splicing of nuclear pre-mRNA introns (4-6).
Although some group II introns self-splice in vitro, they require
proteins for efficient splicing in vivo, presumably to help fold
the intron RNA into the catalytically-active structure. Some of
these proteins are encoded by chromosomal genes, whereas
others, 'maturases', are encoded by the introns themselves (7).
The mobility of group II introns has been inferred from their
location and distribution in different genes (8) and from the
existence of 'twintrons' in which one group II intron has
integrated into either another group II intron or into a degenerate
type of group II intron, referred to as a group III intron (9,10).
The first direct evidence for group II intron mobility came from
studies showing that two Saccharomyces cerevisiae (yeast) group
II introns (coxl intron 1 and coxl intron 2) insert efficiently during
crosses into coxl alleles lacking these introns (11,12). Similar
findings have also been reported for Kluyveromyces lactis coxl
intron 1, a cognate of yeast coxl intron 2 (13). All three of these
mobile group II introns contain a long open reading frame (ORF),
which encodes a reverse transcriptase (RT)-like protein. The
proteins encoded by the yeast coxl introns 1 and 2 have been
shown to be bifunctional: they have an RT activity that may play
a role in intron mobility (14), and they also function as maturases
in splicing the intron in which they are encoded (15-17).
In addition to yeast coxl introns 1 and 2, a number of other
group II introns contain ORFs that may encode proteins that
function in RNA splicing or intron mobility (8). Some of these
ORFs are located in the loop of intron domain IV, whereas others
are in-frame with the upstream exon with the bulk of the ORF
in the loop of domain IV (1). The yeast coxl intron 2 protein,
shown schematically in Fig. 1, as well as most other group II
intron ORFs, has a readily identifiable RT domain. This domain
generally includes matches for the seven conserved sequence
blocks characteristic of RTs (denoted I to VII; refs. 18-20),
although several of the ORFs lack some of the conserved regions
or key amino acids characteristic of functional RTs (see below).
Phylogenetic comparisons indicate that the RT domains of group
II intron-encoded proteins belong to a class characteristic of the
LINE 1-like or non-long-terminal repeat (non-LTR) family of
retroelements. Within this class, the group II intron-encoded
proteins comprise a separate subgroup, which is most closely
related to the RTs of the Neurospora Mauriceville mitochondrial
plasmid and bacterial retrons (19-21).
Figure 1. Comparison of protein domains in the HIV-1 pol gene and ORFs of group II introns S. cerevisiae (yeast) coxl intron 2, M. polymorpha coxl intron 2
and N. tabacum trnK intron 1. Protein domains are indicated by marked areas. [Pro, protease; (Pro), possible protease domain of intron ORFs; Z, domain Z; Pol
and RT, reverse transcriptase domain; X, domain X; Zn, Zn2+-finger-like region; Int, integrase]. Conserved sequence blocks I to VII characteristic of RTs are
indicated below the RT domains (18,20); parentheses indicate weak, but recognizable matches for the RT sequence blocks (see Fig. 2). Demarcated regions of the
HIV-1 protein indicate structural domains (P, palm; F, fingers; T, thumb; C, connection) identified in the X-ray crystallographic structure of the protein (32). Vertical
arrows indicate positions of yeast coxl intron 2 mutations, which are associated with the maturase defect in mutant C1082 (Ser648 and Asp675; refs. 14,17). Citations
for sequence data are given in Table 1.
* To whom correspondence should be addressed at: Department of Molecular Genetics, The Ohio State University, 484 W. Twelfth Avenue, Columbus,
OH 43210, USA
In addition to the RT-like domain, the group II intron ORFs
also contain a conserved, upstream domain, Z, (...truncated)