Origin and evolution of organisms as deduced from 5S ribosomal RNA sequences.
Hiroshi Hori* and Syozo Osawa†
*Department of Genetics, GEN-KEN, Hiroshima University; and †Laboratory of Molecular Genetics, Department of Biology, Nagoya University
A phylogenetic tree of most of the major groups of organisms has been constructed from the 352 5S ribosomal RNA sequences now available. The tree suggests that there are several major groups of eubacteria that diverged during the early stages of their evolution. Metabacteria (=archaebacteria) and eukaryotes separated after the emergence of eubacteria. Among eukaryotes, red algae emerged first; and, later, thraustochytrids (a Proctista group), ascomycetes (yeast), green plants (green algae and land plants), “yellow algae” (brown algae, diatoms, and chrysophyte algae), basidiomycetes (mushrooms and rusts), slime- and water molds, various protozoans, and animals emerged, approximately in that order. Three major types of photosynthetic eukaryotes, i.e., red algae (=Chlorophyll a group), green plants (Chl. a+b group), and yellow algae (Chl. a+c), are remotely related to one another. Other photosynthetic unicellular protozoans, such as Cyanophora (Chl. a), Euglenophyta (Chl. a+b), Cryptophyta (Chl. a+c), and Dinophyta (Chl. a+c), seem to have separated shortly after the emergence of the yellow algae.
Introduction
At present, the evolutionary relationships of the major groups of organisms are quite obscure, and the present systems of classification are mainly based on physiological and morphological characters. Since the evolutionary changes of such characters are very complicated and the rate of change is variable in different groups of organisms or in different evolutionary periods, not much confidence can be given to the systems.
A more useful approach to this problem is to use DNA or RNA sequences, because the evolutionary change of these molecules is roughly proportional to evolutionary time. The 5S ribosomal RNA (5S rRNA) sequence is particularly useful for establishing the phylogenetic relationship of distantly related organisms (Kimura and Ohta 1973; Hori 1975) because of its low substitution rate (mean ± SE 0.18 ± 0.05 substitution/nucleotide site/10^9 years; Hori et al. [1977]) and because of its basic similarity of structure among all organisms, which makes it possible to align the sequences for the construction of a comprehensive phylogenetic tree.
The 5S rRNA phylogenetic trees for many groups of organisms or organelles have been reported, e.g., for eubacteria (Dekio et al. 1984; Vandenberghe et al. 1985), “the purple eubacterial group” (Lane et al. 1985), the eubacterial family Vibrionaceae (MacDonell and Colwell 1985; MacDonell et al. 1986), Mycoplasmas (Rogers et al. 1985), metabacteria (Fox et al. 1982; Hori et al. 1982), green plants (Hori et al. 1985a), Ascomycota (Chen et al. 1984), Basidiomycota (Walker and Doolittle 1982; Huysmans et al. 1983; Gottschalk and Blanz 1984; Walker 1984), protozoans (Kumazaki et al. 1983a), Meso- and Metazoa (Ohama et al. 1984), and organelles (Hori et al. 1982; Wolters and Erdmann 1984). However, a 5S rRNA tree for all groups of organisms has not been constructed. In the present paper, we have employed the 352 sequences of 5S rRNAs now available to construct a phylogenetic tree of a wide spectrum of extant organisms, including organelles, by means of a simplified unweighted-pair-group (UPG) method.
Key words: 5S rRNA, simplified UPGMA trees, molecular phylogeny.
Address for correspondence and reprints: Dr. Hiroshi Hori, Department of Genetics, GEN-KEN, Hiroshima University, Kasumi, Hiroshima, 734, Japan.
Mol. Biol. Evol. 4(5):445-472. 1987.
© 1987 by The University of Chicago. All rights reserved.
0737-4038/87/0405-0001$02.00
Material and Methods
Sequence Alignment of 5S rRNA
The 352 5S rRNA sequences from various organisms available as of January 1986 have been used in the present study. Representative organisms examined herein are taxonomically summarized in table 1. The alignment of these sequences was obtained mainly by juxtaposing the 5S rRNA secondary structures as described elsewhere (Hori et al. 1985b).
Construction of Phylogenetic Trees
The evolutionary distance, Knuc, between two sequences was calculated by means of the equation described by Kimura (1980). Knuc estimates the number of base substitutions per nucleotide site that have occurred since the separation of the two sequences.
$$K_{nuc} = -\frac{1}{2}\,\log_e\!\left[(1 - 2P - Q)\,(1 - 2Q)^{1/2}\right] \qquad (1)$$
where P and Q are the fractions of nucleotide sites between two sequences showing transition- and transversion-type differences, respectively. The SE of Knuc, SEK, was calculated by using Kimura’s (1980) equation. When a gap of length one was paired with one nucleotide, it was counted as one transversion-type substitution. Large deletions in 5S rRNA sequences, e.g., those found in the sequences of Mycoplasma species, are likely to be due to single rare events rather than to the compound effect of several separate events. Therefore, a gap of two or more nucleotides was counted as two differences in determining Q.
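The distance calculation described above can be sketched in Python as follows. This is a minimal illustration, not the authors' program: the function names, the use of "-" as the gap symbol, and the choice to let a gap run add the same number of sites as it adds differences are all assumptions made here. Ambiguity codes are not handled.

```python
import math

# Purine-purine (A/G) and pyrimidine-pyrimidine (C/U) pairs are transitions.
TRANSITION_PAIRS = {frozenset("AG"), frozenset("CU")}


def gap_run_length(a, b, start):
    """Length of the run of columns, from `start`, gapped in the same one sequence."""
    gapped, other = (a, b) if a[start] == "-" else (b, a)
    end = start
    while end < len(a) and gapped[end] == "-" and other[end] != "-":
        end += 1
    return end - start


def count_differences(seq_a, seq_b):
    """Return (transitions, transversions, sites) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    a = seq_a.upper().replace("T", "U")
    b = seq_b.upper().replace("T", "U")
    transitions = transversions = sites = 0
    i = 0
    while i < len(a):
        x, y = a[i], b[i]
        if x == "-" and y == "-":
            i += 1  # column gapped in both sequences: ignored
        elif x == "-" or y == "-":
            run = gap_run_length(a, b, i)
            weight = min(run, 2)  # gap of 1 -> one difference; 2 or more -> two
            transversions += weight
            sites += weight  # assumption: sites are counted like the differences
            i += run
        else:
            if x != y:
                if frozenset((x, y)) in TRANSITION_PAIRS:
                    transitions += 1
                else:
                    transversions += 1
            sites += 1
            i += 1
    return transitions, transversions, sites


def kimura_knuc(seq_a, seq_b):
    """Knuc from equation (1): -(1/2) ln[(1 - 2P - Q) (1 - 2Q)^(1/2)]."""
    ts, tv, sites = count_differences(seq_a, seq_b)
    if sites == 0:
        raise ValueError("no comparable sites")
    p, q = ts / sites, tv / sites
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for equation (1)")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For example, `kimura_knuc("AAAAAAAAAA", "GAAAAAAAAC")` uses P = 0.1 (one A/G transition) and Q = 0.1 (one A/C transversion) over ten sites.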
The G+C content of genomic DNA in eubacteria varies considerably, ranging from 25% to 75%. Since the G+C content of 5S rRNA more or less reflects the genomic G+C content in eubacteria, we introduced a parameter to cancel the effect that this variation might have on the rate of nucleotide substitution in 5S rRNA molecules. (In eukaryotes and metabacteria, the genomic G+C content does not correlate significantly with the G+C content of 5S rRNA.) To estimate the evolutionary distance between sequences i and j, the following equation was adopted from Hori and Osawa (1986).
$$D_{nuc} = (c_i / c_j)\,K_{nuc} \qquad (2)$$
where Knuc is the value from equation (1) and ci and cj (ci ≤ cj) are the G+C contents of sequences i and j, respectively.
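As a small worked sketch of equation (2), the correction can be written as below. It assumes the ratio is taken as the smaller G+C content over the larger, so that the factor never exceeds 1; the function names `gc_content` and `dnuc` are illustrative and do not come from the paper.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    bases = [ch for ch in seq.upper() if ch in "ACGUT"]
    if not bases:
        raise ValueError("sequence contains no recognizable bases")
    return sum(ch in "GC" for ch in bases) / len(bases)


def dnuc(knuc, gc_i, gc_j):
    """Scale Knuc by the ratio of the smaller to the larger G+C content."""
    low, high = sorted((gc_i, gc_j))
    if low <= 0:
        raise ValueError("G+C contents must be positive")
    return (low / high) * knuc
```

With G+C contents of 0.25 and 0.50 and Knuc = 0.2, this sketch gives Dnuc = 0.1; two sequences with equal G+C content leave Knuc unchanged.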
With use of the Knuc or Dnuc values, a phylogenetic tree was constructed by means of a “simplified” version of the UPG method using arithmetic averages (Sneath and Sokal 1973). For the estimation of the SE of each branching point in the tree, the variance of each branching point was calculated by means of the equation described by Nei et al. (1985). This is given by
where dkl is the inter-cluster distance between the kth species in cluster A and the lth species in cluster B, and r and s are the numbers of species in clusters A and B, respectively; V and Cov are the variance and covariance, respectively. In the actual computation, however, to avoid excessive computational time owing to the large number of 5S rRNA sequences (352 in this case), (rs)^2 was conventionally kept < 16 by using representative sequences in each cluster, and these were used for tree construction by means of the UPG method (=“simplified” UPG method).
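The clustering step can be illustrated with a plain unweighted pair-group method using arithmetic averages. This sketch deliberately does not implement the “simplified” variant, because the text does not say how the representative sequences were chosen; the function names and the dictionary-of-`frozenset` distance format are assumptions made for the example.

```python
from itertools import combinations


def average_distance(members_a, members_b, dist):
    """Arithmetic mean of all pairwise distances between two clusters."""
    total = sum(dist[frozenset((x, y))] for x in members_a for y in members_b)
    return total / (len(members_a) * len(members_b))


def upgma(names, dist):
    """Return (tree, merges) for labels `names` and pairwise distances `dist`.

    `dist` maps frozenset({a, b}) to a distance such as Knuc or Dnuc.
    `tree` is a nested tuple; `merges` lists (height, members_a, members_b),
    where height is half the average inter-cluster distance.
    """
    if not names:
        raise ValueError("at least one sequence is required")
    # Each cluster is (subtree, tuple of member labels).
    clusters = [(name, (name,)) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = average_distance(clusters[i][1], clusters[j][1], dist)
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        (tree_i, mem_i), (tree_j, mem_j) = clusters[i], clusters[j]
        merges.append((d / 2, mem_i, mem_j))
        merged = ((tree_i, tree_j), mem_i + mem_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0], merges
```

Because every pair of members is averaged at each step, the cost grows quickly with cluster size, which is the computational burden that the restriction to representative sequences is meant to reduce.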
Results and Discussion
Validity of Phylogenetic Trees Deduced from 5S rRNA Sequences
As mentio (...truncated)