Long-Term Conservation of Six Duplicated Structural Genes in Cephalopod Mitochondrial Genomes
Laboratory for Cellular Biochemistry, Department of Molecular Biology, School of Life Science, Tokyo University of Pharmacy and Life Science
The complete nucleotide sequences of the mitochondrial (mt) genomes of three cephalopods, Octopus vulgaris (Octopodiformes, Octopoda, Incirrata), Todarodes pacificus (Decapodiformes, Oegopsida, Ommastrephidae), and Watasenia scintillans (Decapodiformes, Oegopsida, Enoploteuthidae), were determined. These three mt genomes encode the standard set of metazoan mt genes. However, W. scintillans and T. pacificus mt genomes share duplications of the longest noncoding region, three cytochrome oxidase subunit genes and two ATP synthase subunit genes, and the tRNAAsp gene. Southern hybridization analysis of the W. scintillans mt genome shows that this single genome carries both duplicated regions. The near-identical sequence of the duplicates suggests that there are certain concerted evolutionary mechanisms, at least in cephalopod mitochondria. Molecular phylogenetic analyses of mt protein genes are suggestive, although not statistically significantly so, of a monophyletic relationship between W. scintillans and T. pacificus.
The types and number of genes encoded by metazoan
mitochondrial (mt) genomes are well conserved among
various metazoan species for which these data are
currently available (see Boore ). Most metazoan
mt genomes are circular, carrying single copies of 12
protein (cox1 to 3 [cytochrome oxidase subunits I to III],
nad1 to 6 and 4L [NADH dehydrogenase subunits 1 to 6
and 4L], atp6 and 8 [ATP synthase subunits 6 and 8], and
cob [apocytochrome b]), 2 rRNA (rrnL and rrnS [large and
small subunit ribosomal RNAs]), and 22 tRNA genes
(trnA, etc.). Few or no intergenic nucleotides are found,
with the exception of long noncoding regions (NCR) that
contain control elements for replication and transcription.
However, gene organization varies among metazoan
species. For example, most molluscan mt genomes reported
so far have different gene organizations, and there are large
differences in mt gene organization within each class
(Hoffmann, Boore, and Brown 1992; Boore and Brown
1994; Hatzoglou, Rodakis, and Lecanidou 1995; Terrett,
Miles, and Thomas 1996; Yamazaki et al. 1997;
Kurabayashi and Ueshima 2000; Tomita et al. 2002; Grande et al.
2002; Wilding, Mill, and Grahame 1999). In pulmonate
land snails, variation in gene organization is found at the
level of the superfamily (Yamazaki et al. 1997).
Multiplication of NCRs has been observed in various
metazoan mt genomes (e.g., Kumazawa et al. 1998). In
some cases, the duplication of coding regions has also been
reported. For example, the oyster Crassostrea gigas mt
genome (GenBank/EMBL/DDBJ accession number
AF177226) carries two copies of rrnS. The nematode
Romanomermis culicivorax has also been reported to carry
multiple copies of protein genes (Azevedo and Hyman
1993; Hyman and Azevedo 1996; Hyman, Beck, and Weiss
1988). In these cases, a high sequence similarity between
duplicated genes has been observed. On the other hand,
gene duplication in which one of the duplicates
becomes a pseudogene has also been reported. For
example, a partial duplication has been reported in the
gecko Heteronotia binoei mt genome, and one of the
duplicates appears to be a pseudogene (Zevering et al.
1991). Sasuga et al. (1999) identified a pseudogene of trnH
between nad4 and nad5 in the Loligo bleekeri mt genome,
the position at which trnH is located in the Katharina
tunicata mt genome (Boore and Brown 1994); trnH itself was
found at a different position in the L. bleekeri mt genome
(Sasuga et al. 1999; Tomita et al. 2002). It has
thus been proposed that the metazoan mt genome is under
strong selective pressure for genome minimization.
Recently, Tomita et al. (2002) reported the first
complete cephalopod mt nucleotide sequence for the L.
bleekeri mt genome. The gene content of the L. bleekeri
mt genome is the same as that of the typical metazoan mt
genome. However, the arrangement of genes within the mt
genome is different from that of any other metazoan
reported to date. One of the most notable characteristics of
the L. bleekeri mt genome is that it contains three
near-identical, 500-bp NCRs. These NCRs are not placed on the
genome in tandem; instead, their placement seems to be
closely related to the gene rearrangement that has taken
place in the L. bleekeri mt genome (Tomita et al. 2002). In
addition, concerted evolution could be considered as the
mechanism underlying the maintenance of high similarity
within these NCR sequences.
To further our understanding of the evolution of mt
genome structures in cephalopods, we determined the
complete mt genome nucleotide sequences of three cephalopods,
Octopus vulgaris (Coleoidea, Neocoleoidea,
Octopodiformes, Octopoda, Incirrata, Octopodidae), Todarodes
pacificus (Coleoidea, Neocoleoidea, Decapodiformes,
Oegopsida, Ommastrephidae), and Watasenia scintillans
(Coleoidea, Neocoleoidea, Decapodiformes, Oegopsida,
Enoploteuthidae). The genome structure of the O. vulgaris
mt genome is rather similar to that of the K. tunicata mt
genome. However, the T. pacificus and W. scintillans
mt genomes contain long and complicated duplications
of long NCRs and six structural genes. We conclude our
report with a discussion of the evolution of cephalopod mt
PCR Primer Sequences for Analysis of W. scintillans mt Genomes
PCR primers for amplification of cox1, cox3, and cob partial sequences
Moll/cox1/5/1 5′-ATAATTGGWGGWTTTGGWAAYTG-3′ (fragment cox1)
Moll/cox1/3/1 5′-CCAAAAAATCAAAAWAGRTGYTG-3′ (fragment cox1)
Moll/cox1/5/2 5′-ACWGGWTGAACWGTWTAYCC-3′ (fragment cox1)
Moll/cox1/3/2 5′-ATCWCCWCCWCCWGCWGGRTCRAA-3′ (fragment cox1)
Moll/cox3/5/1 5′-ACWATAGTWCAATGATGACGNGA-3′ (fragment cox3)
Moll/cox3/3/1 5′-ACWACWACATCWACAAARTGYCA-3′ (fragment cox3)
Moll/cox3/5/2 5′-GTWTGTTTTTTTTTTGCWTTTTTYTG-3′ (fragment cox3)
Moll/cox3/3/2 5′-ACATCWACAAAATGTCARTAYCA-3′ (fragment cox3)
Moll/cob/5/1 5′-CAAATAWSWTTTTGAGGWGCNAC-3′ (fragment cob)
Moll/cob/3/1 5′-ATAWGCAAAWAGAAAATAYCAYTC-3′ (fragment cob)
Moll/cob/5/2 5′-TGAGGWGCWACWGTWATTCANAA-3′ (fragment cob)
Moll/cob/3/2 5′-TTCWGGTTGAATATGWGTNGGNGT-3′ (fragment cob)
Long PCR primers for amplification of W. scintillans mt genome
Wsc/cox1/L/5/1 5′-AATTGTTGTAATAAAGTTAATAGCTCCC-3′ (fragments AL and AS)
Wsc/cox1/L/5/2 5′-AGAAGGTCCAGCATGAGATAAGTTAC-3′ (fragments AL and AS)
Wsc/cox1/L/3/1 5′-CCTTTCTTTACCTGTACTAGCAGGA-3′ (fragments B and D)
Wsc/cox1/L/3/2 5′-TATTACTATATTATTAACAGACCGTAAC-3′ (fragments B and D)
Wsc/cox3/L/5/1 5′-TGCTAGTAAGATAGCAGTGTTTAGTAA-3′ (fragments C and D)
Wsc/cox3/L/5/2 5′-CTGGATGTAGATAGGAGGTCAGCA-3′ (fragments C and D)
Wsc/cox3/L/3/1 5′-CGGACTTCATGTTATCATTGGCTCT-3′ (fragments AL and AS)
Wsc/cox3/L/3/2 5′-TTTCTCCTAACTTGTTTACTCCGAATT-3′ (fragments AL and AS)
Wsc/cob/L/5/1 5′-TATAGCCGCCAAAATAAAAGGAAGTAA-3′ (fragment C)
Wsc/cob/L/5/2 5′-TCGTCTCAACGTAGCATTGTCTAC-3′ (fragment C)
Wsc/cob/L/3/1 5′-TTATTATTATTGATTGAGATTAGGATATTG-3′ (fragment B)
Wsc/cob/L/3/2 5′-TTTCTTAATGTATTAGGTGATTCTGAG-3′ (fragment B)
PCR primers for confirmation of relative position of fragments AL and AS of W. scintillans mt genome
PCR primers for confirmation of gene organization and primary sequence of W. scintillans mt genome
Wsc/rrnS/F/1 (as above) (fragment 1)
Wsc/rrnS/F/2 (as above) (fragment 1)
Wsc/nad3/F/1 (as above) (fragment 1)
Wsc/nad3/F/2 (as above) (fragment 1)
Wsc/atp6/F/1 5′-CTTCCTTTAGGTACGCCTAGTTT-3′ (fragment 2)
Wsc/atp6/F/2 5′-CAGCTAACATTAGAGCAGGGCA-3′ (fragment 2)
Wsc/nad2/R/1 5′-CTAGTTTTTGCCAGGTTAGTAGC-3′ (fragment 2)
Wsc/nad2/R/2 5′-TCAGGATAGGGTGGTTGAGTATA-3′ (fragment 2)
Wsc/nad2/F/1 5′-TATACTCAACCACCCTATCCTGA-3′ (fragment 3)
Wsc/nad5/R/1 5′-TAGAGGAAGAAGTACTGAATTCGAT-3′ (fragment 3)
Wsc/nad5/R/2 5′-TATTTCGGGGTGTGTGATGGTG-3′ (fragment 3)
Wsc/rrnL/F/1 5′-TGAAGCTTATCCCTCATACGATTA-3′ (fragment 4)
Wsc/rrnL/F/2 5′-CTATACAACGTTAACGCACATCTT-3′ (fragment 4)
Wsc/cox1/R/1 5′-CTTTCAACAGCTGAAGAAGCTAAT-3′ (fragment 4)
Wsc/cox1/R/2 5′-TAAATCCATGGGCAGTGACCAC-3′ (fragment 4)
Wsc/nad5/F 5′-CTAAACCTAAACCATCTCACCC-3′ (fragment 5)
Wsc/nad4/R 5′-TAAGGGTAGATGTGTGAGGCTT-3′ (fragment 5)
Wsc/nad4/F 5′-TATAACCCAGCTTGTAGCCGCT-3′ (fragment 6)
Wsc/cob/R 5′-TGTGGGTGAGTACTTCGTTATG-3′ (fragment 6)
Wsc/cob/F 5′-TGGCCTCAAGGTAAAACATAACCTA-3′ (fragment 7)
Wsc/nad1/R 5′-TATTAGGTTCGGTACCGTGCTGT-3′ (fragment 7)
Wsc/nad1/F 5′-CTCGGTGAGTTTCAGCTACACA-3′ (fragment 8)
Wsc/rrnL/R 5′-TGTTGCTTGCGGTACTGTAAAGG-3′ (fragment 8)
Wsc/nad2/F/2 5′-GTACTTTGCTCACCCTCTCCTC-3′ (fragment 9)
Wsc/nad2/R/3 5′-GAAGTAGATTAGACTAATGGATCTA-3′ (fragment 9)
PCR primers for synthesizing gene-specific probes for Southern hybridization
Wsc_cox3_pF 5′-CATACAATAAAAGTCTCCTTAGGTATGC-3′ (probe for cox3)
Wsc_cox3_pR 5′-TGTGAAGTATATACCTAGGATAATGGTGAG-3′ (probe for cox3)
Wsc_nad3_pF 5′-TAGTAAACTTCTTACTATTACTTATTGGCTT-3′ (probe for nad3)
Wsc_nad3_pR 5′-TGATTTCACTCGTGAAAAAGACC-3′ (probe for nad3)
Wsc_cox1_pF 5′-TGGACATCCAGAAGTGTATATTTT-3′ (probe for cox1)
5′-TATTGGTGTGTTGTATTTAATAGGAGA-3′ (probe for cox1)
5′-TGGTAACCCTTACGTTGTCAAGT-3′ (probe for rrnS)
5′-TTGAAAATAAGGTATGAAAAACTAGGA-3′ (probe for rrnS)
5′-TACTTCCATACTAATTATTCTTTCCTTAAC-3′ (probe for nad2)
5′-TAATATAGTTAAAGAAGTAGATTAGACTAATGGA-3′ (probe for nad2)
5′-TACTAAACCATATTACACTCAAAACACG-3′ (probe for cob)
5′-CACTCCTCTACATATTAAACCTGAGTG-3′ (probe for cob)
NOTE. N = A, G, C, or T; R = A or G; Y = T or C; W = A or T; S = G or C. Fragments amplified using the primers (shown
in figure 1) are indicated in parentheses.
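The degenerate codes in the note above determine how many distinct oligonucleotides each primer represents. The following is a minimal sketch, not part of the original methods, that counts and enumerates those variants for one primer from table 1; the function names `degeneracy` and `expand` are illustrative.

```python
from itertools import product

# Degenerate codes defined in the table note (N, R, Y, W, S) plus the plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "TC", "W": "AT", "S": "GC", "N": "AGCT",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n

def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence represented by a degenerate primer."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer))]

# Moll/cox1/5/1 from table 1: three W positions and one Y position.
moll_cox1_5_1 = "ATAATTGGWGGWTTTGGWAAYTG"
variants = expand(moll_cox1_5_1)
print(degeneracy(moll_cox1_5_1), len(variants))
```

Primers containing N positions expand fourfold per N, so enumeration is practical only for primers with modest degeneracy such as those listed here.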
Materials and Methods
O. vulgaris, T. pacificus, and W. scintillans were bought at the Tsukiji Fishery Market, Tokyo, Japan. All individuals were caught in the seas around Japan. The identification of T. pacificus and W. scintillans was carried out by Dr. K. Tsuchiya at the Tokyo University of Marine Science and Technology. The identification of O. vulgaris was confirmed by comparison of partial cox1 and cox3 sequences with published octopod sequences, including that of O. vulgaris (Bonnaud, Boucher-Rodoni, and Monnerot 1997; Carlini, Young, and Vecchione 2001).
DNA Isolation, PCR, Cloning, and Sequencing
The DNA sequence determination strategies for W. scintillans, T. pacificus, and O. vulgaris mt genomes were the same as those used for the Ciona savignyi mt genome (Yokobori, Watanabe, and Oshima 2003). As an example, a brief description of the methods used to determine the sequence of the W. scintillans mt genome follows. First, parts of cox1, cox3, and cob were amplified by nested PCR, using total DNA as the template with EX-Taq DNA polymerase (TAKARA). The primers used for the amplification of partial sequences of cox1, cox3, and cob are listed in table 1. Amplified fragments were cloned with the TOPO TA cloning kit (Invitrogen) and sequenced using a PRISM 3100 DNA autosequencer (Applied Biosystems). For sequence reactions, BigDye Terminator version 3.1 (Applied Biosystems) was used. Using the sequence information, the remaining parts of the mt genomes were amplified by nested long PCR (the primers used are listed in table 1) with LA Taq DNA polymerase (TAKARA). Among all combinations of PCR primers, four combinations, (A) cox3-3′ and cox1-5′, (B) cox1-3′ and cob-3′, (C) cob-5′ and cox3-5′, and (D) cox1-3′ and cox3-5′, gave amplified fragments. The lengths of the PCR fragments are approximately 1 kbp and 2 kbp for (A) (fragments AS and AL, respectively), approximately 6 kbp for (B) (fragment B), and approximately 4.5 kbp for (C) and (D) (fragments C and D). The amplified fragments were cloned and sequenced. The two fragments amplified with the primers specific for cox3-3′ and cox1-5′ have the cox3 gene at one end and the cox1 gene at the other end. However, their gene organizations are cox3(3′)-trnA-trnN-trnI-nad3-cox1(5′) for fragment AS and cox3(3′)-trnK-trnR-trnS(gcu)-nad2-cox1(5′) for fragment AL. The orders in fragments B, C, and D are cox1(3′)-cox2-trnD-atp8-atp6-nad5-trnH-nad4-nad4L-trnT-trnS(uga)-cob(3′), cob(5′)-nad6-trnP-nad1-cox3(5′), and cox1(3′)-cox2-trnD-atp8-atp6-trnF-trnV-rrnS-trnM-trnC-trnQ-NCR-cox3(5′), respectively. All underlined genes are encoded by the opposite strand.
Across the five fragments, we identified all known genes, but we also found duplicated long NCRs and six duplicated structural genes. Fragments B and C seem to be neighbors at the ends of the cob gene because there is no additional cob gene in fragments AS, AL, or D. Therefore, the order of these five fragments might be B-C-AS-D-AL-(B) or B-C-AL-D-AS-(B). The gene arrangement of the duplicated regions was confirmed by additional PCR analyses. The PCR primers specific for rrnS (on fragment D) and nad3 (on fragment AS), listed in table 1, were synthesized. Fragments were amplified by PCR with the nad3-3′ primer and the rrnS-3′ primer, but no other combination of PCR primers gave amplified fragments. Therefore, the order of the fragments is thought to be B-C-AS-D-AL-(B) (fig. 1). Furthermore, PCR products (0.5 to 5 kbp) covering the entire W. scintillans mtDNA were amplified (fragments 1 to 9), cloned, and sequenced (fig. 1). The PCR primers were designed from the first version of the complete nucleotide sequence of W. scintillans mtDNA; their sequences are listed in table 1. For the amplification of long fragments (longer than 3 kbp; fragments 1 to 4), fragments were subjected to nested long PCR. The PCR primers used for the nested PCR are listed in table 1. PCR conditions were as described above. Amplified PCR products (fragments 1 to 8) were cloned as described above. Both strands of these fragments were determined by primer-walking. Fragment 9, once amplified, was purified with the QIAquick PCR purification kit (QIAgen) according to the manufacturer's protocol. The resulting purified PCR product was then directly sequenced. Similar sequencing strategies were applied for the remaining T. pacificus and O. vulgaris mt genomes (details are not shown).
Identification of protein and rRNA genes was carried out by comparing them with counterparts from L. bleekeri (Tomita et al. 2002) and K. tunicata (Boore and Brown 1994) mt genomes. Identification of tRNA genes was carried out manually, by making visual searches of the cloverleaf structures. The complete nucleotide sequences of W. scintillans, O. vulgaris, and T. pacificus mt genomes were entered into the DDBJ/EMBL/GenBank DNA databases under the accession numbers AB086202, AB158363, and AB158364, respectively.
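The reasoning that narrows the five long-PCR fragments to two candidate circular orders can be reproduced mechanically. The sketch below is an illustration under a simplifying assumption: each fragment is reduced to the gene at its start and the gene at its end, and two fragments may be neighbors only when the end gene of one equals the start gene of the next. The dictionary and function names are hypothetical.

```python
from itertools import permutations

# (gene at the start, gene at the end) of each long-PCR fragment, as described in the text.
FRAGMENTS = {
    "AS": ("cox3", "cox1"),
    "AL": ("cox3", "cox1"),
    "B": ("cox1", "cob"),
    "C": ("cob", "cox3"),
    "D": ("cox1", "cox3"),
}

def circular_orders(fragments, first="B"):
    """Return all circular arrangements in which adjacent fragment ends share a gene.

    The circle is anchored at `first` so that rotations are not counted twice.
    """
    others = [name for name in fragments if name != first]
    orders = []
    for perm in permutations(others):
        order = (first, *perm)
        consistent = all(
            fragments[order[i]][1] == fragments[order[(i + 1) % len(order)]][0]
            for i in range(len(order))
        )
        if consistent:
            orders.append(order)
    return sorted(orders)

orders = circular_orders(FRAGMENTS)
for order in orders:
    print("-".join(order))
```

Because cox1 and cox3 are both duplicated, end-gene matching alone cannot distinguish the two remaining arrangements, which is why an additional PCR with gene-specific primers was needed to choose between them.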
Southern Analysis of the W. scintillans mt Genome
Total DNA of W. scintillans was prepared from either
the liver or eggs of a single animal by QIAgen Genometip
(QIAgen). W. scintillans total DNA was digested with Apa
I, Bgl II, Nco I, Pst I, Sal I, and Xho I. Six genes, cox1,
cox3, cob, nad2, nad3, and rrnS, were the targets of
Southern hybridization. Digoxigenin (DIG)-labeled probes
(ca. 300 bp each) were synthesized with a
DIG-PCR probing kit (Roche Diagnostics). Sequences of PCR
primers are as listed in table 1. Nondigested and digested
W. scintillans total DNA, 0.5 μg each, was used for 0.6%
agarose gel electrophoresis in TAE, followed by alkali
transfer of the DNA to nylon membrane Hybond N+
(Amersham Biosciences). Southern hybridization was
performed with UltraHyb (Roche Diagnostics) as
described by the manufacturer. A DIG Nucleic Acid
Detection kit (Roche Diagnostics) was used for detection
as described by the manufacturer.
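The logic behind a Southern analysis of a circular genome can be sketched in a few lines: an enzyme that cuts once linearizes the circle into a single full-length band, whereas an enzyme that cuts between two copies of a duplicated gene places the copies on different fragments and yields two bands. In the sketch below, only the genome length (20,091 bp, the W. scintillans value reported in this paper) comes from the text; the cut sites and probe positions are invented for illustration, and each probe target is simplified to a single coordinate.

```python
def digest_circular(genome_length, cut_sites):
    """Return (start, end) pairs for the fragments of a circular genome cut at the given sites."""
    sites = sorted(set(cut_sites))
    return [(start, sites[(i + 1) % len(sites)]) for i, start in enumerate(sites)]

def fragment_length(fragment, genome_length):
    """Length of a fragment on the circle; a single cut gives one full-length fragment."""
    start, end = fragment
    return (end - start) % genome_length or genome_length

def bands_for_probe(genome_length, cut_sites, probe_positions):
    """Lengths of the distinct fragments that contain at least one probe target."""
    hits = set()
    for frag in digest_circular(genome_length, cut_sites):
        span = fragment_length(frag, genome_length)
        for pos in probe_positions:
            # Distance from the fragment start, measured around the circle.
            if (pos - frag[0]) % genome_length < span:
                hits.add(frag)
    return sorted(fragment_length(f, genome_length) for f in hits)

GENOME = 20091  # W. scintillans mt genome length reported in the text

# Hypothetical coordinates, for illustration only.
single_cut = bands_for_probe(GENOME, [5000], [1000, 9000])
two_cuts_duplicated = bands_for_probe(GENOME, [3000, 12000], [1000, 9000])
two_cuts_single_copy = bands_for_probe(GENOME, [3000, 12000], [9000])
print(single_cut, two_cuts_duplicated, two_cuts_single_copy)
```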
All mt protein genes, except atp8, were used for
phylogenetic analyses. The following complete nucleotide
sequence entries were retrieved from GenBank: K. tunicata
(U09810), L. bleekeri (AB029616), Inversidens japanensis
(female type) (AB055625), Terebratulina retusa
(AJ245743), Lumbricus terrestris (U24570), Platynereis
dumerii (AF178678), Limulus polyphemus (AF216203),
Artemia franciscana (X69067), Drosophila yakuba
(X03240), Lithobius forficatus (AF309492), Homo sapiens
(J01415), Balanoglossus carnosus (AF051097), Asterina
pectinifera (D16387), and Metridium senile (AF000023).
The amino acid sequences of each protein gene with its
counterparts in O. vulgaris, T. pacificus, and W. scintillans
were aligned by application of ClustalX (Thompson et al.
1997) using the default settings. The best-aligned regions
were selected by GBLOCKS (Castresana 2000). The
selected regions of all protein genes were then concatenated
(2,592 sites). For maximum-likelihood (ML) analysis
with site-by-site rate variation, Tree-Puzzle version 5.1 (Schmidt et
al. 2002) was used with the mtREV24
substitution model and a one invariable and eight-class
discrete gamma distribution model for site-by-site rate
variation. In total, 16% of sites were estimated to be
invariable, and from the data, the shape parameter α of the
gamma distribution was estimated to be 0.92. In addition,
the ML tree without site-by-site rate variation was estimated
using the PROTML routine in MOLPHY version 2.3b
(Adachi and Hasegawa 1996a). The topology of the
neighbor-joining (NJ) tree, which was constructed with
the ML distance matrix estimated by PROTML (D option),
was used as the initial tree for the ML tree search by the NNI
search routine of PROTML.
FIG. 1. PCR amplification strategy for the W. scintillans mt
genome. In the top row, the inferred gene organization of W. scintillans
mt genome is shown. Circular genomes are presented linearly to ease
comparison of gene organization. Thick lines are the first PCR fragments.
Lines with black arrowheads at both ends are long PCR products. The
names of the fragments (AL, AS, B, C, and D) are defined in the text. The
PCR fragments (1 to 9), indicated by lines with white arrowheads at both
ends, are PCR fragments for sequence confirmation as noted in the text.
Each tRNA gene is indicated by the letter corresponding to the
appropriate amino acid. L1, L2, S1, and S2 indicate trnL(uaa), trnL(uag),
trnS(uga), and trnS(gcu), respectively. The protein and rRNA genes
encoded by the opposite strand are shown by gray boxes. The tRNA
genes encoded by the opposite strand are shown below the column. Long
NCRs are indicated by black boxes.
Furthermore, four cephalopods and K. tunicata were
used for ML analysis (3,505 sites). A possible 15 unrooted
tree topologies were then used for the ML analysis with
PAML (Yang 1997) (eight-class discrete gamma
distribution model) and with Tree-Puzzle (Schmidt et al. 2002)
(one invariable and eight-class discrete gamma distribution
The O. vulgaris mt genome is 15,744 bp long. It encodes the standard set of metazoan mt genes, and the gene organization is similar to that of the polyplacophoran K. tunicata (Boore and Brown 1994); there are only two differences (translocation of trnD and inversion of trnP) (fig. 2).
In contrast to the O. vulgaris mt genome, the W. scintillans and T. pacificus mt genomes exhibit several unusual features. The mt genomes are 20,091 bp long and 20,254 bp long, respectively, and both carry the standard sets of metazoan mt genes, but six structural genes (cox3, cox1, cox2, trnD, atp8, and atp6) and the longest noncoding region are duplicated (fig. 2). The duplication patterns of these two genomes are not simple. The first duplicate copy contains a four-gene insertion (trnA, trnN, trnI, and nad3) between cox3 and cox1, whereas the second copy carries a different insertion (trnK, trnR, trnS(gcu), and nad2) between cox3 and cox1. In addition, the two duplicates are separated by the following genes: trnF, trnV, rrnS, trnM, trnC, and trnQ. Otherwise, the mt genome structures of W. scintillans and T. pacificus are nearly identical; only the location of trnM differs between them.
The nucleotide compositions of O. vulgaris, T. pacificus, and W. scintillans mt genomes are 41.2% A, 33.2% T, 7.6% G, and 17.6% C for O. vulgaris; 38.4% A, 34.2% T, 9.9% G, and 17.5% C for T. pacificus; and 35.3% A, 33.4% T, 11.6% G, and 19.2% C for W. scintillans, respectively. These values are similar to those in the L. bleekeri mt genome (Tomita et al. 2002). The nucleotide compositions of the NCRs of O. vulgaris, T. pacificus, and W. scintillans mt genomes are 44.0% A, 37.6% T, 4.6% G, and 13.8% C for O. vulgaris; 40.4% A, 39.3% T, 5.5% G, and 14.8% C for T. pacificus; and 39.0% A, 35.7% T, 9.1% G, and 16.2% C for W. scintillans, respectively. In all three cephalopod mt genomes, the NCRs are slightly richer in A and T than are other regions.
O. vulgaris, T. pacificus, and W. scintillans mt NCRs form stem-and-loop structures at both ends (data not shown). In addition, within their central regions, one or more stem-and-loop structures can also be formed. The NCRs of T. pacificus and W. scintillans mt genomes are well conserved (66.7% to 66.8%). The 5′ part is conserved more than the central region and the 3′ part of the NCRs between the T. pacificus and W. scintillans mt genomes. The L. bleekeri NCR is less similar to the W. scintillans and T. pacificus NCRs, but these three NCR sequences are easily aligned. The O. vulgaris NCR shares some characteristics in its primary sequence with squid NCRs, but alignment of the O. vulgaris NCR with squid NCRs is not easy. Near the 5′ end of the NCRs, a short sequence, 5′-TATATATAATAAACA-3′, is conserved between T. pacificus and W. scintillans, and two similar sequences, 5′-TGTATATAATATACA-3′, are found near the 5′ ends of the L. bleekeri and O. vulgaris NCRs, respectively. In the 3′ half, all four cephalopod NCRs carry a C-rich tract. These shared characteristics among cephalopod NCRs possibly contribute to the functions of the NCRs in replication and transcription initiation. However, further studies are required to determine this for certain.
Southern Analysis for Confirmation of W. scintillans Gene Organization
To confirm gene duplication in the W. scintillans mt genome, Southern analysis was performed. The predicted lengths and locations of the restriction fragments of W. scintillans mtDNA and the predicted hybridization patterns are shown in figure 3A and B.
With the restriction enzymes that cut W. scintillans mtDNA once (Apa I and Xho I), all probes derived from cox1, cox3, nad2, nad3, cob, and rrnS hybridized to the same band (fig. 3C). As predicted (fig. 3A and B), the specific probe for cox1, which might be duplicated, hybridized to two distinct bands of W. scintillans DNA treated with either Bgl II, Nco I, or Pst I (fig. 3C). Again, as predicted (fig. 3A and B), the cox3-specific probe also hybridized to two distinct bands of W. scintillans DNA treated with either Nco I, Pst I, or Spe I (fig. 3C). Conversely, the other four probes hybridized to a single band (fig. 3C). All the hybridization patterns match the gene organization of the W. scintillans mt genome presented in figures 1 and 2A. Note that the W. scintillans DNA used for the Southern hybridization analysis was prepared from a different individual than that used for sequence determination. In addition, we found no bands in figure 3C other than those inferred from the determined nucleotide sequence. Therefore, the observed gene duplication is likely to be a common feature of the W. scintillans mt genome. Thus, the gene organizations of the W. scintillans and T. pacificus mt genomes shown in figure 2 are supported.
Concerted Evolution
The sequences of the duplicated regions are nearly identical within each species, but there are large differences when comparing between W. scintillans and T. pacificus (table 2). From the analyses of duplicated noncoding regions in various metazoan mt genomes, it has been suggested that there are concerted evolution
processes in metazoan mitochondria (cf. Kumazawa et al.
1998). Because the W. scintillans and T. pacificus mt
genomes belonging to different families share gene
duplication, and the duplicated regions in the same species
have nearly identical sequences, it should be concluded
that concerted evolution processes exist within cephalopod
mt genomes. Duplicated units in the W. scintillans and T.
pacificus mt genomes are not placed in tandem, meaning
that a simple slippage model cannot explain their sequence
homogeneity; however, inter/intramolecular
recombination might provide an adequate explanation. In certain
molluscan species, two types of mtDNA (male and female
types) coexist within a single cell (e.g., Zouros et al. 1994).
Some reports suggest the existence of intermolecular
recombination (Ladoukakis and Zouros 2001), supporting
our conclusion that recombination is the possible
mechanism maintaining the homogeneity of the duplicated
sequences in W. scintillans and T. pacificus mt genomes.
Furthermore, recombination activity has also been reported
from mammalian mitochondria (Thyagarajan, Padua, and
To understand the effect of gene duplication on the
evolutionary rate, the relative rate test (RRTree
[Robinson-Rechavi and Huchon 2000]) of each protein gene (amino
acid sequence level) was performed. When the evolutionary
rate is compared among O. vulgaris, L. bleekeri, T.
pacificus, and W. scintillans (using K. tunicata as an
outgroup), there are no significant differences in the
evolutionary rate of any species pair (data not shown).
When the evolutionary rate is compared among squid
species (using O. vulgaris as an outgroup), only the
hypothesis that T. pacificus and W. scintillans nad4 evolved
at the same rate is statistically rejected (P < 0.01) (data not
shown). These results suggest that neither the duplication of
genes nor the concerted evolutionary process seriously affects the
evolutionary rate of the genes themselves.
Conversely, recent independent duplications in the W.
scintillans and T. pacificus mt genomes cannot be rejected.
This scenario would explain the occurrence of nearly
identical sequences between duplicates in a single genome,
but different sequences between the duplicated regions of
different species. However, as we discuss later, for such
a situation to be at all possible, at least two duplication
events and several subsequent gene-loss events would
have needed to take place independently, but in the same
order, in both the T. pacificus and W. scintillans lineages.
This extent of parallel evolution in the T. pacificus and W.
scintillans mt genomes requires far more assumptions than
the hypothesis that concerted evolution maintains
homogeneity of the duplicated sequences within a species.
Therefore, we believe that the model of ancestral gene
duplication in T. pacificus and W. scintillans mt genomes
followed by a concerted evolution process that
homogenized the duplicated sequences is more likely than the
model of recent and independent gene duplications in both
T. pacificus and W. scintillans mt genomes.
As shown in table 2, there are four differences
between duplicates of the protein genes in the W.
scintillans mt genome. These differences are found at the
second position of codons. This means that the resultant
proteins of these genes have different amino acid
sequences. These differences could change the properties of
the proteins encoded by the other copy of the duplicated
genes. However, all the substituted positions are in
poorly conserved regions, suggesting that their effects on
the proteins might not be serious. In addition, when the
sequenced PCR clones were checked, nucleotide
variations were found at different sites within the duplicated
genes. The minor nucleotides of PCR clones at various
positions in one of the duplicates are identical to the
nucleotides at the corresponding positions in the other
duplicate. The observed sequence differences between the
copies of duplicated regions might be due to
polymorphism, although the possibility of PCR errors cannot
be ignored. Does this situation suggest relaxed functional
constraints in the duplicated genes? To address this
question, further analyses of the function of these
duplicated genes are needed.
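The statement that second-codon-position differences imply different amino acid sequences can be checked directly against a genetic code table. The sketch below assumes that these genes are translated with NCBI translation table 5 (invertebrate mitochondrial), which the text does not state explicitly, and counts single-base changes at a given codon position that leave the encoded residue unchanged.

```python
BASES = "TCAG"
# NCBI translation table 5 (invertebrate mitochondrial), codons enumerated in TCAG order.
# Assumption: this is the code that applies to the genes discussed here.
TABLE5 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODE = {
    b1 + b2 + b3: TABLE5[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def synonymous_changes_at(position):
    """Codon pairs that differ only at `position` (0, 1, or 2) yet encode the same residue."""
    pairs = []
    for codon, aa in CODE.items():
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            # `codon < mutant` keeps each unordered pair once.
            if CODE[mutant] == aa and codon < mutant:
                pairs.append((codon, mutant))
    return pairs

second = synonymous_changes_at(1)
third = synonymous_changes_at(2)
print(len(second), len(third))
```

Under this table no second-position change is synonymous, whereas many third-position changes are, which is consistent with the inference drawn in the text.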
Evolution of Cephalopod mt Gene Arrangement
Within both this study and our previous study
(Tomita et al. 2002), we have reported four complete
cephalopod mt genome nucleotide sequences: L. bleekeri,
W. scintillans, T. pacificus, and O. vulgaris. The O.
vulgaris mt genome appears to retain a greater level of
ancestral gene organization than the other species because
the O. vulgaris mt gene organization is nearly identical to
that of the polyplacophoran K. tunicata (fig. 2).
When the O. vulgaris and K. tunicata mt genomes are
compared, one ancestral gene location and one derived
gene location can be distinguished. In the case of the O.
vulgaris mt genome, the location of trnD is the ancestral
feature, whereas the direction of trnP is the derived
feature. Conversely, in the K. tunicata mt genome, the
direction of trnP is the ancestral feature, whereas the
location of trnD is the derived feature. As Tomita et al.
(2002) have discussed, a trnD gene located between cox2
and atp8 is found in the Littorina saxatilis (Mollusca,
Gastropoda) mt genome (Wilding, Mill, and Grahame
1999) and various other metazoan mt genomes such as
several arthropod mt genomes (e.g., Clary and
Wolstenholme 1985; Lavrov, Boore, and Brown 2000). This
finding suggests that the location of trnD in the O. vulgaris
mt genome is the ancestral feature for that in the K.
tunicata mt genome. On the other hand, the same direction
of trnP in K. tunicata, rather than that observed in O.
vulgaris, is found in these other mt genomes (e.g., Clary
and Wolstenholme 1985; Lavrov, Boore, and Brown 2000),
suggesting that it is the ancestral feature for the direction
observed in the O. vulgaris mt genome.
Using a model in which slippage during replication
causes gene duplication, after which random gene loss
causes changes in gene organization, differences in the
gene organization of K. tunicata and O. vulgaris mt
genomes can be explained (fig. 4A). A tRNA-like
cloverleaf structure (anticodon AGA) is found between
cox2 and atp8 in the K. tunicata mt genome (Boore and
Brown 1994). However, we found another possible
tRNA-like structure in this region (positions 2810 to 2871,
overlapping the tRNASer-like structure [anticodon
AGA]). The sequence of the cloverleaf structure is very
similar to that of trnD (78% identity), although the
anticodon loop of the former structure is composed of
eight nucleotides (59-TTATTTAA-39) instead of seven
nucleotides. Thus, the cloverleaf structure between cox2
and atp8 in the K. tunicata mt genome seems to be
a pseudogene of trnD. Pseudogenes of trnD could be
created during the rearrangement process, as in the case of
the L. bleekeri mt genome, in which the creation of the
trnH pseudogene by a similar means has been reported
(Sasuga et al. 1999).
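The duplication followed by random loss model invoked above can be illustrated with a toy gene order. In this sketch a segment is tandemly duplicated and, for each duplicated gene, exactly one of the two copies is retained; the gene labels and the function name are hypothetical, and pseudogene remnants such as the trnD-like structure are not modeled.

```python
def duplicate_and_lose(order, start, end, keep_copy):
    """Tandemly duplicate order[start:end], then keep one copy of each duplicated gene.

    keep_copy maps each gene in the segment to 0 (retain the copy in the first
    repeat) or 1 (retain the copy in the second repeat).
    """
    segment = order[start:end]
    first_repeat = [g for g in segment if keep_copy[g] == 0]
    second_repeat = [g for g in segment if keep_copy[g] == 1]
    return order[:start] + first_repeat + second_repeat + order[end:]

# Toy order A-B-C-D: duplicating B-C gives A-B-C-B-C-D; losing the first B
# and the second C leaves A-C-B-D, an apparent translocation.
ancestral = ["A", "B", "C", "D"]
derived = duplicate_and_lose(ancestral, 1, 3, {"B": 1, "C": 0})
print(derived)
```

When every gene keeps its first copy the order is unchanged, so a rearrangement arises only when losses are split between the two repeats.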
Both W. scintillans and T. pacificus mt gene
organizations could have originated from an Octopus-type
mt gene organization, through two gene-duplication events
followed by the loss of one of the duplicated genes (fig.
4B). This process appears to be a typical sequence of
events in the evolution of gene organization. The problem
is that the W. scintillans and T. pacificus mt genomes carry
six duplicated genes. Because there are only simple
sequence differences between the duplicated regions,
including those of the long NCRs in which the
transcription initiation site can be located (table 2), both
duplicated genes could be functional.
As discussed above (see also figure 4B), many
processes (duplications and loss of genes and NCR) are
needed to create W. scintillans and T. pacificus mt gene
organizations from Octopus-type mt gene organization.
However, the position of trnM is the only difference
between the W. scintillans and T. pacificus mt genomes. If,
as the result of parallel evolution of W. scintillans and T.
pacificus mt genomes, the gene duplication and gene
organization of W. scintillans and T. pacificus mt genomes
were created independently, not just one but many
independent, identical gene losses and gene
rearrangements would have been needed, although we do not have
to presume the existence of a concerted evolution process.
For example, parallel evolution of mt gene organization
has been suggested in bird mt genomes (Mindell,
Sorenson, and Dimcheff 1998). However, we prefer the
ancient duplication model rather than the parallel evolution
model, because the former requires fewer gene
rearrangement/gene loss events than the latter.
Why Are Duplicated Genes Maintained in the
W. scintillans and T. pacificus mt Genomes?
Several researchers have pointed out that an mt
genome with a duplicated functional long NCR has several
advantages over one with only a single NCR (e.g.,
Kumazawa et al. 1998). Because the long NCR contains
the initiation region for replication, mt genomes with
multiple long NCRs can be replicated from multiple
points, whereas those with only a single long NCR can
only be replicated from one point. An mt genome with
multiple replication origins might, therefore, be capable of
faster replication, provided the duplicated region is not too
long. In mitochondria, there are multiple copies of
mtDNA, and there must be a certain degree of selection
pressure between individual mtDNAs. Therefore, mt
genomes with multiple replication origins might be
expected to proliferate at the expense of those with only
a single replication origin.
Are there any advantages for maintaining duplicate
copies of functional structural genes in W. scintillans and
T. pacificus mt genomes? Among metazoan mt genomes
for which complete nucleotide sequences have been
published, there is only one (Venerupis philippinarum
FIG. 3. Southern analysis of the W. scintillans mt genome. (A) Locations of probes (vertical arrows) and restriction fragments of the W. scintillans
mt genome. (B) Expected results of the Southern hybridization experiment on the W. scintillans mt genome. Abbreviations for each gene are as follows.
III: cox3; I: cox1; 3: nad3; 2: nad2; B: cob; and S: rrnS. Abbreviations for each restriction enzyme are as follows. C: control (noncut); A: Apa I; B: Bgl
II; N: Nco I; P: Pst I; S: Spe I; and X: Xho I. (C) Southern hybridization. The results of hybridizations of various probes with W. scintillans total DNA
are shown. The noncut DNA (C), Apa I-treated DNA (A), Bgl II-treated DNA (B), Nco I-treated DNA (N), Pst I-treated DNA (P), Spe I-treated DNA
(S), and Xho I-treated DNA (X) were loaded for each gel. In the lower column for each hybridization image, the name of the probe is shown. "All"
indicates that a mixture of probes for all six genes was used for hybridization.
Nucleotide Sequence Similarities (%) Between Duplicated Regions of W. scintillans and T. pacificus mt Genomes
Wsc Copy 1 vs. 2
Tpa Copy 1 vs. 2
[AB065375]) that encodes duplicated protein genes
(cox2). However, in this genome, the two cox2 genes
have very different sequences. As we have shown above,
this is not the case for the duplicate genes in W. scintillans
and T. pacificus mt genomes.
Consider that each subunit of NADH dehydrogenase
whose gene is encoded by the mt genome is
synthesized in equal amounts and that the rate of translation of each
NADH dehydrogenase subunit gene is also equal. If the
two assumptions of equal transcription rate of genes and
equal translation rate of NADH dehydrogenase genes are
accepted, the duplicated genes (cox3, cox1, cox2, trnD,
atp8, and atp6) should be transcribed twice as frequently
as nad2, nad3, and other NADH dehydrogenase subunit
genes. The duplicated protein genes in W. scintillans and
T. pacificus mt genomes encode subunits of complex IV
and complex V but not complex I and complex III.
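The dosage argument in the preceding paragraph can be written out as a small worked example. This is a sketch under the two assumptions stated in the text (equal transcription rate per gene copy and equal translation rate), not a model of measured expression. The per-copy rate of 1.0 and all names in the code are hypothetical. The assignment of genes to respiratory complexes follows the standard gene names used in this paper.

```python
# Protein genes grouped by the respiratory complex whose subunits they encode.
PROTEIN_GENES = {
    "I": ["nad1", "nad2", "nad3", "nad4", "nad4L", "nad5", "nad6"],
    "III": ["cob"],
    "IV": ["cox1", "cox2", "cox3"],
    "V": ["atp6", "atp8"],
}

# Genes reported as duplicated in the W. scintillans and T. pacificus mt genomes.
DUPLICATED = {"cox1", "cox2", "cox3", "atp6", "atp8", "trnD"}


def relative_transcription(gene: str, per_copy_rate: float = 1.0) -> float:
    """Relative transcript output, assuming every gene copy is transcribed
    at the same rate (the assumption made in the text)."""
    copies = 2 if gene in DUPLICATED else 1
    return copies * per_copy_rate


def dosage_by_complex() -> dict:
    """Relative transcript output of each protein gene, grouped by complex."""
    return {
        complex_name: [relative_transcription(g) for g in genes]
        for complex_name, genes in PROTEIN_GENES.items()
    }


for complex_name, outputs in dosage_by_complex().items():
    print(f"complex {complex_name}: {outputs}")
```

Under these assumptions every complex IV and complex V gene has twice the relative output of every complex I and complex III gene. Whether both copies are in fact transcribed at equal rates is not established by the sequence data alone.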
The transcription frequency of rRNA molecules is
controlled differently from that of mRNA and tRNA
molecules in vertebrate mitochondria (e.g., Clayton 1992).
In addition to the transcript covering the entire genome,
another transcript containing only srRNA, lrRNA, tRNAPhe,
and tRNAVal is also synthesized. Thus, in vertebrate
mitochondria, the copy numbers of rRNAs are generally
maintained at a higher level than those of tRNA and mRNA
(Clayton 1992). A similar situation might exist for rRNA
expression in the O. vulgaris mt genome. In the cases of W.
scintillans and T. pacificus mt genomes, two rRNA genes
are encoded at distinct positions. However, both rrnL
and rrnS are flanked at both ends by tRNA gene(s), and the
long NCRs are located upstream from the tRNA genes,
which are next to the 5' ends of rrnL and rrnS. Because the
long NCRs have nearly identical sequences, the number of
transcripts of rrnL and rrnS can be controlled in a similar
Molecular Phylogenetic Analysis
Quartet-puzzling (QP) analysis with Tree-Puzzle
(Schmidt et al. 2002) under the invariable and gamma
model (mtREV model) shows monophyly of oegopsids (W.
scintillans and T. pacificus) (fig. 5). L. bleekeri, a
representative of the myopsids in this tree, is the sister group of the
oegopsid squids, and O. vulgaris is the sister taxon of the
decapods in this tree. In this tree, cephalopods form
a monophyletic group with K. tunicata (Polyplacophora)
and Inversidens (Bivalvia). However, statistical support for
the monophyly of the Mollusca is not sufficiently high. On
the other hand, Terebratulina retusa, a representative of the
Brachiopoda, forms a group with the annelid species in this
tree, although the statistical support for this group is not
particularly high. The monophyly of the clade uniting three
lophotrochozoan groups (Mollusca, Annelida, and Brachiopoda) is
well supported in this tree, as in our previous analysis
(Tomita et al. 2002), but the relationship among the major
groups of Mollusca, Annelida, and Brachiopoda could not
be resolved from the present data.
When we performed an ML analysis on five molluscan
species (K. tunicata [outgroup], O. vulgaris, L. bleekeri, T.
pacificus, and W. scintillans) using CODEML in PAML
(Yang 1997) (eight-class discrete gamma distribution
model), we found that W. scintillans and T. pacificus are
monophyletic, and L. bleekeri is their sister taxon in the ML
tree (lnL = −24077.52). The tree in which L. bleekeri
and T. pacificus are monophyletic is the second-best tree
(difference of lnL from best tree with SE: −5.49 ±
10.84), and this tree cannot be rejected on the basis of BP
values, the Kishino-Hasegawa test (Kishino and Hasegawa
1989), the Shimodaira-Hasegawa test (Shimodaira and
Hasegawa 1999), or differences in log-likelihood. On the
other hand, in the ML tree constructed with Tree-Puzzle (one
FIG. 4. Possible pathways for how W. scintillans and T. pacificus mt gene organizations originated from the ancestral gene organizations found in
O. vulgaris and K. tunicata mt genomes. (A) Between O. vulgaris and K. tunicata mt genomes. (B) From the O. vulgaris type mt genome to the W.
scintillans mt genome. In both models, one of two ends of the duplicated unit is under the constraint of being in the noncoding region. The labels a and b
denote, respectively, the seven-tRNA gene cluster that appears upstream of the NCR and the five-tRNA gene cluster that appears between cox3 and nad3 in O.
vulgaris and K. tunicata mt genomes and their corresponding regions in other mt genomes such as W. scintillans mt genome. Abbreviations of tRNA
genes are as in figure 1. Abbreviations of protein and rRNA genes are as follows: C1 to C3 for cox1 to 3, CB for cob, A6 and A8 for atp6 and 8, N1 to
N6 and NL for nad1 to 6 and nad4L, and LR and SR for rrnL and rrnS.
FIG. 5. ML analysis of 17 metazoans based on the amino acid sequences of concatenated 12 mt protein genes. The ML tree inferred with Tree-Puzzle
is shown. Support for the internal branches of the quartet puzzling tree topology (%) (left) and local bootstrap probability of the internal branches of ML tree
estimated with PROTML (%) (right) are shown at the node. The log-likelihood of the tree estimated with Tree-Puzzle is −49482.91. The log-likelihood of
this tree (±SE) estimated with PROTML is −52542.78 (±696.28). The topologies of the trees obtained with these two methods are identical.
invariable and eight-class discrete gamma distribution
model), L. bleekeri and T. pacificus appear as the sister taxa
(lnL = −23817.07). The best tree chosen by CODEML is
the second tree (difference of lnL from best tree with SE:
12.08 ± 9.65), and it cannot be rejected.
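The statement that the second-best trees cannot be rejected can be checked roughly by comparing each log-likelihood difference with its standard error. The sketch below uses a simple normal approximation with a two-sided critical value of 1.96. It is not a reimplementation of the Kishino-Hasegawa or Shimodaira-Hasegawa tests as run in CODEML or Tree-Puzzle, and the function name and threshold are assumptions made for illustration. Only the magnitudes of the two differences reported above are used.

```python
def cannot_reject(delta_lnl: float, se: float, z_crit: float = 1.96) -> bool:
    """Return True when |delta lnL| is smaller than z_crit standard errors,
    i.e. the alternative tree is not significantly worse under a normal
    approximation (an assumed simplification of the published tests)."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return abs(delta_lnl) / se < z_crit


# Differences in lnL from the best tree and their SEs, as reported in the text.
comparisons = [
    ("CODEML, second tree", 5.49, 10.84),
    ("Tree-Puzzle, second tree", 12.08, 9.65),
]

for label, delta, se in comparisons:
    ratio = abs(delta) / se
    print(f"{label}: |dlnL|/SE = {ratio:.2f}, cannot reject = {cannot_reject(delta, se)}")
```

Both differences are well under two standard errors, which is consistent with the conclusion that neither topology can be excluded.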
The relationship among W. scintillans, T. pacificus,
and L. bleekeri could not be resolved by our analyses,
although a closer relationship between W. scintillans and
T. pacificus is suggested by the mt genome structures. This
finding, in turn, suggests that the squid radiation has
occurred over a rather short period. In addition, the
monophyly of the Mollusca could not be supported. The
radiation of the Mollusca, Annelida, and Brachiopoda
might also have occurred over a short period.
The T. pacificus and W. scintillans mt genomes are
the first examples of metazoan mt genomes that have been
found to have stably carried duplicated structural genes
over a long period. Together with the L. bleekeri mt
genome (Tomita et al. 2002), which carries a triplicated
NCR with nearly identical sequences, these genomes lead us to
conclude that cephalopod, or at least squid, mitochondria have certain
concerted evolutionary processes. Analyses of other
cephalopod mt genome structures will tell us how
complicated squid mt genomes really are.
On the other hand, how such mt genomes with
duplicated noncoding regions and structural genes maintain
their functions, such as replication and transcription, is an
unresolved but important issue. The means of replication,
transcription, and other processes within cephalopod mt
genomes provide an interesting target for further studies.
University of Pharmacy and Life Science for his valuable
comments. This work was supported by grants to T.O. and
S.Y. from Ministry of Education, Culture, Sports, Science
and Technology, Japan.
Naruya Saitou, Associate Editor