Long-Term Conservation of Six Duplicated Structural Genes in Cephalopod Mitochondrial Genomes

Molecular Biology and Evolution, Nov 2004

The complete nucleotide sequences of the mitochondrial (mt) genomes of three cephalopods, Octopus vulgaris (Octopodiformes, Octopoda, Incirrata), Todarodes pacificus (Decapodiformes, Oegopsida, Ommastrephidae), and Watasenia scintillans (Decapodiformes, Oegopsida, Enoploteuthidae), were determined. These three mt genomes encode the standard set of metazoan mt genes. However, W. scintillans and T. pacificus mt genomes share duplications of the longest noncoding region, three cytochrome oxidase subunit genes and two ATP synthase subunit genes, and the tRNAAsp gene. Southern hybridization analysis of the W. scintillans mt genome shows that this single genome carries both duplicated regions. The near-identical sequence of the duplicates suggests that there are certain concerted evolutionary mechanisms, at least in cephalopod mitochondria. Molecular phylogenetic analyses of mt protein genes are suggestive, although not statistically significantly so, of a monophyletic relationship between W. scintillans and T. pacificus.


Shin-ichi Yokobori, Naoya Fukuda, Mitsue Nakamura, Tomoko Aoyama, and Tairo Oshima

Laboratory for Cellular Biochemistry, Department of Molecular Biology, School of Life Science, Tokyo University of Pharmacy and Life Science, Tokyo, Japan

The types and number of genes encoded by metazoan mitochondrial (mt) genomes are well conserved among the metazoan species for which these data are currently available (see Boore [1999]). Most metazoan mt genomes are circular, carrying single copies of 13 protein genes (cox1 to 3 [cytochrome oxidase subunits I to III], nad1 to 6 and 4L [NADH dehydrogenase subunits 1 to 6 and 4L], atp6 and 8 [ATP synthase subunits 6 and 8], and cob [apocytochrome b]), 2 rRNA genes (rrnL and rrnS [large and small subunit ribosomal RNAs]), and 22 tRNA genes (trnA, etc.). Few or no intergenic nucleotides are found, with the exception of long noncoding regions (NCRs) that contain control elements for replication and transcription.
However, gene organization varies among metazoan species. For example, most molluscan mt genomes reported so far have different gene organizations, and there are large differences in mt gene organization within each class (Hoffmann, Boore, and Brown 1992; Boore and Brown 1994; Hatzoglou, Rodakis, and Lecanidou 1995; Terrett, Miles, and Thomas 1996; Yamazaki et al. 1997; Kurabayashi and Ueshima 2000; Tomita et al. 2002; Grande et al. 2002; Wilding, Mill, and Grahame 1999). In pulmonate land snails, variation in gene organization is found at the level of the superfamily (Yamazaki et al. 1997). Multiplication of NCRs has been observed in various metazoan mt genomes (e.g., Kumazawa et al. 1998). In some cases, the duplication of coding regions has also been reported. For example, the oyster Crassostrea gigas mt genome (GenBank/EMBL/DDBJ accession number AF177226) carries two copies of rrnS. The nematode Romanomermis culicivorax has also been reported to carry multiple copies of protein genes (Azevedo and Hyman 1993; Hyman and Azevedo 1996; Hyman, Beck, and Weiss 1988). In these cases, a high sequence similarity between duplicated genes has been observed. On the other hand, gene duplication followed by one of the duplicates becoming a pseudogene has also been reported. For example, a partial duplication of the gecko Heteronotia binoei mt genome has been reported, and one of the duplicates appears to be a pseudogene (Zevering et al. 1991). Sasuga et al. (1999) identified a pseudogene of trnH between nad4 and nad5 in the Loligo bleekeri mt genome, which is the position that trnH occupies in the Katharina tunicata mt genome (Boore and Brown 1994); the functional trnH was found at a different position (Sasuga et al. 1999; Tomita et al. 2002). It has thus been proposed that the metazoan mt genome is under strong selective pressure for genome minimization. Recently, Tomita et al. (2002) reported the first complete cephalopod mt nucleotide sequence for the L. bleekeri mt genome.
The gene content of the L. bleekeri mt genome is the same as that of the typical metazoan mt genome. However, the arrangement of genes within the mt genome is different from that of any other metazoan reported to date. One of the most notable characteristics of the L. bleekeri mt genome is that it contains three near-identical, 500-bp NCRs. These NCRs are not placed on the genome in tandem; instead, their placement seems to be closely related to the gene rearrangement that has taken place in the L. bleekeri mt genome (Tomita et al. 2002). In addition, concerted evolution could be considered as the mechanism underlying the maintenance of high similarity within these NCR sequences.

To further our understanding of the evolution of mt genome structures in cephalopods, we determined the complete mt genome nucleotide sequences of three cephalopods, Octopus vulgaris (Coleoidea, Neocoleoidea, Octopodiformes, Octopoda, Incirrata, Octopodidae), Todarodes pacificus (Coleoidea, Neocoleoidea, Decapodiformes, Oegopsida, Ommastrephidae), and Watasenia scintillans (Coleoidea, Neocoleoidea, Decapodiformes, Oegopsida, Enoploteuthidae). The genome structure of the O. vulgaris mt genome is rather similar to that of the K. tunicata mt genome. However, the T. pacificus and W. scintillans mt genomes contain long and complicated duplications of long NCRs and six structural genes. We conclude our report with a discussion of the evolution of cephalopod mt genomes.

Table 1. PCR Primer Sequences for Analysis of the W. scintillans mt Genome

PCR primers for amplification of cox1, cox3, and cob partial sequences
Moll/cox1/5/1  5′-ATAATTGGWGGWTTTGGWAAYTG-3′  (fragment cox1)
Moll/cox1/3/1  5′-CCAAAAAATCAAAAWAGRTGYTG-3′  (fragment cox1)
Moll/cox1/5/2  5′-ACWGGWTGAACWGTWTAYCC-3′  (fragment cox1)
Moll/cox1/3/2  5′-ATCWCCWCCWCCWGCWGGRTCRAA-3′  (fragment cox1)
Moll/cox3/5/1  5′-ACWATAGTWCAATGATGACGNGA-3′  (fragment cox3)
Moll/cox3/3/1  5′-ACWACWACATCWACAAARTGYCA-3′  (fragment cox3)
Moll/cox3/5/2  5′-GTWTGTTTTTTTTTTGCWTTTTTYTG-3′  (fragment cox3)
Moll/cox3/3/2  5′-ACATCWACAAAATGTCARTAYCA-3′  (fragment cox3)
Moll/cob/5/1  5′-CAAATAWSWTTTTGAGGWGCNAC-3′  (fragment cob)
Moll/cob/3/1  5′-ATAWGCAAAWAGAAAATAYCAYTC-3′  (fragment cob)
Moll/cob/5/2  5′-TGAGGWGCWACWGTWATTCANAA-3′  (fragment cob)
Moll/cob/3/2  5′-TTCWGGTTGAATATGWGTNGGNGT-3′  (fragment cob)

Long PCR primers for amplification of W. scintillans mt genome
Wsc/cox1/L/5/1  5′-AATTGTTGTAATAAAGTTAATAGCTCCC-3′  (fragments AL and AS)
Wsc/cox1/L/5/2  5′-AGAAGGTCCAGCATGAGATAAGTTAC-3′  (fragments AL and AS)
Wsc/cox1/L/3/1  5′-CCTTTCTTTACCTGTACTAGCAGGA-3′  (fragments B and D)
Wsc/cox1/L/3/2  5′-TATTACTATATTATTAACAGACCGTAAC-3′  (fragments B and D)
Wsc/cox3/L/5/1  5′-TGCTAGTAAGATAGCAGTGTTTAGTAA-3′  (fragments C and D)
Wsc/cox3/L/5/2  5′-CTGGATGTAGATAGGAGGTCAGCA-3′  (fragments C and D)
Wsc/cox3/L/3/1  5′-CGGACTTCATGTTATCATTGGCTCT-3′  (fragments AL and AS)
Wsc/cox3/L/3/2  5′-TTTCTCCTAACTTGTTTACTCCGAATT-3′  (fragments AL and AS)
Wsc/cob/L/5/1  5′-TATAGCCGCCAAAATAAAAGGAAGTAA-3′  (fragment C)
Wsc/cob/L/5/2  5′-TCGTCTCAACGTAGCATTGTCTAC-3′  (fragment C)
Wsc/cob/L/3/1  5′-TTATTATTATTGATTGAGATTAGGATATTG-3′  (fragment B)
Wsc/cob/L/3/2  5′-TTTCTTAATGTATTAGGTGATTCTGAG-3′  (fragment B)

PCR primers for confirmation of relative position of fragments AL and AS of W. scintillans mt genome
Wsc/rrnS/F/1  5′-AGTTTGTGTATTGCTGTCGTCAG-3′
Wsc/rrnS/F/2  5′-ACCATGTCAAGTCAAAATGCAGC-3′
Wsc/rrnS/R/1  5′-TACGACCGTGGTTAAATTGGTGA-3′
Wsc/rrnS/R/2  5′-ATGACGCATCGGATGAGAATAATA-3′
Wsc/nad3/F/1  5′-ACTAGCTCTGCCCTATTCCTAA-3′
Wsc/nad3/F/2  5′-CACGAGTGAAATCAAGGATCCT-3′
Wsc/nad3/R/1  5′-AATCCTTATCCCGTCTCGATCA-3′
Wsc/nad3/R/2  5′-CATGTCTTCCCAGTTTTGGTGT-3′

PCR primers for confirmation of gene organization and primary sequence of W. scintillans mt genome
Wsc/rrnS/F/1  (as above)  (fragment 1)
Wsc/rrnS/F/2  (as above)  (fragment 1)
Wsc/nad3/F/1  (as above)  (fragment 1)
Wsc/nad3/F/2  (as above)  (fragment 1)
Wsc/atp6/F/1  5′-CTTCCTTTAGGTACGCCTAGTTT-3′  (fragment 2)
Wsc/atp6/F/2  5′-CAGCTAACATTAGAGCAGGGCA-3′  (fragment 2)
Wsc/nad2/R/1  5′-CTAGTTTTTGCCAGGTTAGTAGC-3′  (fragment 2)
Wsc/nad2/R/2  5′-TCAGGATAGGGTGGTTGAGTATA-3′  (fragment 2)
Wsc/nad2/F/1  5′-TATACTCAACCACCCTATCCTGA-3′  (fragment 3)
Wsc/nad5/R/1  5′-TAGAGGAAGAAGTACTGAATTCGAT-3′  (fragment 3)
Wsc/nad5/R/2  5′-TATTTCGGGGTGTGTGATGGTG-3′  (fragment 3)
Wsc/rrnL/F/1  5′-TGAAGCTTATCCCTCATACGATTA-3′  (fragment 4)
Wsc/rrnL/F/2  5′-CTATACAACGTTAACGCACATCTT-3′  (fragment 4)
Wsc/cox1/R/1  5′-CTTTCAACAGCTGAAGAAGCTAAT-3′  (fragment 4)
Wsc/cox1/R/2  5′-TAAATCCATGGGCAGTGACCAC-3′  (fragment 4)
Wsc/nad5/F  5′-CTAAACCTAAACCATCTCACCC-3′  (fragment 5)
Wsc/nad4/R  5′-TAAGGGTAGATGTGTGAGGCTT-3′  (fragment 5)
Wsc/nad4/F  5′-TATAACCCAGCTTGTAGCCGCT-3′  (fragment 6)
Wsc/cob/R  5′-TGTGGGTGAGTACTTCGTTATG-3′  (fragment 6)
Wsc/cob/F  5′-TGGCCTCAAGGTAAAACATAACCTA-3′  (fragment 7)
Wsc/nad1/R  5′-TATTAGGTTCGGTACCGTGCTGT-3′  (fragment 7)
Wsc/nad1/F  5′-CTCGGTGAGTTTCAGCTACACA-3′  (fragment 8)
Wsc/rrnL/R  5′-TGTTGCTTGCGGTACTGTAAAGG-3′  (fragment 8)
Wsc/nad2/F/2  5′-GTACTTTGCTCACCCTCTCCTC-3′  (fragment 9)
Wsc/nad2/R/3  5′-GAAGTAGATTAGACTAATGGATCTA-3′  (fragment 9)

PCR primers for synthesizing gene-specific probes for Southern hybridization
Wsc_cox3_pF  5′-CATACAATAAAAGTCTCCTTAGGTATGC-3′  (probe for cox3)
Wsc_cox3_pR  5′-TGTGAAGTATATACCTAGGATAATGGTGAG-3′  (probe for cox3)
Wsc_nad3_pF  5′-TAGTAAACTTCTTACTATTACTTATTGGCTT-3′  (probe for nad3)
Wsc_nad3_pR  5′-TGATTTCACTCGTGAAAAAGACC-3′  (probe for nad3)
Wsc_cox1_pF  5′-TGGACATCCAGAAGTGTATATTTT-3′  (probe for cox1)
5′-TATTGGTGTGTTGTATTTAATAGGAGA-3′  (probe for cox1)
5′-TGGTAACCCTTACGTTGTCAAGT-3′  (probe for rrnS)
5′-TTGAAAATAAGGTATGAAAAACTAGGA-3′  (probe for rrnS)
5′-TACTTCCATACTAATTATTCTTTCCTTAAC-3′  (probe for nad2)
5′-TAATATAGTTAAAGAAGTAGATTAGACTAATGGA-3′  (probe for nad2)
5′-TACTAAACCATATTACACTCAAAACACG-3′  (probe for cob)
5′-CACTCCTCTACATATTAAACCTGAGTG-3′  (probe for cob)

NOTE. N = A, G, C, or T; R = A or G; Y = T or C; W = A or T; S = G or C. Fragments amplified with these primers (shown in figure 1) are indicated in parentheses.

Materials and Methods

Samples

O. vulgaris, T. pacificus, and W. scintillans were bought at the Tsukiji Fishery Market, Tokyo, Japan. All individuals were caught in the seas around Japan. The identification of T. pacificus and W. scintillans was carried out by Dr. K. Tsuchiya at the Tokyo University of Marine Science and Technology. The identification of O. vulgaris was confirmed by comparison of partial cox1 and cox3 sequences with published octopod sequences, including that of O. vulgaris (Bonnaud, Boucher-Rodoni, and Monnerot 1997; Carlini, Young, and Vecchione 2001).

DNA Isolation, PCR, Cloning, and Sequencing

The DNA sequence determination strategies for W. scintillans, T. pacificus, and O. vulgaris mt genomes were the same as those used for the Ciona savignyi mt genome (Yokobori, Watanabe, and Oshima 2003). As an example, a brief description of the methods used to determine the sequence of the W. scintillans mt genome follows. First, parts of cox1, cox3, and cob were amplified by nested PCR, using total DNA as the template with EX-Taq DNA polymerase (TAKARA). The primers used for the amplification of partial sequences of cox1, cox3, and cob are listed in table 1. Amplified fragments were cloned with the TOPO TA cloning kit (Invitrogen) and sequenced using a PRISM 3100 DNA autosequencer (Applied Biosystems). For sequence reactions, BigDye Terminator version 3.1 (Applied Biosystems) was used. Using the sequence information, the remaining parts of the mt genomes were amplified by nested long PCR (the primers used are listed in table 1) with LA Taq DNA polymerase (TAKARA). Among all combinations of PCR primers, four combinations, (A) cox3-3′ and cox1-5′, (B) cox1-3′ and cob-3′, (C) cob-5′ and cox3-5′, and (D) cox1-3′ and cox3-5′, gave amplified fragments. The lengths of the PCR fragments are approximately 1 kb and 2 kb for (A) (fragments AS and AL, respectively), approximately 6 kb for (B) (fragment B), and approximately 4.5 kb for (C) and (D) (fragments C and D). The amplified fragments were cloned and sequenced.

The two fragments amplified with the primers specific for cox3-3′ and cox1-5′ have the cox3 gene at one end and the cox1 gene at the other end. However, their gene organizations are cox3(3′)–trnA–trnN–trnI–nad3–cox1(5′) for fragment AS and cox3(3′)–trnK–trnR–trnS(gcu)–nad2–cox1(5′) for fragment AL. The orders in fragments B, C, and D are cox1(3′)–cox2–trnD–atp8–atp6–nad5–trnH–nad4–nad4L–trnT–trnS(uga)–cob(3′), cob(5′)–nad6–trnP–nad1–trnL(uaa)–trnL(uag)–rrnL–trnY–trnW–trnG–trnE–NCR–cox3(5′), and cox1(3′)–cox2–trnD–atp8–atp6–trnF–trnV–rrnS–trnM–trnC–trnQ–NCR–cox3(5′), respectively. The underlined genes are encoded by the opposite strand. Together with five fragments, we identified all known genes, but we also found duplicated long NCRs and six structural genes. Fragments B and C seem to be neighbors at the ends of the cob gene because there are no additional cob genes in fragments AS, AL, or D. Therefore, the order of these five fragments might be B–C–AS–D–AL–(B) or B–C–AL–D–AS–(B).

The gene arrangement of the duplicated regions was confirmed by additional PCR analyses. The PCR primers specific for rrnS (on fragment D) and nad3 (on fragment AS), listed in table 1, were synthesized. Fragments were amplified with the nad3-3′ primer and the rrnS-3′ primer by PCR, but no other combination of PCR primers gave amplified fragments. Therefore, the order of the fragments is thought to be B–C–AS–D–AL–(B) (fig. 1). Furthermore, PCR products (0.5 to 5 kb) covering the entire W. scintillans mtDNA were amplified (fragments 1 to 9), cloned, and sequenced (fig. 1). The sequences of the PCR primers were adapted from the first version of the complete nucleotide sequence of W. scintillans mtDNA. The sequences of these primers are listed in table 1. For the amplification of long fragments (longer than 3 kb; fragments 1 to 4), fragments were subjected to nested long PCR. The PCR primers used for the nested PCR are listed in table 1. PCR conditions were as described above. Amplified PCR products (fragments 1 to 8) were cloned as described above. Both strands of these fragments were determined by primer-walking. Fragment 9, once amplified, was purified with the QIAquick PCR purification kit (QIAgen) according to the manufacturer's protocol. The resulting purified PCR product was then directly sequenced. Similar sequencing strategies were applied for the remaining T. pacificus and O. vulgaris mt genomes (details are not shown).

Identification of protein and rRNA genes was carried out by comparing them with counterparts from L. bleekeri (Tomita et al. 2002) and K. tunicata (Boore and Brown 1994) mt genomes. Identification of tRNA genes was carried out manually, by making visual searches of the cloverleaf structures. The complete nucleotide sequences of W. scintillans, O. vulgaris, and T. pacificus mt genomes were entered into the DDBJ/EMBL/GenBank DNA databases under the accession numbers AB086202, AB158363, and AB158364, respectively.

Southern Analysis of the W. scintillans mt Genome

Total DNA of W. scintillans was prepared from either the liver or eggs of a single animal by QIAgen Genometip (QIAgen). W. scintillans total DNA was digested with Apa I, Bgl II, Nco I, Pst I, Sal I, and Xho I.
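The logic behind digesting a circular genome and then probing it can be sketched in a few lines of Python. This is only an illustrative model, not the procedure used in this study: the cut-site and gene positions below are hypothetical, each probe target is treated as a single point rather than a roughly 300-bp stretch, and the function names are invented for this sketch. Only the genome length (20,091 bp) is taken from the text.

```python
def circular_fragments(genome_length, cut_sites):
    """Return (start, end) fragments of a circular genome cut at the given offsets.

    A fragment that wraps past the origin is reported with end > genome_length.
    """
    sites = sorted(set(s % genome_length for s in cut_sites))
    if not sites:
        return []  # an uncut circle yields no linear fragments
    fragments = []
    for i, start in enumerate(sites):
        end = sites[(i + 1) % len(sites)]
        if end <= start:
            end += genome_length  # wrap around the origin
        fragments.append((start, end))
    return fragments


def bands_for_probe(genome_length, cut_sites, target_positions):
    """Return the sorted, distinct fragment lengths that contain a probe target."""
    lengths = set()
    for start, end in circular_fragments(genome_length, cut_sites):
        for pos in target_positions:
            p = pos % genome_length
            if start <= p < end or start <= p + genome_length < end:
                lengths.add(end - start)
    return sorted(lengths)


GENOME = 20091                      # W. scintillans mt genome length from the text
duplicated_gene = [1000, 11000]     # hypothetical positions of two gene copies
single_copy_gene = [4000]           # hypothetical position of a single-copy gene

one_cutter = bands_for_probe(GENOME, [5000], duplicated_gene)
three_cutter = bands_for_probe(GENOME, [3000, 9000, 15000], duplicated_gene)
single = bands_for_probe(GENOME, [3000, 9000, 15000], single_copy_gene)
```

Under these assumed positions, an enzyme that cuts once linearizes the circle, so every probe sees one genome-length band, whereas an enzyme that cuts between the two copies of a duplicated gene gives two bands for that gene and one band for a single-copy gene.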
Six genes, cox1, cox3, cob, nad2, nad3, and rrnS, were the targets of Southern hybridization. DIG (digoxigenin)-labeled probes (ca. 300 bp each) were synthesized with a DIG-PCR probing kit (Roche Diagnostics). Sequences of PCR primers are as listed in table 1. Nondigested and digested W. scintillans total DNA, 0.5 µg each, was used for 0.6% agarose gel electrophoresis in TAE, followed by alkali transfer of the DNA to nylon membrane Hybond N+ (Amersham Biosciences). Southern hybridization was performed with UltraHyb (Roche Diagnostics) as described by the manufacturer. A DIG Nucleic Acid Detection kit (Roche Diagnostics) was used for detection as described by the manufacturer.

Phylogenetic Analyses

All mt protein genes, except atp8, were used for phylogenetic analyses. The following complete nucleotide sequence entries were retrieved from GenBank: K. tunicata (U09810), L. bleekeri (AB029616), Inversidens japanensis (female type) (AB055625), Terebratulina retusa (AJ245743), Lumbricus terrestris (U24570), Platynereis dumerilii (AF178678), Limulus polyphemus (AF216203), Artemia franciscana (X69067), Drosophila yakuba (X03240), Lithobius forficatus (AF309492), Homo sapiens (J01415), Balanoglossus carnosus (AF051097), Asterina pectinifera (D16387), and Metridium senile (AF000023). The amino acid sequences of each protein gene were aligned with their counterparts in O. vulgaris, T. pacificus, and W. scintillans using ClustalX (Thompson et al. 1997) with the default settings. The best-aligned regions were selected by GBLOCKS (Castresana 2000). The selected regions of all protein genes were then concatenated (2,592 sites). For maximum-likelihood (ML) analysis with site-by-site rate variation, Tree-Puzzle version 5.1 (Schmidt et al. 2002) was used with the mtREV24 substitution model and a one-invariable plus eight-class discrete gamma distribution model for site-by-site rate variations.
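The concatenation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data layout (a dictionary of per-gene alignments keyed by taxon), the function name, and the toy sequences are all hypothetical, and the sketch assumes each per-gene alignment has already been trimmed to its selected blocks.

```python
def concatenate_alignments(gene_alignments, gene_order):
    """Join per-gene alignments into one sequence per taxon.

    gene_alignments maps gene name -> {taxon: aligned sequence}.
    Only taxa present in every listed gene are kept, and all sequences
    within one gene must have the same aligned length.
    """
    if not gene_order:
        raise ValueError("gene_order must list at least one gene")
    taxa = sorted(set.intersection(*(set(gene_alignments[g]) for g in gene_order)))
    parts = {taxon: [] for taxon in taxa}
    for gene in gene_order:
        alignment = gene_alignments[gene]
        lengths = {len(alignment[taxon]) for taxon in taxa}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: aligned sequences differ in length")
        for taxon in taxa:
            parts[taxon].append(alignment[taxon])
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy amino acid alignments (hypothetical data, two genes, three taxa).
toy = {
    "cox1": {"taxonA": "MKL", "taxonB": "MRL"},
    "cob": {"taxonA": "FF", "taxonB": "FY", "taxonC": "FF"},
}
supermatrix = concatenate_alignments(toy, ["cox1", "cob"])
```

Fixing the gene order explicitly keeps the site columns reproducible across runs, and dropping taxa that lack any gene avoids silently padding the matrix with missing data.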
In total, 16% of sites were estimated to be invariable, and from the data, the shape parameter α of the gamma distribution was estimated to be 0.92. In addition, the ML tree without site-by-site rate variation was estimated using the PROTML routine in MOLPHY version 2.3b (Adachi and Hasegawa 1996a). The topology of the neighbor-joining (NJ) tree, which was constructed with the ML distance matrix estimated by PROTML (D option), was used as the initial tree for the ML tree search by the NNI search routine of PROTML. Furthermore, four cephalopods and K. tunicata were used for ML analysis (3,505 sites). All 15 possible unrooted tree topologies were then used for the ML analysis with PAML (Yang 1997) (eight-class discrete gamma distribution model) and with Tree-Puzzle (Schmidt et al. 2002) (one invariable and eight-class discrete gamma distribution model).

FIG. 1. PCR amplification strategy for the W. scintillans mt genome. In the top row, the inferred gene organization of the W. scintillans mt genome is shown. Circular genomes are presented linearly to ease comparison of gene organization. Thick lines are the first PCR fragments. Lines with black arrowheads at both ends are long PCR products. The names of the fragments (AL, AS, B, C, and D) are defined in the text. The PCR fragments (1 to 9), indicated by lines with white arrowheads at both ends, are PCR fragments for sequence confirmation as noted in the text. Each tRNA gene is indicated by the letter corresponding to the appropriate amino acid. L1, L2, S1, and S2 indicate trnL(uaa), trnL(uag), trnS(uga), and trnS(gcu), respectively. The protein and rRNA genes encoded by the opposite strand are shown by gray boxes. The tRNA genes encoded by the opposite strand are shown below the column. Long NCRs are indicated by black boxes.

The O. vulgaris mt genome is 15,744 bp long. It encodes the standard set of metazoan mt genes, and the gene organization is similar to that of the polyplacophoran K. tunicata (Boore and Brown 1994); there are only two differences (translocation of trnD and inversion of trnP) (fig. 2).

In contrast to the O. vulgaris mt genome, the W. scintillans and T. pacificus mt genomes exhibit several unusual features. The mt genomes are 20,091 bp long and 20,254 bp long, respectively, and both carry the standard sets of metazoan mt genes, but six structural genes (cox3, cox1, cox2, trnD, atp8, and atp6) and the longest noncoding region are duplicated (fig. 2). The duplication patterns of these two genomes are not simple. The first duplicate copy contains a four-gene insertion (trnA, trnN, trnI, and nad3) between cox3 and cox1, whereas the second copy carries a different insertion (trnK, trnR, trnS(gcu), and nad2) between cox3 and cox1. In addition, the two duplicates are separated by the following genes: trnF, trnV, rrnS, trnM, trnC, and trnQ. Otherwise, the mt genome structures of W. scintillans and T. pacificus are nearly identical; only the location of trnM differs between them.

The nucleotide compositions of the mt genomes are 41.2% A, 33.2% T, 7.6% G, and 17.6% C for O. vulgaris; 38.4% A, 34.2% T, 9.9% G, and 17.5% C for T. pacificus; and 35.3% A, 33.4% T, 11.6% G, and 19.2% C for W. scintillans. These values are similar to those in the L. bleekeri mt genome (Tomita et al. 2002). The nucleotide compositions of the NCRs are 44.0% A, 37.6% T, 4.6% G, and 13.8% C for O. vulgaris; 40.4% A, 39.3% T, 5.5% G, and 14.8% C for T. pacificus; and 39.0% A, 35.7% T, 9.1% G, and 16.2% C for W. scintillans. In all three cephalopod mt genomes, the NCRs are slightly richer in A and T than are other regions.

O. vulgaris, T. pacificus, and W. scintillans mt NCRs form stem-and-loop structures at both ends (data not shown). In addition, within their central regions, one or more stem-and-loop structures can also be formed. The NCRs of T. pacificus and W. scintillans mt genomes are well conserved (66.7% to 66.8%). Between the T. pacificus and W. scintillans mt genomes, the 5′ part of the NCRs is more conserved than the central region and the 3′ part. The L. bleekeri NCR is less similar to the W. scintillans and T. pacificus NCRs, but these three NCR sequences are easily aligned. The O. vulgaris NCR shares some characteristics in its primary sequence with squid NCRs, but alignment of the O. vulgaris NCR with squid NCRs is not easy. Near the 5′ end of the NCRs, a short sequence, 5′-TATATATAATAAACA-3′, is conserved between T. pacificus and W. scintillans, and two similar sequences, 5′-TGTATATAATACACG-3′ and 5′-TGTATATAATATACA-3′, are found near the 5′ ends of the L. bleekeri and O. vulgaris NCRs, respectively. In the 3′ half, all four cephalopod NCRs carry a C-rich tract. These shared characteristics among cephalopod NCRs possibly contribute to the functions of the NCRs in replication and transcription initiation. However, further studies are required to determine this for certain.

Southern Analysis for Confirmation of W. scintillans Gene Organization

To confirm gene duplication in the W. scintillans mt genome, Southern analysis was performed. The predicted lengths and locations of the restriction fragments of W. scintillans mtDNA and the predicted hybridization patterns are shown in figure 3A and B.

With the restriction enzymes that cut W. scintillans mtDNA once (Apa I and Xho I), all probes derived from cox1, cox3, nad2, nad3, cob, and rrnS hybridized to the same band (fig. 3C). As predicted (fig. 3A and B), the specific probe for cox1, which might be duplicated, hybridized to two distinct bands of W. scintillans DNA treated with either Bgl II, Nco I, or Pst I (fig. 3C). Again, as predicted (fig. 3A and B), the cox3-specific probe also hybridized to two distinct bands of W. scintillans DNA treated with either Nco I, Pst I, or Sal I (fig. 3C). Conversely, the other four probes hybridized to a single band (fig. 3C). All the hybridization patterns match the gene organization of the W. scintillans mt genome presented in figures 1 and 2A. Note that the W. scintillans DNA used for the Southern hybridization analysis was prepared from a different individual than that used for sequence determination. In addition, we could not find any bands in figure 3C other than those inferred from the determined nucleotide sequence. Therefore, the observed gene duplication appears to be a general feature of the W. scintillans mt genome rather than of a single individual. Thus, the gene organizations of the W. scintillans and T. pacificus mt genomes shown in figure 2 are supported.

Concerted Evolution

The sequences of the duplicated regions are nearly identical within each species, but there are large differences when comparing between W. scintillans and T. pacificus (table 2). From the analyses of duplicated noncoding regions in various metazoan mt genomes, it has been suggested that there are concerted evolution processes in metazoan mitochondria (cf. Kumazawa et al. 1998). Because the W. scintillans and T. pacificus mt genomes, which belong to different families, share the gene duplication, and the duplicated regions in the same species have nearly identical sequences, it should be concluded that concerted evolution processes exist within cephalopod mt genomes. Duplicated units in the W. scintillans and T. pacificus mt genomes are not placed in tandem, meaning that a simple slippage model cannot explain their sequence homogeneity; however, inter/intramolecular recombination might provide an adequate explanation. In certain molluscan species, two types of mtDNA (male and female types) coexist within a single cell (e.g., Zouros et al. 1994).
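The contrast drawn in this section (near-identical duplicates within a species, divergent copies between species) rests on pairwise comparison of aligned sequences. The sketch below shows one common way to compute percent identity; it is an assumption that identity is counted over ungapped columns, since the text does not state how its similarity values were calculated, and the function name and example strings are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity of two aligned sequences over columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
within_species = percent_identity("ACGTACGT", "ACGTACGA")
with_gap = percent_identity("AC-T", "ACGT")
```

Other conventions (for example, counting gapped columns as mismatches) give lower values, so any comparison of identity figures should use the same convention throughout.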
Some reports suggest the existence of intermolecular recombination (Ladoukakis and Zouros 2001), supporting our conclusion that recombination is a possible mechanism maintaining the homogeneity of the duplicated sequences in W. scintillans and T. pacificus mt genomes. Furthermore, recombination activity has also been reported in mammalian mitochondria (Thyagarajan, Padua, and Campbell 1996).

To understand the effect of gene duplication on the evolutionary rate, the relative rate test (RRTree [Robinson-Rechavi and Huchon 2000]) was performed for each protein gene at the amino acid sequence level. When the evolutionary rate is compared among O. vulgaris, L. bleekeri, T. pacificus, and W. scintillans (using K. tunicata as an outgroup), there are no significant differences in the evolutionary rate of any species pair (data not shown). When the evolutionary rate is compared among squid species (using O. vulgaris as an outgroup), only the hypothesis that T. pacificus and W. scintillans nad4 evolved at the same rate is statistically rejected (P < 0.01) (data not shown). These results suggest that neither the duplication of genes nor the concerted evolutionary process seriously affects the evolutionary rate of the genes themselves.

Conversely, recent independent duplications in the W. scintillans and T. pacificus mt genomes cannot be ruled out. This scenario would explain the occurrence of nearly identical sequences between duplicates in a single genome, but different sequences between the duplicated regions of different species. However, as we discuss later, for such a situation to be at all possible, at least two duplication events and several subsequent gene-loss events would have needed to take place independently, but in the same order, in both the T. pacificus and W. scintillans lineages. This extent of parallel evolution in the T. pacificus and W.
scintillans mt genomes requires far more assumptions than the hypothesis that concerted evolution maintains homogeneity of the duplicated sequences within a species. Therefore, we believe that the model of ancestral gene duplication in T. pacificus and W. scintillans mt genomes followed by a concerted evolution process that homogenized the duplicated sequences is more likely than the model of recent and independent gene duplications in both T. pacificus and W. scintillans mt genomes.

As shown in table 2, there are four differences between duplicates of the protein genes in the W. scintillans mt genome. These differences are found at the second position of codons. This means that the resultant proteins of these genes have different amino acid sequences, so the protein encoded by one copy could differ in its properties from the protein encoded by the other. However, all the substituted positions are in poorly conserved regions, suggesting that their effects on the proteins might not be serious. In addition, when the sequenced PCR clones were checked, nucleotide variations were found at different sites within the duplicated genes. The minor nucleotides of PCR clones at various positions in one of the duplicates are identical to the nucleotides at the corresponding positions in the other duplicate. The observed sequence differences between the copies of duplicated regions might reflect polymorphism, although the possibility of PCR errors cannot be ignored. Does this situation suggest relaxed functional constraints in the duplicated genes? To address this question, further analyses of the function of these duplicated genes are needed.

Evolution of Cephalopod mt Gene Arrangement

In this study and our previous study (Tomita et al. 2002), we have reported four complete cephalopod mt genome nucleotide sequences: L. bleekeri, W. scintillans, T. pacificus, and O. vulgaris. The O.
vulgaris mt genome appears to retain more of the ancestral gene organization than the other species because the O. vulgaris mt gene organization is nearly identical to that of the polyplacophoran K. tunicata (fig. 2). When the O. vulgaris and K. tunicata mt genomes are compared, one ancestral gene location and one derived gene location can be distinguished. In the case of the O. vulgaris mt genome, the location of trnD is the ancestral feature, whereas the direction of trnP is the derived feature. Conversely, in the K. tunicata mt genome, the direction of trnP is the ancestral feature, whereas the location of trnD is the derived feature. As Tomita et al. (2002) have discussed, a trnD gene located between cox2 and atp8 is found in the Littorina saxatilis (Mollusca, Gastropoda) mt genome (Wilding, Mill, and Grahame 1999) and various other metazoan mt genomes such as several arthropod mt genomes (e.g., Clary and Wolstenholme 1985; Lavrov, Boore, and Brown 2000). This finding suggests that the location of trnD in the O. vulgaris mt genome is ancestral to that in the K. tunicata mt genome. On the other hand, the direction of trnP seen in K. tunicata, rather than that observed in O. vulgaris, is found in these other mt genomes (e.g., Clary and Wolstenholme 1985; Lavrov, Boore, and Brown 2000), suggesting that it is ancestral to the direction observed in the O. vulgaris mt genome. Using a model in which slippage during replication causes gene duplication, after which random gene loss causes changes in gene organization, differences in the gene organization of K. tunicata and O. vulgaris mt genomes can be explained (fig. 4A). A tRNA-like cloverleaf structure (anticodon AGA) is found between cox2 and atp8 in the K. tunicata mt genome (Boore and Brown 1994). However, we found another possible tRNA-like structure in this region (positions 2810 to 2871, overlapping the tRNASer-like structure [anticodon AGA]).
The sequence of the cloverleaf structure is very similar to that of trnD (78% identity), although the anticodon loop of the former structure is composed of eight nucleotides (5′-TTATTTAA-3′) instead of seven nucleotides. Thus, the cloverleaf structure between cox2 and atp8 in the K. tunicata mt genome seems to be a pseudogene of trnD. Pseudogenes of trnD could be created during the rearrangement process, as in the case of the L. bleekeri mt genome, in which the creation of the trnH pseudogene by a similar means has been reported (Sasuga et al. 1999). Both W. scintillans and T. pacificus mt gene organizations could have originated from an Octopus-type mt gene organization through two gene-duplication events followed by the loss of one of the duplicated genes (fig. 4B). This process appears to be a typical sequence of events in the evolution of gene organization. The problem is that the W. scintillans and T. pacificus mt genomes carry six duplicated genes. Because there are only simple sequence differences between the duplicated regions, including those of the long NCRs in which the transcription initiation site can be located (table 2), both copies of each duplicated gene could be functional. As discussed above (see also fig. 4B), many processes (duplications and losses of genes and NCRs) are needed to create the W. scintillans and T. pacificus mt gene organizations from the Octopus-type mt gene organization. However, the position of trnM is the only difference between the W. scintillans and T. pacificus mt genomes. If the gene duplications and gene organization of the W. scintillans and T. pacificus mt genomes had arisen independently through parallel evolution, many independent but identical gene losses and gene rearrangements would have been needed, although in that case we would not have to presume the existence of a concerted evolution process.
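The duplication-then-loss model invoked above can be made concrete with a toy example. The following Python sketch is only an illustration under stated assumptions, not the analysis performed in this study: the gene labels g1 to g4 and the function names tandem_duplicate and lose_copies are hypothetical, and the gene order is not taken from any real mt genome. It shows how duplicating a block of genes and then losing one copy of each duplicated gene can change the gene order, and how duplicates remain when no copy is lost.

```python
def tandem_duplicate(order, start, end):
    """Return a new gene order in which order[start:end] is repeated in tandem."""
    segment = order[start:end]
    return order[:end] + segment + order[end:]


def lose_copies(order, keep):
    """Remove redundant copies of duplicated genes.

    keep maps a gene name to the 0-based index of the copy to retain.
    Genes that are absent from keep retain all of their copies
    (hypothetical convention used only in this sketch).
    """
    seen = {}
    result = []
    for gene in order:
        copy_index = seen.get(gene, 0)
        seen[gene] = copy_index + 1
        if gene not in keep or keep[gene] == copy_index:
            result.append(gene)
    return result


# Toy ancestral order (hypothetical labels, not a real mt genome).
ancestral = ["g1", "g2", "g3", "g4"]

# Step 1: tandem duplication of the block g2-g3.
duplicated = tandem_duplicate(ancestral, 1, 3)

# Step 2a: lose the first copy of g2 and the second copy of g3.
rearranged = lose_copies(duplicated, {"g2": 1, "g3": 0})

# Step 2b: lose nothing, so both copies of g2 and g3 persist.
retained = lose_copies(duplicated, {})
```

In this toy case, `rearranged` has the same gene content as `ancestral` but a different order, whereas `retained` still carries both copies of the duplicated block.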
For example, parallel evolution of mt gene organization has been suggested in bird mt genomes (Mindell, Sorenson, and Dimcheff 1998). However, we prefer the ancient duplication model to the parallel evolution model, because the former requires fewer gene rearrangement/gene loss events than the latter.

Why Are Duplicated Genes Maintained in the W. scintillans and T. pacificus mt Genomes?

Several researchers have pointed out that an mt genome with a duplicated functional long NCR has several advantages over one with only a single NCR (e.g., Kumazawa et al. 1998). Because the long NCR contains the initiation region for replication, mt genomes with multiple long NCRs can be replicated from multiple points, whereas those with only a single long NCR can only be replicated from one point. An mt genome with multiple replication origins might, therefore, be capable of faster replication, provided the duplicated region is not too long. In mitochondria, there are multiple copies of mtDNA, and there must be a certain degree of selection pressure among individual mtDNAs. Therefore, mt genomes with multiple replication origins might be expected to proliferate at the expense of those with only a single replication origin.

FIG. 3. Southern analysis of the W. scintillans mt genome. (A) Locations of probes (vertical arrows) and restriction fragments of the W. scintillans mt genome. (B) Expected results of the Southern hybridization experiment on the W. scintillans mt genome. Abbreviations for each gene are as follows. III: cox3; I: cox1; 3: nad3; 2: nad2; B: cob; and S: rrnS. Abbreviations for each restriction enzyme are as follows. C: control (noncut); A: Apa I; B: Bgl II; N: Nco I; P: Pst I; S: Spe I; and X: Xho I. (C) Southern hybridization. The results of hybridizations of various probes with W. scintillans total DNA are shown. The noncut DNA (C), Apa I-treated DNA (A), Bgl II-treated DNA (B), Nco I-treated DNA (N), Pst I-treated DNA (P), Spe I-treated DNA (S), and Xho I-treated DNA (X) were loaded for each gel. In the lower column for each hybridization image, the name of the probe is shown. "All" indicates that a mixture of probes for all six genes was used for hybridization.

Nucleotide Sequence Similarities (%) Between Duplicated Regions of W. scintillans and T. pacificus mt Genomes (columns: Wsc Copy 1 vs. 2; Tpa Copy 1 vs. 2)

Are there any advantages for maintaining duplicate copies of functional structural genes in W. scintillans and T. pacificus mt genomes? Among metazoan mt genomes for which complete nucleotide sequences have been published, there is only one (Venerupis philippinarum [AB065375]) that encodes duplicated protein genes (cox2). However, in this genome, the two cox2 genes have very different sequences. As we have shown above, this is not the case for the duplicate genes in W. scintillans and T. pacificus mt genomes. Suppose that each mt-encoded subunit of NADH dehydrogenase is synthesized in equal amounts and that the rate of translation of each NADH dehydrogenase subunit gene is also equal. If the two assumptions of equal transcription rate of genes and equal translation rate of NADH dehydrogenase genes are accepted, the duplicated genes (cox3, cox1, cox2, trnD, atp8, and atp6) should be transcribed twice as frequently as nad2, nad3, and other NADH dehydrogenase subunit genes. The duplicated protein genes in W. scintillans and T. pacificus mt genomes encode subunits of complex IV and complex V but not complex I and complex III. The transcription frequency of rRNA molecules is controlled differently from that of mRNA and tRNA molecules in vertebrate mitochondria (e.g., Clayton 1992). In addition to the transcript covering the entire genome, another transcript containing only srRNA, lrRNA, tRNAPhe, and tRNAVal is also synthesized.
Thus, in vertebrate mitochondria, the copy numbers of rRNAs are generally maintained at a higher level than those of tRNA and mRNA (Clayton 1992). A similar situation might exist for rRNA expression in the O. vulgaris mt genome. In the W. scintillans and T. pacificus mt genomes, the two rRNA genes are encoded at distinct positions. However, both rrnL and rrnS are flanked at both ends by tRNA gene(s), and the long NCRs are located upstream from the tRNA genes, which are next to the 5′ ends of rrnL and rrnS. Because the long NCRs have nearly identical sequences, the number of transcripts of rrnL and rrnS can be controlled in a similar manner.

Molecular Phylogenetic Analysis

Quartet-puzzling (QP) analysis with Tree-Puzzle (Schmidt et al. 2002) under the invariable and gamma model (mtREV model) shows monophyly of oegopsids (W. scintillans and T. pacificus) (fig. 5). L. bleekeri, a representative of the myopsids in this tree, is the sister group of the oegopsid squids, and O. vulgaris is the sister taxon of the decapods in this tree. In this tree, cephalopods form a monophyletic group with K. tunicata (Polyplacophora) and Inversidens (Bivalvia). However, statistical support for the monophyly of the Mollusca is not sufficiently high. On the other hand, Terebratulina retusa, a representative of the Brachiopoda, forms a group with the annelid species in this tree, although the statistical support for this group is not particularly high. The monophyly of the three groups (Mollusca, Annelida, and Brachiopoda) of the Lophotrochozoa is well supported in this tree, as in our previous analysis (Tomita et al. 2002), but the relationship among the major groups of Mollusca, Annelida, and Brachiopoda could not be resolved from the present data. When we performed an ML analysis on five molluscan species (K. tunicata [outgroup], O. vulgaris, L. bleekeri, T. pacificus, and W. scintillans) using CODEML in PAML (Yang 1997) (eight-class discrete gamma distribution model), we found that W.
scintillans and T. pacificus are monophyletic, and L. bleekeri is their sister taxon in the ML tree (lnL = −24077.52). The tree in which L. bleekeri and T. pacificus are monophyletic is the second-best tree (difference of lnL from the best tree ± SE: −5.49 ± 10.84), and this tree cannot be rejected on the basis of BP values, the Kishino-Hasegawa test (Kishino and Hasegawa 1989), the Shimodaira-Hasegawa test (Shimodaira and Hasegawa 1999), or differences in log-likelihood. On the other hand, in the ML tree constructed with Tree-Puzzle (one invariable and eight-class discrete gamma distribution model), L. bleekeri and T. pacificus appear as sister taxa (lnL = −23817.07). The best tree chosen by CODEML is the second tree (difference of lnL from the best tree ± SE: 12.08 ± 9.65), and it cannot be rejected. The relationship among W. scintillans, T. pacificus, and L. bleekeri could not be resolved by our analyses, although a closer relationship between W. scintillans and T. pacificus is suggested by the mt genome structures. This finding, in turn, suggests that the squid radiation occurred over a rather short period. In addition, the monophyly of the Mollusca could not be supported. The radiation of the Mollusca, Annelida, and Brachiopoda might also have occurred over a short period.

The T. pacificus and W. scintillans mt genomes are the first examples of metazoan mt genomes that have been found to have stably carried duplicated structural genes over a long period. Together with the L. bleekeri mt genome (Tomita et al. 2002), which carries a triplicated NCR with nearly identical sequences, these genomes lead us to conclude that cephalopod mitochondria, or at least squid mitochondria, have certain concerted evolutionary processes. Analyses of other cephalopod mt genome structures will tell us how complicated squid mt genomes really are. On the other hand, how such mt genomes with duplicated noncoding regions and structural genes maintain their functions, such as replication and transcription, is an unresolved but important issue. The means of replication, transcription, and other processes within cephalopod mt genomes provide an interesting target for further studies.

FIG. 4. Possible pathways for how W. scintillans and T. pacificus mt gene organizations originated from the ancestral gene organizations found in O. vulgaris and K. tunicata mt genomes. (A) Between O. vulgaris and K. tunicata mt genomes. (B) From the O. vulgaris type mt genome to the W. scintillans mt genome. In both models, one of the two ends of the duplicated unit is under the constraint of being in the noncoding region. The a and b denote the seven-tRNA gene cluster that appears upstream of the NCR and the five-tRNA gene cluster that appears between cox3 and nad3, respectively, in O. vulgaris and K. tunicata mt genomes, and their corresponding regions in other mt genomes such as the W. scintillans mt genome. Abbreviations of tRNA genes are as in figure 1. Abbreviations of protein and rRNA genes are as follows: C1 to C3 for cox1 to 3, CB for cob, A6 and A8 for atp6 and 8, N1 to N6 and NL for nad1 to 6 and nad4L, and LR and SR for rrnL and rrnS.

FIG. 5. ML analysis of 17 metazoans based on the concatenated amino acid sequences of 12 mt protein genes. The ML tree inferred with Tree-Puzzle is shown. Support for the internal branches of the quartet-puzzling tree topology (%) (left) and local bootstrap probability of the internal branches of the ML tree estimated with PROTML (%) (right) are shown at the node. The log-likelihood of the tree estimated with Tree-Puzzle is −49482.91. The log-likelihood of this tree (±SE) estimated with PROTML is −52542.78 (±696.28). The topologies of the trees obtained with these two methods are identical.

University of Pharmacy and Life Science for his valuable comments. This work was supported by grants to T.O. and S.Y. from the Ministry of Education, Culture, Sports, Science and Technology, Japan.

Naruya Saitou, Associate Editor



Shin-ichi Yokobori, Naoya Fukuda, Mitsue Nakamura, Tomoko Aoyama, Tairo Oshima. Long-Term Conservation of Six Duplicated Structural Genes in Cephalopod Mitochondrial Genomes, Molecular Biology and Evolution, 2004, 2034-2046, DOI: 10.1093/molbev/msh227