Multiple pathogenic and benign genomic rearrangements occur at a 35 kb duplication involving the NEMO and LAGE2 genes

Human Molecular Genetics, Oct 2001

The X-linked dominant and male-lethal disorder incontinentia pigmenti (IP) is caused by mutations in a gene called NEMO (IKK-γ). We recently reported the structure of NEMO and demonstrated that most IP patients carry an identical deletion that arises due to misalignment between repeats. Affected male abortuses with the IP deletion had provided clues that a second, incomplete copy of NEMO was present in the genome. We have now identified clones containing this truncated copy (ΔNEMO) and incorporated them into a previously constructed physical contig in distal Xq28. ΔNEMO maps 22 kb distal to NEMO and only contains exons 3–10, confirming our proposed model. A sequence of 26 kb 3′ of the NEMO coding sequence is also present in the same position relative to the ΔNEMO locus, bringing the total length of the duplication to 35.5 kb. The LAGE2 gene is also located within this duplicated region, and a similar but unique LAGE1 gene is located just distal to the duplicated loci. Mapping and sequence information indicated that the duplicated regions are in opposite orientation. Analysis of the great apes suggested that the NEMO/LAGE2 duplication occurred after divergence of the lineage leading to present day humans, chimpanzees and gorillas, ∼10–15 million years ago. Intriguingly, despite this substantial evolutionary history, only 22 single nucleotide differences exist between the two copies over the entire 35.5 kb, making the duplications >99% identical. This high sequence identity and the inverted orientations of the two copies, along with duplications of smaller internal sections within each copy, predispose this region to various genomic alterations. We detected four rearrangements that involved NEMO, ΔNEMO or LAGE1 and LAGE2. The high sequence similarity between the two NEMO/LAGE2 copies may be due to frequent gene conversion, as we have detected evidence of sequence transfer between them. 
Together, these data describe an unusual and complex genomic region that is susceptible to various types of pathogenic and polymorphic rearrangements, including the recurrent lethal deletion associated with IP.

Swaroop Aradhya (3), Tiziana Bardaro (2,3), Petra Galgóczy (1,3), Takanori Yamagata (0,3), Teresa Esposito (2,3), Henry Patlan (3), Alfredo Ciccodicola (2,3), Arnold Munnich (3,5), Sue Kenwrick (3,4), Matthias Platzer (1,3), Michele D'Urso (2,3), David L. Nelson (3)

0 Department of Pediatrics, Jichi Medical School, 3311-1 Yakushiji, Minamikawachi-machi, Tochigi 329-0433, Japan
1 Department of Genome Analysis, Institute of Molecular Biotechnology, Beutenbergstrasse 11, 07445 Jena, Germany
2 International Institute of Genetics and Biophysics (IIGB), Via G. Marconi 10, 80125 Naples, Italy
3 Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza 902E, Houston, TX 77030, USA
4 Wellcome Trust Centre for Molecular Mechanisms of Disease and University of Cambridge Department of Medicine, Addenbrooke's Hospital, Hills Road, Cambridge CB2 2XY, UK
5 Department of Genetics, Unité des Recherches sur les Handicaps Génétiques de l'Enfant INSERM-393, Hôpital Necker-Enfants Malades, 75015 Paris, France

Mutations in NEMO (IKBKG, IKK-γ) cause the X-linked dominant disorder, incontinentia pigmenti (IP) (1,2). This disorder is typically lethal in male individuals but female patients survive because cells expressing the mutant X chromosome are selectively eliminated. Thus, skewed X-inactivation is a characteristic of this disorder (3,4).

As a regulatory component of IκB kinase, NEMO is responsible for downstream activation of the NF-κB transcription factor. By inducing the transcription of various target genes, the NF-κB signaling pathway regulates immune and inflammatory reactions and prevents apoptosis (5,6). Disruption of NEMO or NF-κB renders cells susceptible to apoptosis, leading to the IP-associated male lethality and skewed X-inactivation in female patients (2).

Nearly 70–80% of IP mutations are accounted for by an identical deletion within NEMO, which eliminates exons 4–10 (7). This mutation arises due to misalignment between two identical MER67B sequences (termed int3h repeats); one copy is located in intron 3 and another 4 kb distal to the last exon of NEMO. When the recurrent IP deletion was first identified due to an aberrant fragment on a Southern blot, fragments of normal size were also present (2). This led us to propose that a second copy of NEMO (ΔNEMO) existed in the genome. In addition, PCR analysis of DNA samples from spontaneous male abortuses with the IP deletion yielded the expected amplification products from exon 2 to 3 and exon 3 to 4, but failed to amplify from exon 2 to 4. These observations supported a model that the second copy of NEMO was truncated, lacking the first four exons.

*To whom correspondence should be addressed. Tel: +1 713 798 4787; Fax: +1 713 798 5386. GenBank accession nos AF277315, AL596249 and AJ271718. The authors wish it to be known that, in their opinion, the first three authors should be regarded as joint First Authors.

The human genome contains numerous examples of gene duplications, several of which are involved in genomic disorders (8). Thus, it was conceivable that rearrangements could occur between the two NEMO copies and that such events may have a role in the genetics underlying IP or another human disease due to disruption of genes between them. Rearrangements would be especially likely if the two NEMO copies share significant homology, and a preliminary analysis of exons suggested complete identity between NEMO and ΔNEMO.

We recently constructed a high-density bacterial- and P1-artificial chromosome (BAC and PAC) contig to study the region between G6PD and Xqter (9). After another group mapped NEMO to Xq28, we sequenced the entire gene from a BAC clone in the contig and showed that it lies head-to-head with G6PD and is transcribed in the centromeric to telomeric direction (Fig.
1A and B) (2,10). The 23 kb NEMO gene contains 12 exons with three alternative primary exons that independently splice into exon 2, where the initiating ATG codon lies (GenBank accession no. AJ271718). In our initial BAC/PAC contig, a gap existed just distal to NEMO and efforts to close it with flanking probes had failed repeatedly. When the idea of a second copy of NEMO was proposed, we decided to search the gap region since duplicated copies of genes are often located close to the parent copy, as exemplified by the IDS gene in Xq28 (11).

This report describes how we identified ΔNEMO, when and how it originated, its current structure and the homology it shares with NEMO. The duplication boundaries were cloned and the entire region containing NEMO and ΔNEMO was sequenced. Interestingly, ΔNEMO was part of a larger duplication that originated during evolution of the great apes. The sequence, structure and evolution of this duplication provide significant insight into how the human genome evolves through mechanisms of structural alteration as well as sequence preservation.

Isolation of clones containing NEMO and ΔNEMO

We previously sequenced the NEMO gene (GenBank accession no. AJ271718) from BAC clone RP11-211L10 (Fig. 1A and B), including a 5 kb region downstream that contained the distal int3h copy, involved in mediating the recurrent IP deletion (2). When the existence of a second copy of NEMO was proposed, there were no clues to its genomic location. BAC and PAC clones containing NEMO had been previously isolated from the RP11 male BAC library, and the RP6 female and RP5 male PAC libraries. The clones from the RP11 and RP6 libraries were part of a larger physical contig (9), in which a gap existed between NEMO and DKC1. Screening the RP11 BAC library with the NEMO cDNA identified clones that contained ΔNEMO and that closed the gap (Fig. 1A). Multiple RP11 BAC clones (66O13, 196H18 and 515D14) contained ΔNEMO, and only one clone (103M23) included both NEMO and ΔNEMO.
The two RP5 clones (865E18 and 1087L19) were isolated separately for large-scale sequencing of the NEMO region and together they covered both NEMO and ΔNEMO (12).

Cloning the duplication boundaries

The model that ΔNEMO contained only exons 3–10 suggested that the 5′ boundary of the duplication would be within intron 2 (Fig. 1B and C). Thus, the 5′ breakpoint was detected by hybridizing the NEMO cDNA probe on a HindIII digest of clones containing NEMO, ΔNEMO or both (Fig. 1D). Examination of the 5′ boundary clone from ΔNEMO showed that one end of the clone was within intron 7, as expected, and the other end was upstream of a gene called LAGE1 (GenBank accession no. AJ223093) (Fig. 1C). This indicated that LAGE1 was telomeric to ΔNEMO, and that ΔNEMO was in an inverse orientation relative to NEMO (Fig. 1A and C).

The 3′ boundary was more difficult to detect because initially there was no sequence information 3′ of NEMO. Therefore, we walked distally from NEMO by cloning overlapping fragments and using them as probes to identify the 3′ duplication breakpoint. The 3′ boundary was in a 17 kb SpeI band in BAC clone RP11-211L10, and in a 28 kb SpeI fragment in clones containing ΔNEMO (Fig. 1D). Restriction analysis of clones from the probe walking experiments suggested a length of 35 kb for the duplication.

We concurrently sequenced RP5 clones 865E18 and 1087L19 (GenBank accession no. AF277315), and RP11 clones 211L10 and 196H18 (GenBank accession no. AL596249). The sequence in this region was initially misassembled with the ΔNEMO sequence overlapping that of NEMO and leaving a gap at the ΔNEMO locus. Once this assembly was rectified, the sequence data confirmed the boundary cloning results and the opposite orientations of the NEMO copies. The sequence also revealed that the 5′ boundaries of both copies were located near Alu elements and that the 3′ boundaries were within LTR repeats.

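The boundary mapping described above depends on comparing restriction fragment sizes between clones. As a rough illustration only, the following Python sketch performs an in silico digest of a linear sequence. The toy sequence and the function name `digest` are hypothetical and are not taken from this study; the recognition sites listed for HindIII, EcoRI, SpeI and KpnI are the standard ones for those enzymes.

```python
# Minimal in silico restriction digest (illustrative sketch, not the authors' method).
# Each entry: recognition site and the offset within the site after which the enzyme cuts.
ENZYMES = {
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "SpeI": ("ACTAGT", 1),     # A^CTAGT
    "KpnI": ("GGTACC", 5),     # GGTAC^C
}


def digest(seq, enzyme):
    """Return fragment lengths after cutting a linear sequence with one enzyme."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical toy sequence carrying two HindIII sites.
toy = "GGCC" * 5 + "AAGCTT" + "ATAT" * 10 + "AAGCTT" + "CG" * 8
fragments = digest(toy, "HindIII")
print(fragments)
```

Applied to real clone sequence, the same approach would predict the band sizes expected on a Southern blot, which could then be compared with the fragments actually observed.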
Since there was initially a gap at the ΔNEMO locus in our physical contig, we suspected that this region might be unstable. Therefore, to ensure that the clones used in elucidating the structures and sequence of this region were faithful to the genome, and had not rearranged, a KpnI digest was performed to detect two fragments that spanned the region between the two NEMO/LAGE2 copies. Analysis of BAC clones and normal human genomic DNA showed the expected fragment sizes in all samples with the SA63F/SA64R probe (Table 1), indicating that the clones were intact (data not shown). However, it was interesting that clones containing either NEMO or ΔNEMO terminated in the same region, near the 66O13F/R marker, suggesting that this region may indeed be unstable (Fig. 1A). The telomeric and centromeric end-sequences of RP11 clones 211L10 and 66O13, respectively, were located very close to each other between the duplicated regions.

Table 1 (fragment). Primer sequences: CTTGGCACATCACTTATCAG; GGATGGAGTATTGCAGCCTCTCGCC; GGTAGCTGGAACTGCATGTCTGGTGG. Purposes: map location of LAGE1; unique forward primers for NEMO and for ΔNEMO; probe for NEMO/ΔNEMO intron 3.

Structure and sequence of duplicated region

A BLAST search with the duplicated sequence identified the 3 kb LAGE2 gene (GenBank accession no. AJ275978). Hence, exons 3–10 of NEMO and the entire LAGE2 gene were both part of the duplication (Fig. 1C). The two copies derived from the duplication are hereafter termed NEMO/LAGE2 copies. In each NEMO/LAGE2 copy, the NEMO sequence occupied only 26% and LAGE2 accounted for 8%; the remaining 66% of sequence consisted of non-coding DNA, repeats and regulatory regions. Sequence comparison of the two copies revealed near complete conservation. Examining the coding regions of NEMO and LAGE2 between the two copies failed to reveal variations. However, outside the coding regions, 22 single nucleotide differences and two complex variations were found (Table 2).
The complex polymorphisms included the presence or absence of multiple nucleotides within repeats of single- or 10-base units. Thus, the degree of sequence identity between the NEMO/LAGE2 copies is >99%. Eight of the 22 single nucleotide differences were found either in RP5 clones or in RP11 clones but not in both libraries. Hence, these single nucleotide differences might be polymorphisms between individuals rather than between NEMO and ΔNEMO since the two libraries were prepared from different individuals. Eleven of the 22 single nucleotide differences were also within intronic or 3′-untranslated regions (3′-UTRs) of the NEMO sequence (Table 2). The duplications spanned a length of 35 470 bases, with the 3′ duplication junction located 26 306 bases from NEMO exon 10. The addition polymorphisms (nos 1, 2, 4, 6, 8, 9, 17, 19 and 23; Table 1) increased the length of the duplication to 35 518 bases. An intervening segment of 21 761 bases between the two copies did not appear to contain any genes.

Southern blot analysis of NEMO and LAGE genes in primates

DNA samples from humans and other great apes were analyzed by Southern blotting with probes for the NEMO and LAGE genes. All samples, except orangutan, showed the two expected bands that represent NEMO and ΔNEMO on a Southern blot hybridized with a NEMO exon 3 fragment (Fig. 1E). Orangutan only exhibited a fragment representing NEMO but not one corresponding to ΔNEMO. A second probe immediately telomeric to NEMO also confirmed the expected band size in all samples except orangutan, which showed a smaller band (data not shown). A LAGE probe designed to detect both copies of LAGE2 and the single copy of LAGE1 hybridized the expected fragments in human, bonobo chimpanzee and gorilla (Fig. 1E). The common chimpanzee had a normal LAGE2-containing fragment but showed a smaller band for LAGE1. Orangutan did not exhibit a band corresponding to LAGE2.

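The identity figure quoted for the two copies can be checked with simple arithmetic. The short Python sketch below uses only the duplication length (35 470 bases), the 22 single nucleotide differences and the 26%/8%/66% composition reported above; it deliberately ignores the two complex variations, which is a simplifying assumption, and the variable and function names are illustrative.

```python
# Back-of-envelope check of the reported sequence identity (illustrative sketch).
DUPLICATION_LENGTH_BP = 35_470   # length of each NEMO/LAGE2 copy reported in the text
SINGLE_BASE_DIFFERENCES = 22     # single nucleotide differences between the copies


def percent_identity(length_bp, mismatches):
    """Percentage of aligned positions that are identical (indels not counted)."""
    return 100.0 * (length_bp - mismatches) / length_bp


identity = percent_identity(DUPLICATION_LENGTH_BP, SINGLE_BASE_DIFFERENCES)
print(f"{identity:.2f}% identical")

# Approximate share of each copy, in bases, from the percentages given in the text.
composition = {"NEMO": 0.26, "LAGE2": 0.08, "other": 0.66}
for name, fraction in composition.items():
    print(name, round(fraction * DUPLICATION_LENGTH_BP), "bp")
```

Under these assumptions the identity comes out at roughly 99.94%, consistent with the ">99%" stated in the text.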
Sequence exchange between the NEMO/LAGE2 copies

Since the NEMO/LAGE2 copies are in opposite orientation, we hypothesized that inversions might be responsible for their sequence homogeneity. Alternatively, gene conversion and recombination were considered contributing factors. To detect evidence for sequence exchange, we used a single nucleotide difference between the two copies (Table 1; NEMO intron 4), which created a DrdI site (Fig. 2). Analysis of RP5 and RP11 clones showed that one copy contained the DrdI site and the other did not. We hypothesized that if sequence exchange occurred between NEMO and ΔNEMO close to the 5′ duplication boundary, the DrdI site should shift back and forth between them. Moreover, observation of the same allele (presence or absence of the DrdI site) in both NEMO and ΔNEMO at a high frequency would support the case for gene conversion.

Using genomic DNA from 10 normal male controls, we amplified fragments from both NEMO and ΔNEMO by PCR between intron 5 and unique sequences outside the NEMO/LAGE2 copies. Digestion of these products with DrdI yielded three combinations (Fig. 2). Six samples had the DrdI polymorphism in both NEMO and ΔNEMO, one sample lacked the DrdI site in both copies, and three samples had the DrdI site in only one of the two copies, always the same one; the site was never observed in the other copy alone. This observation of the DrdI polymorphism in various combinations within NEMO and ΔNEMO indicated that sequence exchange occurred between them.

NEMO internal (int3h-mediated) deletions

The most common mutation in IP patients is an int3h-mediated deletion within NEMO (2,7). Since ΔNEMO also contains two copies of int3h, we examined control individuals in an effort to find int3h deletions in ΔNEMO; these deletions were predicted to be non-pathogenic, and thus polymorphic, since ΔNEMO seemed to lack an obvious promoter and was considered non-functional.

A probe (SA54F/SA55R) was designed to look for alterations of a normal 13.8 kb HindIII fragment on a Southern blot to an ∼10 kb fragment, indicative of a ΔNEMOint3h deletion. Interestingly, analysis of 53 normal individuals (98 total X chromosomes) did not reveal this deletion (data not shown). However, while testing IP patients for the recurrent NEMOint3h deletion with an intron 3 probe, we observed an extra 3.1 kb EcoRI band in DNA from two patients (XL203-01 and XL306-02), in addition to the aberrant 2.7 kb band representing the IP deletion (Fig. 3A). A third patient, XL233-01, also showed the 3.1 kb band but had a point mutation in NEMO instead of the IP deletion (7).

Since the intron 3 probe detected both NEMO and ΔNEMO, and because an int3h deletion in ΔNEMO was expected to produce a 3.1 kb EcoRI fragment, we tried long-range PCR analysis on DNA from all three patients using primers SA65F and NEMO3-R1 (Fig. 3B). Successful PCR amplification across the deletion junction and subsequent sequence verification of the PCR product (data not shown) confirmed that the 3.1 kb fragment was due to a ΔNEMOint3h deletion, identical to the recurrent IP deletion in NEMO. Examination of over 100 unrelated female IP patients (more than 200 X chromosomes) revealed this rearrangement in only the three families mentioned above, emphasizing that it was rare. Notably, the aberrant band segregated with disease in all three IP families (data not shown), and there was no clinical variation between IP patients carrying only the NEMOint3h deletion and those with both the NEMOint3h and ΔNEMOint3h deletions.

Table 1 (fragment, continued). Purposes: verify NEMO int3h deletion; probe for KpnI fragments between NEMO and ΔNEMO; verify LAGE1–LAGE2A inversion; prepare probe telomeric to NEMO; hybridize LAGE1 and LAGE2 genes; isolate clones with NEMO/ΔNEMO; exon 3 probe for NEMO/ΔNEMO.

Table 2. Single base differences and complex polymorphisms between NEMO/LAGE2 copies

Single base differences between NEMO/LAGE2 copies (base change with flanking sequence):
GGTTCAGCCCTC G/A AGGCCTGCTTGC
TTAGGAGGCATT C/T TGGGGGCCCCGA
CCCAGCACAGTA A/G GCGGTCAAGGTG
GCACTTGGGGCA G/C CCAGCAGGGCAG
CCCCTTCCCCTG A/G CTTCCAGGTCTC
GGCCGCACCGCA G/T GGTCTGTGGTTC
CACTGGGGCTCT −/+ (T) AGGGCTGGCCTT
ATGCCGTGGTAG C/T GGCGGCTCCTGG
CCCGCCTGCCTA G/A CCCAGGATGAAG
GAGCTGGGTGGC A/C GCTCTTCCTCCC
CGACCCGCCCGC T/C GCTGTGCCCTGG
CACTGCAGCCTT C/G ACCTCCTGAGCT
TTACTGCTTTGA C/G TTTGGAGTCGTC
TCCCCAGCACCC A/G GGCCTTCCTTCC
GACCTTTCCCCT C/T CTTCAAGCCAGG
TCACTGCAACTT A/C CGCCTCCAGGGT
GGATGGGGCGTG G/A GATGACGGTTCG
GCAATGCTCTTA T/C GGCAGTGCCCCACCC
AGGAATAATGGC C/T CTTCCTGCCGGC
TAACACCAGATG C/T GGACTAGTGTGG
CGAGGGAGTGGA A/G TAAGGTGGGAAT
AATTGGATTCGG C/T CAACCCTAGGCA

Complex polymorphisms between NEMO/LAGE2 copies:
GTGTGTGTGTGT −/+ (GTGT) ATTTTTTTTTTTTTTTTT −/+ (T) GAGACAGAGTTTTGCTCTTCT
CTTCTTTCCCTC −/+ (TCTTCGTTCCTTCCTCCCTTCCTTCCTTCCCTCCTTCCTTTT) TTCCTTATTCCTTCCTTTCCT

Location within duplication (b), as recovered: N/ΔN intron 3; N/ΔN intron 4 (four entries); N/ΔN intron 5; N/ΔN intron 6; N/ΔN intron 9; N/ΔN exon 10 3′-UTR (three entries).

Restriction site differences, as recovered: G-BslI, BsaJI; A-DrdI (c). −(T)-BfaI; +(T)-DdeI, MwoI. C-AciI, EaeI, HaeIII; A-TseI. A-TseI, AluI; C-EaeI, HaeIII. A-BstNI, PspGI; G-XmaI, HpaII. C-HaeIII; T-MboI, EarI, SapI.

Positions refer to bases in GenBank entry AF277315.
(a) Indicates that polymorphism may represent difference between two individuals rather than between the NEMO/LAGE2 copies because the base alteration was found in a clone from one genomic library (RP5) but not in another clone from a different library (RP11). All of these base substitutions were found in the telomeric copy, except complex polymorphism no. 23, which was in the centromeric copy.
(b) Location indicates position where base alteration was discovered; N/ΔN, NEMO or ΔNEMO; intergenic, between NEMO coding sequence and the adjacent LAGE2 gene sequence in each copy.
(c) DrdI polymorphism used to test for sequence transfer between NEMO/LAGE2 copies (Fig. 2).

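The counts reported in the two preceding subsections can be tallied in a few lines of Python. This is a sketch under stated assumptions: the labels "site"/"no site" and all function names are illustrative, the split of 53 individuals into females and males assumes each person carries either two or one X chromosomes, and the upper bound is a generic one-sided exact binomial bound for zero observed events, which is not a statistic reported in the paper.

```python
from collections import Counter

# DrdI genotypes in the 10 control males, written as (one copy, other copy).
# Which copy carries the site in the discordant samples is left unlabeled here.
drdi_samples = (
    [("site", "site")] * 6
    + [("no site", "no site")] * 1
    + [("site", "no site")] * 3
)
genotype_classes = Counter(drdi_samples)
concordant = sum(n for (a, b), n in genotype_classes.items() if a == b)


def x_chromosome_split(individuals, x_chromosomes):
    """Infer (females, males), assuming females carry two X chromosomes and males one."""
    females = x_chromosomes - individuals
    males = individuals - females
    return females, males


def upper_bound_zero_observed(n, alpha=0.05):
    """One-sided exact binomial upper bound on a frequency when 0 of n carry the event."""
    return 1 - alpha ** (1 / n)


females, males = x_chromosome_split(53, 98)
bound = upper_bound_zero_observed(98)
print(genotype_classes, concordant, females, males, round(bound, 3))
```

Seven of the ten samples carry the same DrdI allele in both copies, which is the kind of concordance the text takes as support for gene conversion. The bound illustrates that finding no deletion among 98 control X chromosomes still leaves room for a carrier frequency of up to about 3%.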
The int3h duplication

Given that the int3h repeats are oriented in the same direction and deletions are observed between them, we expected to find the reciprocal int3h duplications as well. An EcoRI digest probed with NEMO intron 3 revealed an extra 26 kb band in one IP patient (XL206-03) who also demonstrated the 2.7 kb aberrant band representative of the recurrent IP deletion (Fig. 3A). The available sequence data suggested that duplication of one of the int3h copies would yield a 26 kb fragment. Thus, using PCR primers SA25F and INT3H-R2B, we successfully amplified across the duplication junction and confirmed the result by sequencing (data not shown). Since the int3h repeats in the normal NEMO gene are located in intron 3 and distal to the last exon (exon 10), the int3h duplication replicates exons 4–10 downstream of the normal NEMO exon 10 (Fig. 3B).

Southern blot and PCR analyses both showed that patient XL206-03 inherited the duplication from her clinically normal father (XL206-01), indicating that this rearrangement was a polymorphism (Fig. 3C). Only patient XL206-03 and one control female individual exhibited this rearrangement from among more than 150 female IP and 48 normal unrelated individuals (approximately 400 total X chromosomes) tested, suggesting that it was very rare. We have not attempted to define whether the int3h duplication was within NEMO or ΔNEMO since the rearrangement would result in identical fragment lengths at both locations.

LAGE1–LAGE2A inversion

While examining 53 individuals (98 X chromosomes) for int3h deletions in ΔNEMO, DNA from one female IP patient (XL328-04) revealed an 18 kb HindIII fragment, in addition to the normal 14 kb band on a Southern blot (Fig. 3A, right). Since the hybridizing probe also detected LAGE1, the available sequence supported a hypothesis that an inversion between the near-identical first exons of LAGE1 and LAGE2A would produce an 18 kb HindIII fragment.
A LAGE1–LAGE2A inversion would place NEMO adjacent to LAGE1 at the centromeric inversion junction and place LAGE2A upstream of NEMO at the telomeric end (Fig. 3B). Successful PCR amplification across the inversion junction with primers SA81F and SA82R, and confirmatory sequencing, in a DNA sample from patient XL328-04 confirmed this model (Fig. 3C). Unaffected relatives of IP patient XL328-04 also had the inversion, indicating that it represented a polymorphism. However, since it was found in only one out of 98 X chromosomes examined, it was rather rare.

When we initially identified the recurrent deletion in IP, we proposed that a second, truncated copy of NEMO was present in the genome (2). This work confirms our hypothesis that ΔNEMO was located telomeric to NEMO and contained only exons 3–10. A section of NEMO was duplicated along with LAGE2 10–15 million years ago. The two copies derived from the duplication are in opposite orientation, are nearly identical, and are separated by 22 kb of non-duplicated sequence. We have detected evidence for sequence exchange between the two copies, possibly pointing to a mechanism for maintaining the high sequence identity between them. In addition, various rearrangements arise due to sequence homology between and within the two copies. We detected four different rearrangements, including the recurrent and lethal IP-associated deletion (2).

The NEMO/LAGE2 duplication was likely mediated by repeats since SINE and LINE sequences are located at its boundaries. These repeat sequences have been known to play a role in duplications and deletions of specific genomic regions (13–15). The int3h loci and both LAGE genes are also flanked by repeats. Our data suggest that the current structure of the NEMO–LAGE1 genomic region evolved from two distinct events. The first event produced LAGE2 from LAGE1, since orangutan only contains LAGE1.
Sequence comparison clearly shows resemblance between the two LAGE genes, particularly within coding regions. This first event is somewhat reminiscent of the duplicative evolution of the related MAGE genes, also in Xq28 (16). The original int3h sequence likely replicated within the ancestral NEMO around the same time as, or before, the LAGE1–LAGE2 duplication. The second event, mediated again by repeats, duplicated part of NEMO and the entire LAGE2 gene together. Inversion of one of the copies may have occurred as a third step or taken place in conjunction with the NEMO/LAGE2 duplication event itself.

Insertion of the second copy, ΔNEMO/LAGE2, just telomeric to the first copy might have been facilitated by genomic instability in this region. The interval between the two NEMO/LAGE2 loci appears to be unstable, since a high-density clone contig in this region initially contained a gap that was eventually closed but with relatively sparse coverage (9). Most of the clones in the map also end between the two copies, with very few containing both loci. It is not unprecedented to find genomic instability associated with gene rearrangements. For instance, the XAP135 pseudogene is also inserted at an apparently unstable location near the int22h-2 repeat, which lies 200 kb from the NEMO/LAGE2 loci and is involved in some hemophilia A inversions (9).

Many duplicated regions in the genome have originated during evolution of the great apes. For example, the Charcot–Marie–Tooth disease-linked CMT1A-REP repeats originated before the lineage leading to chimpanzees and humans (17). The F8C-associated int22h repeat was duplicated prior to orangutan speciation, but a third copy originated in the common ancestor to gorilla, chimpanzee and human (9). Similarly, the NEMO/LAGE2 duplication occurred before the gorilla–chimpanzee–human lineage.
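The timing argument rests on which species show the band for the second copy on the Southern blot. The Python sketch below makes that reasoning explicit under a simple single-origin assumption. The branching order of the great apes is standard, but the function name and data layout are hypothetical, and bonobo and common chimpanzee are collapsed into one entry for brevity.

```python
# Species ordered from earliest to latest divergence from the human lineage.
DIVERGENCE_ORDER = ["orangutan", "gorilla", "chimpanzee", "human"]

# Presence of the band for the second (truncated) NEMO copy, as reported for Fig. 1E.
second_copy_band = {
    "orangutan": False,
    "gorilla": True,
    "chimpanzee": True,
    "human": True,
}


def origin_window(order, present):
    """Bracket a single-origin event between the last lineage lacking the feature
    and the first lineage carrying it. Raises if one origin cannot explain the data."""
    absent = [s for s in order if not present[s]]
    carriers = [s for s in order if present[s]]
    if absent and carriers and order.index(absent[-1]) > order.index(carriers[0]):
        raise ValueError("pattern requires more than one gain or loss")
    return (absent[-1] if absent else None, carriers[0] if carriers else None)


window = origin_window(DIVERGENCE_ORDER, second_copy_band)
print(window)
```

The result brackets the duplication between the split of the orangutan lineage and the split of the gorilla lineage, in line with the interpretation given in the text.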
Although both coding and non-coding regions of the genome have undergone duplication, it is interesting that many of them are associated with human diseases. Thus, while duplications predispose genomic regions to both pathogenic intrachromosomal and interchromosomal rearrangements, they have not been selected against, possibly because of some evolutionary advantage. Though functional divergence is often thought to be the outcome of genomic duplications (18), another previously unexplored possibility is that multiple copies of a gene offer a means to prevent sequence alterations of the parent copy through mechanisms such as gene conversion.

The remarkable sequence identity between the two copies of NEMO/LAGE2 is an unusual occurrence in the genome. However, a few repeat loci are similar to the NEMO/LAGE2 sequences (8,19), including the 9.5 kb int22h and the 24 kb CMT1A-REP loci, which have maintained significant (>98%) sequence homology between copies. The Hunter syndrome-associated IDS gene has also been partially duplicated nearby and both copies share >88% identity (11,20,21). Gene conversion has been proposed as a sequence conserving mechanism in all these cases, and supporting evidence has been presented for both the CMT1A-REP repeats and the IDS loci (15,21,22). Our finding that the DrdI polymorphism can be present in either copy of NEMO provides similar evidence for sequence exchange. However, we also considered that inversions between NEMO and ΔNEMO might help maintain sequence identity; base changes in ΔNEMO that are transferred to NEMO by inversion would likely be eliminated from the gene pool since most sequence alterations in NEMO are lethal (7). However, we had not anticipated that NEMO would only account for 26% of the entire duplication. Thus, inversions are less plausible but cannot be excluded; an inversion between the NEMO/LAGE2 copies would likely be non-pathogenic since all coding regions would be reconstituted.
This scenario is not unprecedented; two 11.3 kb, inversely oriented repeats with >99% homology flank the FLN1 and EMD loci just upstream of NEMO, and they facilitate frequent inversion of the 48 kb intervening region (23).

Another reason for the sequence conservation between the two copies might be that the ΔNEMO/LAGE2 copy has important biological significance. Since it lacks a promoter and the first four exons, we had presumed that ΔNEMO was not functional. Therefore, it was puzzling that we could not detect this potentially polymorphic ΔNEMOint3h deletion in nearly 100 normal X chromosomes but found it in three unrelated IP patients and in all of their affected relatives. This suggested that ΔNEMO might have a role in the pathogenesis of IP.

These findings have important implications for studying IP and NEMO. Diagnostic testing for IP patients is complicated by the requirement of determining which NEMO copy contains the mutation. We currently use a Southern blot probe unique to NEMO to detect the recurrent deletion that accounts for the majority (70–80%) of IP mutations (7). The remaining patients have smaller exonic mutations for which long-range PCR has to be used to verify that they are at the NEMO locus and not at ΔNEMO. A few remaining patients have failed to exhibit mutations in NEMO, and these patients could potentially have alterations of ΔNEMO if this locus has any functional significance. We are currently investigating the function of ΔNEMO, although a preliminary database survey has failed to reveal evidence for an exclusive ΔNEMO-linked transcript, lacking exons 1 and 2. In addition, a previously described male IP patient has shown only an exon 10 mutation in NEMO by RT–PCR without the concurrent presence of a normal transcript sequence that might be derived from ΔNEMO (2,24,25). Thus, ΔNEMO does not appear to produce a transcript, but this possibility cannot be completely ruled out.
We have tested the idea that ΔNEMO could be spliced onto the LAGE1 transcript due to their relative positions to each other. However, RT–PCR on lymphoblast-derived RNA did not yield positive results, possibly because LAGE1 is not expressed in this tissue. Therefore, other tissues that express this gene, such as breast, skin, placenta and uterus, need to be examined.

Unlike the int3h deletions in NEMO and ΔNEMO, the LAGE1–LAGE2A inversion appears to be non-pathogenic since it was found in a normal individual. This is not surprising since all of the genes involved are preserved. LAGE1 is separated from its promoter but is apparently functional with a LAGE2 promoter, probably because the two genes are evolutionarily related. The LAGE genes, similar to the MAGE, BAGE and GAGE genes, are expressed as antigens in various tumors and may cause disease phenotypes if disrupted (26–28). Similar to the LAGE1–LAGE2A inversion, the int3h duplication preserves normal gene sequences, although one might speculate that the duplicated exons 4–10 of NEMO might be spliced onto the end of an authentic NEMO transcript to create a larger-than-normal NEMO protein. However, it is not likely that the int3h duplication leads to an abnormal NEMO protein, since the C-terminal end of NEMO is known to be very sensitive to alterations and its disruption leads to either lethal or variant forms of IP (2,29–31). In support of this, we found the int3h duplication in an unaffected member of an IP family and in one normal female individual.

An interesting aspect of genomic rearrangements on the X chromosome is their parental origin. For instance, the inversion predominantly seen in hemophilia A patients occurs exclusively in the paternal germline (32,33). This is attributed to the fact that the X chromosome remains unpaired during male meiosis.
We have previously reported that the common IP deletion also shows a bias towards paternal origin but, unlike the hemophilia A inversion, it is seen in female meioses as well (7,34). However, the relative distances between the repeats involved in each of these genomic disorders are very different. The int3h repeats associated with the IP deletion are separated by 11 kb, whereas the int22h repeats that predispose to the hemophilia A inversion are >100 kb apart. In this respect, it is likely that the int3h duplication also shows the same frequencies of parental origin as the int3h deletion. In contrast, the LAGE1–LAGE2A inversion occurs between identical sequences separated by nearly 68 kb and thus is more reminiscent of the hemophilia A inversion and might show a paternal origin bias. Unfortunately, the parental origin of the LAGE1–LAGE2A inversion could not be evaluated due to the lack of DNA samples from patient XL328-04's parents.
The NEMO/LAGE2 duplication is complex compared to other reported genomic duplications because of additional internal sections of perfect homology within each copy (i.e. the int3h repeats and the LAGE genes). One might expect other types of alterations in addition to those reported here; therefore, a thorough analysis of the NEMO/ΔNEMO region in normal individuals would be interesting. Moreover, the non-pathogenic int3h duplication and the LAGE1–LAGE2A inversion could be transmitted to offspring in whom additional, novel rearrangements might occur. For example, the LAGE1–LAGE2A inversion places the two NEMO/LAGE2 copies in tandem and in the same orientation, consequently predisposing one copy to deletion. As mentioned earlier, a non-pathogenic inversion between the NEMO/LAGE2 copies would represent a fifth type of rearrangement, but we have not found this yet. If it exists, this rearrangement might be detectable by pulsed-field gel electrophoresis, but it would be difficult because of the scarcity of appropriate restriction sites.
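The dependence of the outcome on repeat orientation can be illustrated with a toy model. The sketch below is not the authors' analysis: it assumes a simple intrachromosomal exchange between two copies of one repeat, and the segment labels (A, R, B, C) are hypothetical placeholders rather than mapped loci. Under these assumptions, exchange between inverted copies flips the intervening segment, whereas exchange between directly oriented copies removes the intervening segment together with one repeat copy.

```python
from typing import List, Tuple

# A chromosome is modelled as an ordered list of (label, orientation) segments.
# Orientation is +1 (forward) or -1 (reverse). All labels are hypothetical.
Segment = Tuple[str, int]


def recombine(chrom: List[Segment], i: int, j: int) -> List[Segment]:
    """Simulate intrachromosomal exchange between repeat copies at indices i < j."""
    if not 0 <= i < j < len(chrom):
        raise ValueError("need 0 <= i < j < len(chrom)")
    (name_i, ori_i), (name_j, ori_j) = chrom[i], chrom[j]
    if name_i != name_j:
        raise ValueError("segments are not copies of the same repeat")
    if ori_i == ori_j:
        # Direct repeats: the intervening segment and one repeat copy are lost.
        return chrom[: i + 1] + chrom[j + 1:]
    # Inverted repeats: the intervening segment is reversed and each part flipped.
    flipped = [(name, -ori) for name, ori in reversed(chrom[i + 1: j])]
    return chrom[: i + 1] + flipped + chrom[j:]


inverted = [("A", 1), ("R", 1), ("B", 1), ("R", -1), ("C", 1)]
direct = [("A", 1), ("R", 1), ("B", 1), ("R", 1), ("C", 1)]

print(recombine(inverted, 1, 3))  # segment B is inverted, nothing is lost
print(recombine(direct, 1, 3))    # segment B and one R copy are deleted
```

In this simplified picture, an inversion that converts two inverted copies into tandem, same-orientation copies changes which of the two branches applies to any later exchange between them.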
Finally, we have recently detected three novel restriction fragments on a diagnostic Southern blot intended to identify the common IP deletion. These unusual fragments are currently under investigation and likely represent new types of rearrangements at the NEMO/LAGE2 duplication that will certainly provide greater insight into the dynamic nature of this region. Collectively, this work suggests that genomic rearrangements may be more common than expected and may account for significant polymorphism in a general population. We currently lack efficient computational tools to detect large-scale sequence homologies that undergo benign rearrangements, which can appropriately be called genomic polymorphisms, in contrast to smaller single-nucleotide or short tandem repeat polymorphisms. With the completion of the Human Genome Project, sequence analysis tools will likely demonstrate that an appreciable proportion of our genome has evolved from duplication.

MATERIALS AND METHODS
Isolation of BAC clones containing NEMO and ΔNEMO
A contig had been constructed previously by screening the Genome Systems female BAC library, the RP11 male BAC library and the RP6 female PAC library with several probes in distal Xq28 (9). To enrich for clones near NEMO, the RP11 male BAC and RP6 female PAC libraries were screened with the NEMO cDNA and the sWXD1332 marker. Additional clones were isolated from the RP5 male PAC library for sequencing the region between NEMO and LAGE1 (12).
Probe walking to find duplication boundary
To detect the 3′ duplication boundary, DNA samples of RP11 BAC clones 211L10, 103M23, 66O13 and 515D14 were digested with various enzymes and transferred to a nylon membrane. The filter was prehybridized for 4 h and hybridized at 65°C with the appropriate probe overnight, starting with probe A (primers NEMO-10F and SA22R). The filter was washed to a final stringency of 1× SSC/0.1% SDS at 65°C and autoradiographed for 30 min to 2 h.
To subclone overlapping fragments, a shotgun library was made with the appropriate enzyme from RP11-211L10 in a pBlueScript vector and hybridized with the relevant probe. Restriction digests confirmed the identity of positive colonies. The target bands from a restriction digest were gel-purified to extract successive probes from isolated clones.
Southern blot analysis
To analyze for the presence of ΔNEMO and determine the genomic structure between the duplicated copies, 5 µg of DNA was used for the following: human male and female (Homo sapiens, HSA), two male common chimpanzees (Pan troglodytes, PTR), one male bonobo chimpanzee (Pan paniscus, PPA), one male gorilla (Gorilla gorilla, GGO), and two male orangutans (Pongo pygmaeus, PPY). DNA samples were digested with appropriate restriction enzymes overnight and electrophoresed on a 0.8% agarose gel for 24 h at 75 V. Following overnight transfer, the blots were hybridized with appropriate probes overnight at 65°C and washed to a stringency of 2× SSC/0.1% SDS.
Sequence generation and analysis
PAC clones RP5-865E18 and 1087L19 were sequenced as described previously (12). The GenBank accession no. is AF277315. The region between NEMO and LAGE1 was also sequenced from BAC clones RP11-211L10 and 196H18 (GenBank accession no. AL596249). The sequences of both BAC clones were determined by a combined strategy of shotgun sequencing of M13 subclones and long-range genomic PCR products. BAC DNA was sonicated and the ends repaired with T4 DNA polymerase. Fragments of 1–2 kb were fractionated from the mixture by agarose gel electrophoresis and subcloned into M13mp18 vector prepared by digestion with SmaI. In total, 1800 subclones from BAC 196H18 and 1100 from BAC 211L10 were randomly selected and sequenced using Big-dye Terminator reactions on Applied Biosystems 3100 and 377 PRISM automated sequencers. Sequence traces were assembled using Applied Biosystems FACTURA and INHERIT programs.
These programs simultaneously assemble sequence files and facilitate subsequent editing to obtain a consensus sequence. Several smaller regions within the NEMO/LAGE2 duplication that posed difficulties due to nucleotide differences between the two copies were sequenced directly from RP11-211L10 (containing NEMO) and from RP6-172D5 (containing ΔNEMO). PCR products were purified using a purification kit (Qiagen) and submitted to SeqWright (Houston, TX) for fluorescent sequencing. All sequences, including GenBank entries, were analyzed with MacVector (Oxford Molecular Group, Cambridge, UK).
Sequence exchange detection: the DrdI assay
To detect sequence exchange between the two NEMO/LAGE2 copies, a long-range PCR and digestion assay was used. A DrdI polymorphism (Table 1, no. 6) was present in intron 4 of NEMO. Specific PCR products of 5 kb were amplified from NEMO with primer SA15F and, separately, from ΔNEMO with primer SA65F. Both forward primers were used with the same reverse primer, NEMO-5R. The long-range PCR was done at 62°C annealing temperature in a 50 µl reaction with the EXPAND PCR kit (Roche). After verifying the correct product size, 25 µl of the PCR sample was digested with 1 U of DrdI for 4 h. The digest was electrophoresed on a 1% agarose gel and photographed under UV light.
Genomic DNA samples
Blood samples for IP patients were obtained through an IRB-approved protocol, and DNA was extracted from these samples using conventional salt precipitation techniques. Genomic DNA samples from the three non-human great apes were isolated from lymphoblast cell lines maintained by D.L. Nelson.

ACKNOWLEDGEMENTS
We thank Kerry L. Wright for editing the manuscript. Evelyn Michaelis and Hella Ludewig provided excellent technical assistance. This work was supported by NIH grants 5 R01 HD35617 and 2 P30 HD24064 to D.L.N., Telethon-Italy grant E0927 to M.D. and German BMBF (BEO 0311108/0) and European Commission (BMH4-CT96-0338) grants to M.P.
NOTE ADDED IN PROOF
We recently re-evaluated the sequences of the two NEMO/LAGE2 copies to a greater depth and found that the number of base differences between them is around 45 to 50 (instead of the 22 we have listed in this paper). However, the sequence identity between the two copies still exceeds 99%, and all of the points in this paper remain valid.
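The claim that identity still exceeds 99% can be checked with simple arithmetic. The sketch below assumes that every difference is a single-base substitution and that both copies span the full 35.5 kb reported for the duplication; the function name is illustrative only.

```python
DUPLICATION_LENGTH_BP = 35_500  # total length of the duplication reported in the paper


def percent_identity(differences: int, length_bp: int = DUPLICATION_LENGTH_BP) -> float:
    """Percent identity, assuming each difference is a single-base substitution."""
    if length_bp <= 0 or not 0 <= differences <= length_bp:
        raise ValueError("differences must lie between 0 and length_bp")
    return 100.0 * (1.0 - differences / length_bp)


# 22 is the originally reported count; 45 and 50 bracket the revised estimate.
for n in (22, 45, 50):
    print(f"{n} differences over 35.5 kb: {percent_identity(n):.2f}% identity")
```

Under these assumptions the two copies would have to differ at more than 355 positions before identity fell below 99%, so the revised count leaves the conclusion unchanged.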


Swaroop Aradhya, Tiziana Bardaro, Petra Galgóczy, Takanori Yamagata, Teresa Esposito, Henry Patlan, Alfredo Ciccodicola, Arnold Munnich, Sue Kenwrick, Matthias Platzer, Michele D’Urso, David L. Nelson. Multiple pathogenic and benign genomic rearrangements occur at a 35 kb duplication involving the NEMO and LAGE2 genes, Human Molecular Genetics, 2001, 2557-2567, DOI: 10.1093/hmg/10.22.2557