Major influence of repetitive elements on disease-associated copy number variants (CNVs)
Cardoso et al. Human Genomics
Ana R. Cardoso 0 1 2
Manuela Oliveira 0 1 2
Antonio Amorim 0 1 2
Luisa Azevedo 0 1 2
0 IPATIMUP-Institute of Molecular Pathology and Immunology, University of Porto, Rua Júlio Amaral de Carvalho 45, 4200-135 Porto, Portugal
1 Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
2 Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre S/N, 4169-007 Porto, Portugal
Copy number variants (CNVs) are important contributors to human pathogenic genetic diversity, as demonstrated by a number of cases reported in the literature. The high homology between repetitive elements may promote genomic instability, which gives rise to CNVs either by non-allelic homologous recombination (NAHR) or non-homologous end joining (NHEJ). Here, we present a short guide based on previously documented cases of disease-associated CNVs in order to provide a general view of the impact of repeated elements on the stability of the genomic sequence and consequently on the origin of the human pathogenic variome.
Copy number variants (CNVs); Genetic diseases; Genomic structural variation; Low copy repeats; Retrotransposons; LINE; SINE; Non-allelic homologous recombination (NAHR)
Background
Copy number variants (CNVs) are structural genomic
markers (insertions or deletions) ranging in size from
1 kb to several megabases for each copy. They are
categorized as copy number polymorphisms (CNPs) when
multiple allelic states exist in the population or as rare
copy number variants when they are found to be
associated with genetic diseases (pathogenic copy number
variants) [1, 2]. The origin of each repeated element of
the CNV is influenced by the local genomic architecture
which includes the presence of repetitive sequences
within or flanking the repeated segment [3–7]. These
repeated sequences drive non-allelic homologous
recombination (NAHR) events which result in recurrent
insertions and deletions with similar sequence sizes and
clustered breakpoints [3, 6, 8] or non-homologous end
joining (NHEJ) events that result in non-recurrent
rearrangements that vary in terms of their size and
breakpoint location [3, 6, 9]. Although several studies have
demonstrated the contribution of structural
variants to the genome architecture, few have specifically
focused on the influence of repeated sequences at
breakpoint locations. To draw attention to these
unstable regions and to establish their role in CNVs, we
collated a number of cases of CNV-associated disorders
proven to have been generated by low and high copy
number repeats which may have influenced the degree
of stability of the genomic sequence.
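The distinction drawn above between recurrent rearrangements (similar sizes, clustered breakpoints) and non-recurrent ones (variable sizes and breakpoints) can be illustrated with a minimal Python sketch. This is only a toy heuristic, not a method used in the article: the `CNV` record, the function names, the 10 kb tolerance, and the coordinates in the usage note are all hypothetical assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    """One observed CNV event (hypothetical record layout)."""
    chrom: str
    start: int  # position of the first breakpoint, in bp
    end: int    # position of the second breakpoint, in bp


def is_recurrent(cnvs, tolerance=10_000):
    """Return True if every event shares both breakpoints with the first
    event, within `tolerance` bp (an assumed, arbitrary threshold)."""
    if len(cnvs) < 2:
        return False  # recurrence cannot be judged from a single event
    ref = cnvs[0]
    return all(
        c.chrom == ref.chrom
        and abs(c.start - ref.start) <= tolerance
        and abs(c.end - ref.end) <= tolerance
        for c in cnvs[1:]
    )


def suggest_mechanism(cnvs, tolerance=10_000):
    """Map the clustering result onto the two mechanisms named in the text.
    This is a label for illustration, not a diagnostic call."""
    if is_recurrent(cnvs, tolerance):
        return "NAHR-like (recurrent, clustered breakpoints)"
    return "NHEJ-like (non-recurrent, scattered breakpoints)"
```

For example, two events such as `CNV("chr1", 1_000_000, 2_500_000)` and `CNV("chr1", 1_002_000, 2_498_500)` would be labelled NAHR-like under these assumptions, whereas events whose breakpoints differ by hundreds of kilobases would be labelled NHEJ-like. Real breakpoint analysis also relies on sequence-level evidence at the junctions, which this sketch ignores.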
Low copy repeats and their influence on pathogenic CNV formation
Low copy repeats (LCRs) are homologous sequences
of ≥1 kb in length which are found in few copies
throughout the genome since they are generated by
duplication events [3, 10]. Large LCRs (>10 kb) with
high sequence homology promote non-allelic
homologous recombination (NAHR) [3–6, 10–12] and the
misalignment of sister chromatids carrying directly
oriented LCRs may promote NAHR, thereby
generating both duplications and deletions [4, 5] which in
turn give rise to copy number variation. A schematic
representation of this process is shown in Fig. 1.
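The way a single misaligned crossover yields reciprocal products can be sketched in a few lines of Python. The model below is a deliberate simplification under stated assumptions: a chromatid is represented as a list of labelled segments, the two LCR copies are treated as identical and directly oriented, and the function name `nahr_unequal_crossover` and the segment labels are hypothetical.

```python
def nahr_unequal_crossover(chromatid, lcr="LCR"):
    """Simulate one crossover between the proximal LCR copy of one
    chromatid and the distal LCR copy of its misaligned sister.

    `chromatid` is a list of segment labels containing exactly two
    copies of `lcr` (assumed identical and in direct orientation).
    Returns the two reciprocal products: (deletion, duplication).
    """
    positions = [i for i, seg in enumerate(chromatid) if seg == lcr]
    if len(positions) != 2:
        raise ValueError("expected exactly two LCR copies")
    proximal, distal = positions

    # Product 1: sequence up to the proximal LCR joined to the sequence
    # after the distal LCR, so the inter-LCR segment is lost.
    deletion = chromatid[: proximal + 1] + chromatid[distal + 1:]

    # Product 2: sequence up to the distal LCR joined to the sequence
    # after the proximal LCR, so the inter-LCR segment appears twice.
    duplication = chromatid[: distal + 1] + chromatid[proximal + 1:]
    return deletion, duplication
```

With the input `["A", "LCR", "B", "LCR", "C"]`, the first product lacks segment `B` and the second carries two copies of it, which mirrors the reciprocal deletion and duplication described in the text. The sketch does not model where within the LCR the exchange occurs, nor the effect of sequence divergence between copies.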
Fig. 1 Optimal LCR features for the occurrence of NAHR events that result in CNV formation. Distinct LCR pairs with contrasting features such as
homology, size, and inter-LCR distance influence NAHR rate and lead to the formation of common recurrent (a) or rare recurrent (b) copy number
variants. Adapted from [3, 6, 12]
© 2016 The Author(s). Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to
the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Certain properties of the LCRs, such as homology
length, sequence similarity, and inter-LCR distance,
influence the frequency of NAHR events [3, 6, 12] (Fig. 1). As
recently reviewed by Carvalho and Lupski [3], the NAHR
rate varies according to the length of the LCR sequence,
the distance between distinct LCR sequences (...truncated)