Whole Genome Distribution and Ethnic Differentiation of Copy Number Variation in Caucasian and Asian Populations
et al. (2009) Whole Genome Distribution and Ethnic Differentiation of Copy Number Variation in Caucasian and
Asian Populations. PLoS ONE 4(11): e7958. doi:10.1371/journal.pone.0007958
Jian Li
Tielin Yang
Liang Wang
Han Yan
Yinping Zhang
Yan Guo
Feng Pan
Zhixin Zhang
Yumei Peng
Qi Zhou
Lina He
Xuezhen Zhu
Hongyi Deng
Shawn Levy
Christopher J. Papasian
Betty M. Drees
James J. Hamilton
Robert R. Recker
Jing Cheng
Hong-Wen Deng
Florian Kronenberg, Innsbruck Medical University, Austria
1 School of Medicine, University of Missouri Kansas City, Kansas City, Missouri, United States of America, 2 The Key Laboratory of Biomedical Information Engineering of Ministry of Education and Institute of Molecular Genetics, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shanxi, People's Republic of China, 3 Vanderbilt Microarray Shared Resource, Vanderbilt University, Nashville, Tennessee, United States of America, 4 Osteoporosis Research Center, Creighton University, Omaha, Nebraska, United States of America, 5 National Engineering Research Center for Beijing Biochip Technology, Changping District, Beijing, People's Republic of China, 6 Laboratory of Molecular and Statistical Genetics, College of Life Sciences, Hunan Normal University, Changsha, Hunan, People's Republic of China
Although copy number variation (CNV) has recently received much attention as a form of structural variation within the human genome, knowledge of fundamental CNV characteristics such as occurrence rate, genomic distribution and ethnic differentiation is still inadequate. In the present study, we used the Affymetrix GeneChip® Mapping 500K Array to discover and characterize CNVs in the human genome and to study ethnic differences in CNVs between Caucasians and Asians. A total of 3,019 CNVs, including 2,381 CNVs on autosomes and 638 CNVs on the X chromosome, were identified in 985 Caucasian and 692 Asian individuals, with a mean length of 296 kb. Among these CNVs, 190 had frequencies greater than 1% in at least one ethnic group, and 109 showed significant ethnic differences in frequency (p < 0.01). After merging overlapping CNVs, 1,135 copy number variation regions (CNVRs), covering approximately 439 Mb (14.3%) of the human genome, were obtained. Our findings of ethnic differentiation of CNVs, along with the newly constructed CNV genomic map, extend our knowledge of structural variation in the human genome and may furnish a basis for understanding the genomic differentiation of complex traits across ethnic groups.
Funding: Investigators of this work were partially supported by grants from NIH (R01 AR050496-01, R21 AG027110, R01 AG026564, and P50 AR055081). The
study also benefited from grants from the National Science Foundation of China, Huo Ying Dong Education Foundation, Xi'an Jiaotong University, and the Ministry of
Education of China. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
Variation within the human genome can take many different
forms. One form of structural variation is copy number variation
(CNV), in which a DNA segment, ranging from 1 kb to several
megabases, is present at a variable copy number in comparison to
a reference genome [1]. CNVs are widespread in the human
genome, and vary across populations with respect to rate of
occurrence [2–7]. CNVs have been shown to account for nearly
18% of variation in gene expression and, consequently, may play
an important role in determining complex traits [8]. CNVs have
been associated with certain complex human diseases, such as
susceptibility to HIV infection, selected autoimmune diseases,
tumors and psychiatric disorders such as mental retardation and
autism [9–14].
Although several studies have been performed to characterize
genomic CNVs, comparing results from these studies has been
hindered by small sample sizes and different study designs and
analytical methods. Consequently, it has been difficult to combine
results from different studies to produce an accurate description of
genomic CNV characteristics such as the total number, genomic
position, gene content, and frequency distribution [7]. It is even
more difficult to robustly detect CNV differentiation across ethnic
groups, and this has limited the utility of CNVs for association
studies and human evolution research. One approach that can
minimize the problems listed above is to use large samples
composed of subjects from comparatively homogeneous ethnic
backgrounds for each study population [15]. Recent technological
developments, such as the availability of high-density SNP
microarrays, have also helped by providing an efficient and
affordable tool for CNV discovery in the human genome.
In this study, we utilized the Affymetrix GeneChip® Mapping
500K Array, in which one SNP was placed approximately every
5.8 kb along the human genome, to identify CNVs in both a US
Caucasian population and a Chinese Han population. CNVs were
identified and characterized based on probe intensities and SNP
genotypes, and their ethnic differences were studied. The results
extend our understanding of the structural variation in the human
genome and may furnish a basis for understanding the genomic
differentiation of complex traits across ethnic groups.
Brief summaries of CNV and CNVR (copy number variation
region, i.e., a region covered by overlapping CNVs)
characteristics in each ethnic group are shown in Table 1, with
detailed summaries presented in the corresponding
supplementary tables.
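The CNVR definition above (a region covered by overlapping CNVs) amounts to merging overlapping intervals on each chromosome. The following is a minimal sketch of that idea, not the pipeline used in this study; the function names, the tuple layout (chromosome, start, end) and the toy coordinates are illustrative assumptions.

```python
from collections import defaultdict


def merge_cnvs(cnvs):
    """Merge overlapping (chrom, start, end) CNV intervals into CNVRs.

    Hypothetical helper: intervals on the same chromosome that share at
    least one position are collapsed into a single region.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in cnvs:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlaps the region being built: extend it if needed.
                cur_end = max(cur_end, end)
            else:
                # Gap found: close the current region and start a new one.
                regions.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        regions.append((chrom, cur_start, cur_end))
    return regions


def total_length(regions):
    """Total number of bases covered by a list of merged regions."""
    return sum(end - start for _, start, end in regions)


# Toy coordinates, invented for illustration only.
example = [
    ("chr1", 100, 500),
    ("chr1", 400, 900),
    ("chr1", 2000, 2500),
    ("chr2", 50, 150),
]
cnvrs = merge_cnvs(example)
covered = total_length(cnvrs)
```

Dividing the covered length by an assumed genome length would give a coverage fraction analogous to the figure reported for the CNVRs in this study.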
Characteristics of CNVs
There were 2,381 autosomal CNVs identified in the 1,677
subjects (Table S1), with a median length of 198 kb and a mean
length of 298 kb. Although CHI had a smaller sample size, the
numbers of CNVs identified in the two ethnic groups were similar:
1,352 CNVs in CAU versus 1,395 CNVs in CHI. Other CNV
characteristics that were similar in the two populations include the
average number of CNVs per individual (~9 CNVs per
individual, ranging from 1 to 32, in CAU versus ~10 CNVs per
individual, ranging from 2 to 44, in CHI) (Figure 1A), the median
size of CNVs (195 kb in CAU vs. 196 kb in CHI), and the mean
size of CNVs (295 kb in CAU vs. 303 kb in CHI) (Figure 1B).
Although most CNVs were singletons, 27.6%
were present more than once in our samples. Specifically, 168
(7%) of the 2,381 CNVs were common CNVs, defined as CNVs
with a frequency of 1% or greater in at least one ethnic group
(Table S2).
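The common-CNV criterion can be written as a simple frequency check. The sketch below assumes that frequency means the number of carriers divided by the group sample size (985 Caucasian and 692 Asian subjects, as reported in this study); the function name and the carrier counts in the examples are hypothetical.

```python
# Sample sizes reported in this study.
N_CAU = 985
N_CHI = 692


def is_common(carriers_cau, carriers_chi, n_cau=N_CAU, n_chi=N_CHI, threshold=0.01):
    """Return True if the CNV frequency is >= threshold in at least one group.

    Assumption: frequency = carriers / sample size within each group.
    """
    freq_cau = carriers_cau / n_cau
    freq_chi = carriers_chi / n_chi
    return freq_cau >= threshold or freq_chi >= threshold


# Hypothetical carrier counts, for illustration only.
print(is_common(10, 0))  # 10/985 is just above 1%
print(is_common(9, 6))   # below 1% in both groups
print(is_common(0, 7))   # 7/692 is just above 1%

# Share of autosomal CNVs classified as common, from the counts in the text.
common_share = 168 / 2381
```

Under this assumption, a CNV needs at least 10 carriers among the 985 CAU subjects or at least 7 carriers among the 692 CHI subjects to count as common.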
There were 638 CNVs identified on the X chromosome in our
subjects (Table S1), with a median length of 206 kb and a mean
length of 288 kb, similar to those of autosomal chromosomes. For
these 638 CNVs, 183 (29%) were detec (...truncated)