Copy Number Variation in Thai Population

PLOS ONE, 2014

Copy number variation (CNV) is a major genetic polymorphism contributing to genetic diversity and human evolution. Clinical application of CNVs for diagnostic purposes largely depends on sufficient population CNV data for accurate interpretation. CNVs from the general population in currently available databases help classify CNVs of uncertain clinical significance and benign CNVs. Earlier studies of CNV distribution in several populations worldwide showed that a significant fraction of CNVs are population specific. In this study, we characterized and analyzed CNVs in 3,017 unrelated Thai individuals genotyped with the Illumina Human610, Illumina HumanOmniExpress, or Illumina HapMap550v3 platform. We employed hidden Markov model and circular binary segmentation methods to identify CNVs, extracted 23,458 CNVs consistently identified by both algorithms, and cataloged these high-confidence CNVs in our publicly available Thai CNV database. Analysis of CNVs in the Thai population identified a median of eight autosomal CNVs per individual. Most CNVs (96.73%) did not overlap with any known chromosomal imbalance syndromes documented in the DECIPHER database. When compared with CNVs in the 11 HapMap3 populations, CNVs found in the Thai population shared several characteristics with CNVs characterized in HapMap3. Common CNVs in Thais had similar frequencies to those in the HapMap3 populations, and all high-frequency CNVs (>20%) found in Thai individuals could also be identified in HapMap3. The majority of CNVs discovered in the Thai population, however, were of low frequency or uniquely identified in Thais. When hierarchical clustering was performed using CNV frequencies, the populations clustered into Africans, Europeans, and Asians, in line with the clustering obtained from single nucleotide polymorphism (SNP) data. As CNV data are specific to the population of origin, our population-specific reference database will serve as a valuable addition to the existing resources for investigating the clinical significance of CNVs in Thais and related ethnicities.
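The step of keeping only CNVs "consistently identified by both algorithms" can be illustrated with a minimal sketch. The matching rule used here (same sample, chromosome, and gain/loss type, with at least 50% reciprocal overlap) and the choice to report the shared segment are assumptions for illustration only; the abstract does not state the criterion the authors applied. The names `CNV`, `reciprocal_overlap`, and `consensus_calls` are hypothetical.

```python
from typing import List, NamedTuple


class CNV(NamedTuple):
    sample: str
    chrom: str
    start: int  # inclusive base-pair coordinate
    end: int    # inclusive base-pair coordinate
    kind: str   # "gain" or "loss"


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Return the smaller of the two fractions of each call covered by the shared segment."""
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start + 1), shared / (b.end - b.start + 1))


def consensus_calls(hmm_calls: List[CNV], cbs_calls: List[CNV],
                    min_overlap: float = 0.5) -> List[CNV]:
    """Keep calls supported by both callers (assumed rule, not the paper's)."""
    consensus = []
    for a in hmm_calls:
        for b in cbs_calls:
            same_event = (a.sample, a.chrom, a.kind) == (b.sample, b.chrom, b.kind)
            if same_event and reciprocal_overlap(a, b) >= min_overlap:
                # Report the segment that both callers agree on.
                consensus.append(CNV(a.sample, a.chrom,
                                     max(a.start, b.start), min(a.end, b.end), a.kind))
                break  # one supporting call is enough
    return consensus


# Toy calls from a hidden Markov model caller and a segmentation caller.
hmm_calls = [CNV("S1", "chr1", 1000, 2000, "loss"), CNV("S1", "chr2", 500, 900, "gain")]
cbs_calls = [CNV("S1", "chr1", 1200, 2100, "loss"), CNV("S1", "chr2", 5000, 6000, "gain")]
print(consensus_calls(hmm_calls, cbs_calls))
```

In this toy input only the chr1 loss is supported by both callers, so it is the only call retained. A stricter or looser `min_overlap` trades sensitivity against confidence in the consensus set.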


Citation: Suktitipat B, Naktang C, Mhuantong W, Tularak T, Artiwet P, et al. Copy Number Variation in Thai Population.

Authors: Bhoom Suktitipat, Chaiwat Naktang, Wuttichai Mhuantong, Thitima Tularak, Paramita Artiwet, Ekawat Pasomsap, Wallaya Jongjaroenprasert, Suthat Fuchareon, Surakameth Mahasirimongkol, Wasan Chantratita, Boonsit Yimwadsana, Varodom Charoensawan, Natini Jinawath

Editor: Jeong-Sun Seo, Seoul National University College of Medicine, Republic of Korea

Funding: The current project was supported by the Thailand Research Fund (http://www.trf.or.th), the Commission on Higher Education, and Mahidol University (TRF-CHE-MU grant number MRG 5480183) to NJ. BS is supported by a Chalermphrakiat grant, Faculty of Medicine Siriraj Hospital, Mahidol University. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

Copy number variation (CNV) is one of the major genetic variations observed among genomes of individuals. CNVs constitute more total nucleotides than single nucleotide polymorphisms (SNPs), accounting for almost 12% of the human genome, and are important in terms of genetic diversity as well as human evolution [1]. At present, several conditions with genetic etiologies, such as autism spectrum disorder, developmental delay, and nonsyndromic multiple congenital anomalies, are well documented to have CNVs among the causative variants [2]. For this reason, array-based technology, which is commonly used for CNV identification, has been recommended as a first-tier diagnostic tool for these particular disorders [3].

To make an accurate clinical interpretation of CNVs, databases containing reference CNVs from both patients with genetic diseases and normal controls are required.
Large databases consisting of CNVs and clinical information of patients with chromosomal disorders, such as DECIPHER [4] and the International Collaboration for Clinical Genomics (ICCG; http://www.iccg.org/), are actively curated by working groups. However, most patients in these databases are of European descent, owing to the availability and accessibility of clinical CNV testing in North America and Europe. Apart from these, there are currently a few other large public CNV databases containing CNV information of control subjects from certain ethnic groups, such as Caucasian, African-American, and Asian American [5,6]. These general population databases greatly help with clinical interpretation of CNVs, which can be divided into three main categories: pathogenic, uncertain clinical significance, or benign [7]. Recent publications focusing on CNVs in specific populations, such as Koreans [8], Europeans [9], and Chinese [10], emphasize that a significant number of CNVs are population specific. So far, the number of Thai individuals represented in the existing databases of CNVs in the general population is very limited [11], and thus these databases are far from ideal references for CNV interpretation in Thais.

The International Haplotype Map Project phase III (HapMap3) has made publicly accessible the SNP genotyping and CNV data of more than a thousand subjects from 11 different ethnic groups, including European, African, and East Asian ancestries [12]. The HapMap3 dataset provides an opportunity to compare genetic variations across populations. Hence, CNVs in a larger sample of Thai individuals can be characterized and distinguished from those of East Asian and other populations.

In this study, we combined the genomics data generated from multiple genome-wide association studies (GWAS) consisting of 3,017 unrelated Thai subjects with no undiagnosed genetic disorders.
We carried out CNV discovery from these datasets using two commonly used CNV calling algorithms, PennCNV [13] and CNV Workshop [14], to identify the most accurate set of CNVs, and put together the first large reference CNV database for Thais. Furthermore, we compared Copy Number Variation Region (CNVR) frequencies between Thais and the 11 HapMap3 populations, and identified unique CNVRs in Thais as well as CNVs overlapping with genes associated with the Thai population. Genetic similarity among populations was also explored using hierarchical clustering analysis (HCA) based on the CNV frequencies. The Thai CNV database should contribute to a more accurate clinical interpretation of CNVs in Thai patients and serve as the starting point for future population genetics and genetic epidemiology studies.

Materials and Methods

Study populations

The study population was compiled from previously published genome-wide association studies (GWAS) in Thai individuals [15,16,17,18,19], which were generated under collaborations between the Ministry of Public Health, Thailand, Thailand Center of Excellence for Life Sciences (TCELS), and the RIKEN Center for Genomic Medicin (...truncated)


This is a preview of a remote PDF: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0104355&type=printable
Article home page: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0104355

Bhoom Suktitipat, Chaiwat Naktang, Wuttichai Mhuantong, Thitima Tularak, Paramita Artiwet, Ekawat Pasomsap, Wallaya Jongjaroenprasert, Suthat Fuchareon, Surakameth Mahasirimongkol, Wasan Chantratita, Boonsit Yimwadsana, Varodom Charoensawan, Natini Jinawath. Copy Number Variation in Thai Population, PLOS ONE, 2014, Volume 9, Issue 8, DOI: 10.1371/journal.pone.0104355