Analysis of copy number variations in the sheep genome using 50K SNP BeadChip array
Jiasen Liu (0, 2, 3)
Li Zhang (0, 3)
Lingyang Xu (0, 3)
Hangxing Ren (1)
Jian Lu (0, 3)
Xiaoning Zhang (0, 3)
Shifang Zhang (0, 3)
Xinlei Zhou (0, 3)
Caihong Wei (0, 3)
Fuping Zhao (0, 3)
Lixin Du (0, 3)
0: National Center for Molecular Genetics and Breeding of Animal, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, People's Republic of China
1: Chongqing Academy of Animal Sciences, Chongqing 402460, People's Republic of China
2: Institute of Animal Science, Inner Mongolia Academy of Agricultural & Animal Husbandry Sciences, Hohhot, Inner Mongolia Autonomous Region 010031, People's Republic of China
3: National Center for Molecular Genetics and Breeding of Animal, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, People's Republic of China
Background: In recent years, genome-wide association studies have successfully uncovered single-nucleotide polymorphisms (SNPs) associated with complex traits such as diseases and quantitative phenotypes. However, these variants account for only a small proportion of heritability. With the development of high-throughput techniques, abundant submicroscopic structural variations have been found in organisms, chief among them copy number variations (CNVs). CNVs are therefore increasingly recognized as an important and abundant source of genetic variation and phenotypic diversity.
Results: CNVs in the genomes of three sheep breeds were analyzed using the Ovine SNP50 BeadChip array. A total of 238 CNV regions (CNVRs) were identified, including 219 losses, 13 gains, and six with both events (losses and gains). Together they cover 60.35 Mb of the sheep genomic sequence, corresponding to 2.27% of the autosomal genome sequence. The length of the CNVRs on autosomes ranges from 13.66 kb to 1.30 Mb with a mean size of 253.57 kb, and 75 CNVRs had a frequency > 3%. Of these CNVRs, 47 identified by PennCNV overlapped with those identified by cnvPartition. Functional analysis indicated that most genes in the CNVRs were significantly enriched for involvement in the environmental response. Furthermore, 10 CNVRs were selected for validation, and 6 of them were experimentally confirmed by qPCR. In addition, 57 CNVRs in our new dataset overlapped with those reported in other published ruminant CNV studies.
Conclusions: In this study, we constructed the first sheep CNV map based on the Ovine SNP50 array. Our results demonstrate the differences between the two detection tools and show that integrating multiple algorithms can enhance the detection of structural variations in the sheep genome. Furthermore, our findings should help in understanding the sheep genome and provide a preliminary foundation for future association studies between CNVs and economically important phenotypes in sheep.
Background
In recent years, genome-wide association studies (GWAS)
have successfully uncovered single-nucleotide
polymorphisms (SNPs) associated with complex diseases or traits
[1]. With the rapid development of chip array-based
genotyping techniques, thousands of genomic submicroscopic
structural variations have been found in the human
genome [2]. As a main form of submicroscopic
structural variation, copy number variations (CNVs) are widely
distributed in the human genome and influence gene
expression, phenotypic variation and adaptation by
disrupting genes and altering gene dosage [2-5]. Numerous
studies have shown that CNVs contribute to both disease
susceptibility and phenotypic diversity [2,5]. CNVs are now
increasingly considered to be an important and abundant
source of genetic variation and phenotypic diversity [5,6].
CNVs have been investigated in human and other
species [7-13]. Among domestic animals, CNV studies have
been reported in cattle [14-20], dog [21],
chicken [22], pig [23,24], goat [25], sheep [26] and rabbit
[27]. In sheep, Fontanesi et al. [26] provided a first
comparative map of CNVs in the sheep genome,
referenced to the cattle genome, using cross-species array
comparative genome hybridization (aCGH). However,
cross-species analysis based on heterologous hybridization
cannot identify all detectable CNV regions (CNVRs) because of low
homology between cattle probes and sheep DNA in some
regions, and it does not show the distribution of CNVRs on the
sheep genome. In addition to CGH, the other major
platform commonly used to identify CNVs is the SNP array.
With SNP arrays, the intensity values of SNPs from each
sample are used to detect CNVs in each individual.
Comparing the two platforms, the CGH array has excellent
signal-to-noise ratios, while the SNP array
based approach is more convenient for high-throughput
analyses and follow-up association studies [28]. With the
development of high-density SNP arrays, higher resolution
of genomic regions can be achieved [29]. Moreover, owing to
their low cost and high density, commercial SNP arrays
have been widely used for CNV detection in domestic
animals, and CNV mapping and functional studies have
made important progress. However, there have been no reports
on CNV detection in sheep based on SNP array data.
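To illustrate how SNP intensity values can point to copy number changes, the following minimal sketch computes a log R ratio (log2 of observed over expected total intensity) for each SNP and reports runs of consecutive SNPs that fall beyond a cutoff. It is a deliberately simplified threshold rule and not the hidden Markov model used by PennCNV. The function names, the cutoffs of -0.3 and 0.3, and the minimum of three consecutive SNPs are hypothetical values chosen only for illustration.

```python
import math


def log_r_ratio(observed, expected):
    """log2 ratio of observed to expected total signal intensity for one SNP."""
    return math.log2(observed / expected)


def call_segments(lrr, loss_cutoff=-0.3, gain_cutoff=0.3, min_snps=3):
    """Return (first_index, last_index, state) for runs of consecutive SNPs
    whose log R ratio lies beyond a cutoff.

    The cutoffs and min_snps are illustrative assumptions, not values
    taken from the study.
    """
    def state_of(value):
        if value <= loss_cutoff:
            return "loss"
        if value >= gain_cutoff:
            return "gain"
        return "normal"

    segments = []
    start = 0
    for i in range(1, len(lrr) + 1):
        # Close the current run at the end of the data or when the state changes.
        if i == len(lrr) or state_of(lrr[i]) != state_of(lrr[start]):
            state = state_of(lrr[start])
            if state != "normal" and i - start >= min_snps:
                segments.append((start, i - 1, state))
            start = i
    return segments


# Hypothetical log R ratios for eight consecutive SNPs on one chromosome.
example = [0.0, -0.5, -0.6, -0.4, 0.1, 0.5, 0.6, 0.0]
print(call_segments(example))
```

In this toy input the three-SNP run of low values is reported as a loss, while the two-SNP run of high values is too short to be called. Requiring several consecutive SNPs is a common way to limit false positives from single noisy probes.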
In this study, we investigated genome-wide CNVs in
three sheep populations. To obtain reliable results, we
first used the PennCNV program to analyze Ovine
SNP50 genotyping data, and then used a second algorithm,
cnvPartition, to validate the CNVRs detected by
PennCNV. To our knowledge, this is the first
sheep CNV map based on SNP array data. This research
provides a useful addition to existing sheep CNV maps and
potential genetic markers for further
investigation of the roles of CNVs in sheep production traits and
evolutionary adaptation.
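The idea of using one algorithm to support regions found by another can be sketched as a simple interval overlap check. The example below is an assumption about how such a comparison might be coded, not the procedure used in this study. The function names, the one-base overlap rule, and the coordinates are hypothetical.

```python
def overlaps(a, b):
    """True when two (chrom, start, end) regions share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def supported_regions(primary, secondary):
    """Regions from the primary caller that overlap any region from the
    secondary caller."""
    return [r for r in primary if any(overlaps(r, s) for s in secondary)]


# Hypothetical CNVRs from two callers (chromosome, start, end in bp).
penncnv_regions = [("chr1", 100, 500), ("chr1", 900, 1200), ("chr2", 100, 500)]
cnvpartition_regions = [("chr1", 450, 800), ("chr2", 600, 700)]

print(supported_regions(penncnv_regions, cnvpartition_regions))
```

A stricter rule, such as requiring a minimum reciprocal overlap fraction, would reduce the number of supported regions. The one-base rule is used here only to keep the sketch short.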
Results
SNP genotyping
Genomic DNA from 329 individuals of three
sheep breeds (German Mutton, Dorper and Sunite
sheep) was genotyped using the Illumina OvineSNP50
Genotyping BeadChip according to the manufacturer's protocol,
and the PennCNV software (http://www.openbioinformatics.org/penncnv)
was used to identify CNVs in the
sheep genome (Table 1). Based on the PennCNV output,
we defined CNV call filtering criteria to
exclude samples with low-quality signal intensity data.
Table 1 Population size information in sheep copy number variation analysis
No. of sheep (a)    PennCNV (b)
a: Total samples before quality control by PennCNV and cnvPartition.
b: Samples that passed the quality control of PennCNV among the 329 individuals from three sheep breeds.
c: Samples that passed the quality control of cnvPartition among the 329 individuals from three sheep breeds.
After applying the CNV quality control criteria detailed in
the Methods section, 256 samples (157 German Mutton,
35 Dorper and 64 Sunite sheep) remained for further
CNV analyses.
Genome-wide surveys of sheep CNVs
After filtering unreliable CNV calls, we discovered a total
of 3624 CNV events (3416 losses and 208 gains), with an
average number of 14.16 CNV events per individual.
The average and m (...truncated)