Haplotype-based genome-wide association study identifies loci and candidate genes for milk yield in Holsteins

PLOS ONE, Feb 2018

Since milk yield is a highly important economic trait in dairy cattle, the genome-wide association study (GWAS) is vital to explain the genetic architecture underlying milk yield and to perform marker-assisted selection (MAS). In this study, we adopted a haplotype-based empirical Bayesian GWAS to identify the loci and candidate genes for milk yield. A total of 1 092 Holstein cows were sequenced using the genotyping by genome reducing and sequencing (GGRS) method. After filtering, 164 312 high-confidence SNPs and 13 476 haplotype blocks were identified for use in the GWAS. The results indicated that 17 blocks were significantly associated with milk yield. We further identified the nearest gene of each haplotype block and annotated the genes with milk-associated quantitative trait locus (QTL) intervals and ingenuity pathway analysis (IPA) networks. Our analysis showed that four genes, DLGAP1, AP2B1, ITPR2 and THBS4, have relationships with milk yield, while another three, ARHGEF4, TDRD1 and KIF19, were inferred to have potential relationships. Additionally, a network derived from the IPA containing one inferred gene (ARHGEF4) and all four confirmed genes likely regulates milk yield. Our findings add to the understanding of identifying the causal genes underlying milk production traits and could guide follow-up studies for further confirmation of the associated genes, pathways and biological networks.


Zhenliang Chen (1,2), Yunqiu Yao (1), Peipei Ma (1,2), Qishan Wang (1,2)*, Yuchun Pan (1,2)*

(1) Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, PR China
(2) Shanghai Key Laboratory of Veterinary Biotechnology, Shanghai, PR China

* Corresponding authors: YP, QW

Citation: Chen Z, Yao Y, Ma P, Wang Q, Pan Y (2018) Haplotype-based genome-wide association study identifies loci and candidate genes for milk yield in Holsteins. PLoS ONE 13(2): e0192695. https://doi.org/10.1371/journal.pone.0192695

Editor: Qin Zhang, China Agricultural University, CHINA

Received: August 24, 2017. Accepted: January 29, 2018. Published: February 15, 2018.

Copyright: © 2018 Chen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: The SNP and phenotype data are freely available at the public repository Dryad (https://doi.org/10.5061/dryad.cs133).

Funding: This work was supported by National Natural Science Foundation of China (31370043, 31672386) to Qishan Wang. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

As a highly important trait for breeding, milk yield is directly associated with the economics of dairy farming, since increased milk yield allows for greater benefits. With the aid of huge advances in marker technology, it is possible to dissect heritable quantitative traits such as milk production by mapping the underlying genomic region or quantitative trait locus (QTL). To date, 2 437 QTL intervals correlated with milk yield have been reported on Animal QTLdb for cattle (http://www.animalgenome.org, Release 32, Apr 27, 2017). However, QTL mapping studies traditionally use linkage analysis to map QTLs, which results in overly large intervals that make it difficult to identify the underlying mutation and to improve breeding with the use of marker information [1].
With the advent of high-throughput single-nucleotide polymorphism (SNP) genotyping, genome-wide SNP panels allow a genome-wide association study (GWAS) to explore the genes associated with complex traits of interest. Compared to the traditional QTL mapping methods, the advantage of GWAS lies in its more precise intervals. Therefore, GWAS has become a widely accepted approach to explore the association between markers and a trait. A few GWASs have used single-point analysis to identify the key genes for milk yield [2, 3]. For example, Jiang et al. performed a GWAS for milk production traits in a Chinese Holstein population and identified 20 genome-wide significant SNPs for milk yield [2]. However, although GWASs almost always use single-point analysis, the construction of haplotype blocks and identification of tag SNPs are quite informative in the identification of markers [4]. A haplotype analysis of GWAS data showed that haplotypes substantially increased the amount of phenotypic variance explained, compared with single SNPs from a particular region of the genome [5]. Indeed, although often neglected as a tool, haplotype-based GWAS may be useful in extracting more information from the dataset and could help reduce the missing heritability problem.

Additionally, the most common and efficient model implemented in GWAS is the linear model with a random polygenic effect and fixed effects including the marker and population structure such as region, age, etc. However, such models have encountered two issues: the background noise in genomics, and the stringency and high rate of false negatives after Bonferroni correction. Therefore, we adopted a linear mixed model recently developed by our laboratory, in which the haplotype effect is assumed to be random and normally distributed [6].
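To give a sense of the stringency mentioned above, the following minimal sketch computes Bonferroni per-test cutoffs for the two marker counts reported in the abstract. The family-wise error rate of 0.05 and the function name `bonferroni_threshold` are assumptions made for illustration; the paper does not state that these cutoffs were applied in its own analysis.

```python
import math


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Marker counts reported in the abstract; alpha = 0.05 is an assumed convention.
N_SNPS = 164_312
N_BLOCKS = 13_476

if __name__ == "__main__":
    for label, n_tests in (("single SNPs", N_SNPS), ("haplotype blocks", N_BLOCKS)):
        cutoff = bonferroni_threshold(n_tests)
        # Report the cutoff on the -log10 scale commonly used in Manhattan plots.
        print(f"{label}: n = {n_tests}, cutoff = {cutoff:.3e}, "
              f"-log10(cutoff) = {-math.log10(cutoff):.2f}")
```

Because far fewer blocks than SNPs are tested, the per-test cutoff for blocks is roughly an order of magnitude less strict under this assumption.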
By using an empirical Bayesian (EB) method, the prior variance is estimated from the same dataset, and the posterior mean is the best linear unbiased prediction (BLUP) of the marker effect.

The present study conducted a haplotype-based GWAS with an empirical Bayesian method for milk yield in Shanghai Holsteins. We aimed to analyze the blocks with 2, 3 and 4 SNPs, find the significant blocks, and identify the associated genes, pathways and networks important for the milk production trait, in order to guide the improvement of dairy cattle breeding.

Material and methods

Population and phenotypes

Approval by the Institutional Animal Care and Use Committee of Shanghai Jiao Tong University (contract no. 2015-07-0136) was given for all experimental procedures involving animals in the present study. A total of 1 092 cows were selected from 24 farms in Shanghai Bright Holstan Co., Ltd., with the following criteria: 1) primiparous cows born between 2001 and 2012 with regular and standard dairy herd improvement (DHI) records (milk yield, fat percentage, protein percentage and somatic cell count); 2) age at first calving between 24 months and 36 months; and 3) test day from 5 to 335 days in milk (DIM). The blood sa (...truncated)



Zhenliang Chen, Yunqiu Yao, Peipei Ma, Qishan Wang, Yuchun Pan. Haplotype-based genome-wide association study identifies loci and candidate genes for milk yield in Holsteins, PLOS ONE, 2018, Volume 13, Issue 2, DOI: 10.1371/journal.pone.0192695