Bayesian shrinkage mapping of quantitative trait loci in variance component models (pdf)

Article PDF cannot be displayed. You can download it here:

http://www.biomedcentral.com/content/pdf/1471-2156-11-30.pdf

Bayesian shrinkage mapping of quantitative trait loci in variance component models

BMC Genetics MBeathyodeosloigay narticslehrinkage mapping of quantitative trait loci in variance component models Ming Fang 0 1 0 Life Science College, Heilongjiang August First Land Reclamation University , Daqing, 163319 , China 1 Department of Animal Genetics and breeding, College of Animal Science and Technology, China Agricultural University , Beijing 100193 , China Background: In this article, I propose a model-selection-free method to map multiple quantitative trait loci (QTL) in variance component model, which is useful in outbred populations. The new method can estimate the variance of zero-effect QTL infinitely to zero, but nearly unbiased for non-zero-effect QTL. It is analogous to Xu's Bayesian shrinkage estimation method, but his method is based on allelic substitution model, while the new method is based on the variance component models. Results: Extensive simulation experiments were conducted to investigate the performance of the proposed method. The results showed that the proposed method was efficient in mapping multiple QTL simultaneously, and moreover it was more competitive than the reversible jump MCMC (RJMCMC) method and may even out-perform it. Conclusions: The newly developed Bayesian shrinkage method is very efficient and powerful for mapping multiple QTL in outbred populations. - Background There are two kinds of models which can be used to map QTL in outbred populations, the allelic substitution model [1-3] and the variance component model [4-7]. In the allelic substitution model, the number of QTL alleles is assumed to be known, and the QTL allelic substitution is estimated by the given linkage phases of parents, which can be inferred from genotypes of family members. The least square [2] and maximum likelihood [1,3,8] of interval mapping are two popular statistical approaches for such models. Compared with the allelic substitute model, the variance component model is more robust because it can handle an arbitrary number of alleles with arbitrary modes of gene actions[9]. Moreover, the linkage phase of parents is unnecessary, which is nice since it is hard to accurately infer, particularly when family size is small, such as with human populations. Therefore, the variance component model is usually used to map QTL in outbred populations [4-7,9-12]. In the variance component model, the identity-based-decent (IBD) matrix may be different for each locus and can provide information to localize the QTL. The least square method [10,11] and the maximum likelihood method [4,13] are also two important statistical methods for handling this model. Because of the polygenic nature of quantitative traits, multiple QTL mapping is a problem of model selection. The least square method and the maximum likelihood method can nicely handle single QTL model, but is difficult for them to handle multiple QTL model. Recently, the Bayesian reversible jump MCMC (RJMCMC) method has been used to map multiple QTL in the variance component model [9,12]. However, it still has some disadvantages. Because the model dimension is variable, it usually has poor mixing character and is difficult to converge [14-16]; moreover, it is also difficult to explore all the model space, especially in genome-wide mapping where thousands of possible locus are scanned [16]. Therefore, in this article I proposed a model-dimension-fixed method, in which the estimate of variance is very precise for nonzero-effect QTL, and gradually converges to zero for zero-effect QTL. Therefore, special model selection is needless. It is similar to the recent Bayesian shrinkage estimation methods [15,17-19], which are based on the allelic substitution model, whereas my method is based on the variance component model. The efficiency of the new method is demonstrated by a series of simulation experiments. Method Genetic model Suppose that one has a sample of n individuals from outbred populations. Assuming that QTL dominant effect and polygenic dominant effect are absent. Then the linear model can be expressed as where, y is the n 1 phenotypic vector; is the k 1 vector of covariate effects; k is the number of the covariate; X is the n k design matrix related to the covariate QTL effect, for j = 1,2, ..., q, where j is the IBD matrix and can be inferred by the conditional expectation approach [20]; and s 2j is the QTL variance; e ~ N (0, Ins e2 ) is the vector of random error, where In, is the n n identity matrix and s e2 is the residual variance; q is the maximum QTL number, which is set beforehand; g ~ N(0, As A2 ) is the n 1 vector of random polygenic effect, here A is the additive relationship matrix and s A2 is the polygenic additive variance, the polygenic term g may be excluded from equation (1) in genome-wide mapping. The variance component model can be expressed as Similarly, the term of polygenic variance As A2 should also be excluded from equation (2) in genome-wide mapping. Prior specification and joint posterior distribution Yi and Xu [9] assigned a uniform prior distribution for Jeffreys' hyper prior p(s 2j) ~ 1 / s 2j is assumed. The special prior is the key in the new method and will be illustrated in detail later. The prior for polygenic variance and residual variance is assumed to follow scaled inverted chi-square distribution with degree of freedom and scaled parameter s2 (see also [21] for detail); and the prior for covariant effect and QTL position j are assumed to follow normal distribution, ~ N (0, V0), and uniform distribution, respectively. The joint posterior distribution is given in Appendix. Updating QTL variance by random walk MetropolisHastings algorithm Because there is no close form for the posterior distribualgorithm [22,23] is used to simulate it. I firstly propose a new QTL variance and then accept it according to its acceptance probability. Generating the new proposal QTL variance I employ the Browne's method [24], a special random walk Metropolis-Hastings algorithm (RWM-H) to update posed and sampled from a scaled inverted chi-squared distribution, conditional on the current value of QTL equals the expectation of the current value s 2j(t) , i.e. variance is accepted according to its accept probability. Since the new generated value closely relies on the old one, this approach is a special case of RWM-H, and the degree of freedom is equivalent to the tuning parameter [21,24]. Calculating the acceptance probability The new proposal QTL variance is accepted with probability equal to min (1, r), where, r = hr, and s 2j represents all elements of except s 2j . In equation (3), the first term is likelihood, the second is prior and the third is called proposal ratio or Hastings ratio [23]. Because the proposal distribution is not symhr = p(s 2j(t) s 2j()) p(s 2j() s 2j(t)) Inv 2 n ,(n 2)s 2j() /n Inv 2 n ,(n 2)s 2j(t) /n MCMC implementations The implementations of the MCMC algorithm are summarized as follows: a. Initialize all parameters from lega (...truncated)