Estimation of quantitative genetic parameters

https://rspb.royalsocietypublishing.org/content/275/1635/679.full.pdf

Robin Thompson

Department of Biomathematics and Bioinformatics, Rothamsted Research, Harpenden AL5 2JQ, UK
Centre for Mathematical and Computational Biology
School of Mathematical Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK

This paper gives a short review of the development of genetic parameter estimation over the last 40 years. Most of this work was motivated by the need to analyse genetic processes in animal selection experiments and animal breeding improvement programmes. The use of the animal model in conjunction with residual maximum likelihood (REML) techniques for mixed models has revolutionized the methods. These methods for estimating quantitative genetic parameters have recently been advocated for use in evolutionary studies of natural populations. It is therefore perhaps timely to discuss the development of REML methods and their application to the analysis of artificial selection experiments and breeding programmes in animals. This should give extra insight into the methods and hopefully lead to synergy between the two areas.

1. INTRODUCTION

Given the recent enthusiasm for the use of residual maximum likelihood (REML) techniques for the analysis of mixed models in conjunction with the animal model for the genetic analysis of natural populations (Kruuk 2004), it is perhaps timely to review the development of these techniques, in which the author has been actively involved. In writing this paper, the author had two aims. The first aim is to explain some background to the development of the REML method. This is done in §2 by discussing the animal breeding problems that motivated the author's interest and why existing techniques needed improvement. It is then shown how best linear unbiased prediction (BLUP) helps both in computational terms and at a conceptual level.
It is also explained how taking account of the structure of the model, the resulting prediction equations and the likelihood can make it easier to estimate parameters. Finally, it is shown how the computational methods have evolved and how the variety of genetic models that can be estimated has increased.

The second aim is to review the animal quantitative genetics work on selection and relate it to evolutionary studies, with the hope of contrasting the strengths of the two approaches. In §3, it is shown how the REML techniques were extended to deal with animal selection experiments and improvement programmes, with emphasis on the estimation of variance parameters taking account of selection and on the prediction of breeding values. In §4, these approaches are compared with those used in evolutionary studies, with emphasis on measuring selection responses. It is hoped that this might lead to synergy between the analysis of artificial and natural populations.

2. ESTIMATION IN MIXED MODELS

(a) Motivation

In the early 1950s, a system of dairy cattle genetic improvement programmes was introduced in the UK. It followed the paradigm of choosing the best of the tested young bulls to father more daughters and the next generation of young bulls. The genetic merit of young bulls for milk production was evaluated by comparing the milk production of their daughters with that of daughters of other bulls. Owing to the possible environmental effects of herd, year and season, daughters were grouped into contemporary groups, i.e. cows in the same herd, year and season. The resulting design was very unbalanced: contemporary groups were small and, although each bull was used in many herds, not every bull was used in every contemporary group. When the scheme was first introduced, computational facilities were very limited and the evaluation procedure was carried out in two stages. In the first stage, the daughter means were corrected for the environmental effects of contemporary groups.
In the second stage, the breeding values of the sires were predicted by regressing these adjusted progeny means towards the overall mean, with the regression coefficient depending on the additive genetic variance and a measure of residual variance in the progeny mean. In effect, this was a mixed model with contemporary groups as fixed effects, random sire effects and the variance of the sire effects depending on the additive variance. The emphasis was mainly on evaluating the bulls, with the contemporary group effects treated just as a nuisance factor, but there was also interest in estimating genetic variances and covariances between traits. In the 1970s, computing facilities were improving. It became computationally feasible to make better adjustments for the contemporary group effects, taking account of the genetic differences between contemporary groups that arise from the unbalanced design (Thompson 1976).

(b) Early analysis

Mixed models had been used implicitly in agricultural experiments since the 1930s. The emphasis was on estimating experimental treatment effects (fixed effects) as accurately as possible by taking account of random effects. The random effects depend on the context. In animal experiments on pigs, one might standardize the litter size, apply different treatments to the individual pigs in each litter and treat litter effects as random effects. In crop experiments, the random effects might be blocks or groups of similar plots. If the number of treatments is larger than the size of the litter, then one would like a design allocating treatments to pigs within litters such that the variances of all treatment comparisons, corrected for litter effects, are the same. Yates introduced designs to do this, called balanced incomplete block (BIB) designs.
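
The balance property can be checked mechanically. The sketch below is an illustration only and is not taken from the paper: it assumes a hypothetical experiment with 7 treatments and litters of size 3, builds the blocks by cyclically shifting the base block (0, 1, 3) modulo 7, and counts how often each pair of treatments falls in the same litter. The function names `cyclic_blocks`, `pair_counts` and `is_balanced` are invented for this example.

```python
from collections import Counter
from itertools import combinations


def cyclic_blocks(base, v):
    """Develop a base block modulo v: block i is the base block shifted by i."""
    return [tuple(sorted((x + i) % v for x in base)) for i in range(v)]


def pair_counts(blocks):
    """Count how many blocks contain each unordered pair of treatments."""
    counts = Counter()
    for block in blocks:
        counts.update(combinations(sorted(block), 2))
    return counts


def is_balanced(blocks, v):
    """True if every pair of the v treatments shares the same number of blocks."""
    counts = pair_counts(blocks)
    # A Counter returns 0 for pairs that never occur together.
    return len({counts[pair] for pair in combinations(range(v), 2)}) == 1


# Assumed example: 7 treatments (labelled 0..6) in 7 litters of 3 pigs each.
blocks = cyclic_blocks((0, 1, 3), 7)
replication = Counter(t for block in blocks for t in block)

print(blocks)
print("balanced:", is_balanced(blocks, 7))
print("replicates per treatment:", sorted(set(replication.values())))
```

In this assumed design every pair of treatments meets in exactly one litter, which is why all within-litter treatment comparisons have the same variance. Replacing the base block with (0, 1, 2) gives an incomplete block design in which some pairs never meet, and `is_balanced` then returns `False`.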
Further, there are two possible types of comparison: one within litters, which has no litter component, and one between litter totals that include different sets of treatments, which does have a litter component. One would like to combine this information and take account of the different variances of the different comparisons. Yates (1940) suggested constructing an efficient weighted average of these two estimates and called this the recovery of inter-block information. Yates used an analysis for BIB designs, based on an analysis of variance, that first separates variation into two parts, or strata, within and between blocks (the latter component is also termed inter-block). The weights depend on the variation in the two strata and the variance of the treatment effect in each stratum. There were three sources of information to estimate the variation: the two residual sums of squares in the two strata and the comparison of treatment effects in the two strata.

Later, Nelder (1968) extended the method to generally balanced designs, which allowed more than two sources of variation, or strata, and allowed the treatment effects to be partitioned into sets with different efficiencies. In these examples, estimation of variance components was essentially used to provide appropriate weights for the fixed effects and give appropriate stan (...truncated)