Estimation of quantitative genetic parameters
Robin Thompson
Department of Biomathematics and Bioinformatics, Rothamsted Research, Harpenden AL5 2JQ, UK
Centre for Mathematical and Computational Biology
School of Mathematical Sciences, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
This paper gives a short review of the development of genetic parameter estimation over the last 40 years. The need to analyse genetic processes in both animal selection experiments and animal breeding improvement programmes motivated the majority of this work. The use of the animal model in conjunction with residual maximum likelihood (REML) techniques for mixed models has revolutionized the methods. These methods for estimating quantitative genetic parameters have recently been advocated for use in evolutionary studies of natural populations. It is therefore perhaps timely to discuss the development of REML methods and their application to the analysis of artificial selection experiments and breeding programmes in animals. This should give extra insight into the methods and, it is hoped, lead to synergy between the two areas.
1. INTRODUCTION
Given the recent enthusiasm for the use of residual
maximum likelihood (REML) techniques for the analysis
of mixed models in conjunction with the animal model for
the genetic analysis of natural populations (Kruuk 2004),
it is perhaps timely to review the development of these
techniques, in which the author has been actively involved.
In writing this paper, the author has two aims.
The first aim is to explain some of the background to the
development of the REML method. This is done
in §2 by discussing the animal breeding problems that
motivated the author's interest and why existing
techniques needed improvement. It is also shown how the
technique of best linear unbiased prediction (BLUP) helps
both in computational terms and at a conceptual level, and
how taking account of the structure of
the model, the resulting prediction equations and the likelihood
can make it easier to estimate parameters. Finally, it is
shown how the computational methods have evolved and
how the variety of genetic models that can be fitted
has increased.
The second aim is to review the animal quantitative
genetics work on selection and relate it to evolutionary
studies, with the hope of contrasting the strengths of the
two approaches. In §3, it is shown how the REML
techniques were extended to deal with animal selection
experiments and improvement programmes, with
emphasis on the estimation of variance parameters taking
account of selection and on the prediction of breeding values.
In §4, these approaches are compared with those used in
evolutionary studies, with emphasis on measuring selection
responses. It is hoped that this might lead to synergy
between the analysis of artificial and natural populations.
2. ESTIMATION IN MIXED MODELS
(a) Motivation
In the early 1950s, a system of dairy cattle genetic
improvement programmes was introduced in the UK.
It followed the paradigm of choosing the best of the tested
young bulls to father more daughters and the next
generation of young bulls. The genetic merit of young
bulls for milk production was evaluated by comparing the
milk production of their daughters with that of daughters
of other bulls. Owing to the possible environmental effects
of herd, year and season, daughters were grouped into
contemporary groups, i.e. cows in the same herd, year and
season. The resulting design was very unbalanced, with
small contemporary groups; although each bull
was used in many herds, not all bulls were used in each
contemporary group. When the scheme was first introduced,
computational facilities were very limited and the evaluation
procedure had two stages. In the first stage, the daughter
means were corrected for the environmental effects of
contemporary groups. In the second stage, the breeding
values of the sires were predicted by regressing these
adjusted progeny means towards the overall mean, with the
regression coefficient depending on the additive genetic
variance and a measure of residual variance in the progeny
mean. In effect, this was a mixed model with
contemporary groups as fixed effects, random sire effects and the
variance of the sire effects depending on the additive
variance. The emphasis was mainly on evaluating
the bulls, with the contemporary group effects treated as a nuisance
factor, but there was also interest in estimating genetic
variances and covariances between traits. In the 1970s,
computing facilities improved and it became
computationally feasible to make better adjustments for the
contemporary group effects, taking account of the genetic
differences between contemporary groups that arise from
imbalance (Thompson 1976).
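The second stage of the procedure described above can be illustrated with a minimal sketch. It assumes a simple sire model in which daughters of a bull are paternal half-sibs, so that the sire variance is one quarter of the additive genetic variance and the regression coefficient for a bull with n daughters is b = n/(n + k), with k = (4 - h2)/h2 and h2 the heritability. The function names, the heritability of 0.25 and the milk yields are hypothetical values chosen for illustration; they are not taken from the original evaluation scheme.

```python
def shrinkage_coefficient(n_daughters, h2):
    """Regression coefficient b = n / (n + k) for a mean of n daughters."""
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must lie in (0, 1]")
    # Assumed sire model: sire variance = additive variance / 4, so the
    # ratio of residual variance to sire variance is k = (4 - h2) / h2.
    k = (4.0 - h2) / h2
    return n_daughters / (n_daughters + k)


def predict_sire_breeding_value(adjusted_mean, overall_mean, n_daughters, h2):
    """Predicted breeding value from an adjusted progeny mean."""
    b = shrinkage_coefficient(n_daughters, h2)
    # A sire passes on half of its breeding value, hence the factor 2.
    return 2.0 * b * (adjusted_mean - overall_mean)


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    h2 = 0.25
    overall_mean = 5000.0
    for n, mean in [(10, 5200.0), (100, 5200.0)]:
        b = shrinkage_coefficient(n, h2)
        bv = predict_sire_breeding_value(mean, overall_mean, n, h2)
        print(f"n={n}: b={b:.3f}, predicted breeding value={bv:.1f}")
```

With the same adjusted progeny mean, the bull with 100 daughters is regressed less towards the overall mean than the bull with 10 daughters, because its progeny mean carries less residual variance.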
(b) Early analysis
Mixed models had been used implicitly in agricultural
experiments since the 1930s. The emphasis was on
estimating experimental treatment effects (fixed effects)
as accurately as possible by taking account of random
effects. The random effects depend on the context. In
animal experiments on pigs, one might standardize the
litter size and apply different treatments to the individual
pigs in each litter and treat litter effects as random effects.
In crop experiments, the random effects might be blocks
or groups of similar plots. If the number of treatments is
larger than the size of the litter, then one would like to use
a design allocating treatments to pigs within litters such
that the variances of all treatment comparisons, corrected
for litter effects, are the same. Yates introduced
designs that do this, called balanced incomplete block (BIB)
designs. Further, there are two possible types of
comparison: one within litters, which has no litter component,
and one between litter totals that include
different sets of treatments, which does have a litter
component. One would like to combine this information
and take account of the different variances of the different
comparisons. Yates (1940) suggested constructing an
efficient weighted average of these two estimates and
called this the recovery of inter-block information. Yates
based the analysis of BIB designs on an analysis of
variance that first separates the variation into two parts, or
strata, within and between blocks (the latter component is
also termed inter-block). The weights depend on the
variation in the two strata and the variance of the
treatment effect in each stratum. There were three sources
of information to estimate the variation: the two residual
sums of squares in the two strata and the comparison of
treatment effects in the two strata. Later, Nelder (1968)
extended the method to generally balanced designs which
allowed more than two sources of variation, or strata, and
allowed the treatment effects to be partitioned into sets
with different efficiencies. In these examples, estimation of
variance components was essentially used to provide
appropriate weights for the fixed effects and give
appropriate stan (...truncated)