Computing approximate standard errors for genetic parameters derived from random regression models fitted by average information REML
Troy M. F (1, 2), Arthur R. G (0, 1), Julius H.J. van der W (1, 2)
(0) NSW Agriculture, Orange Agricultural Institute, Orange, NSW, 2800, Australia
(1) Australian Sheep Industry CRC
(2) School of Rural Science and Agriculture, University of New England, Armidale, NSW, 2351, Australia
Abstract - Approximate standard errors (ASE) of variance components for random regression coefficients are calculated from the average information matrix obtained in a residual maximum likelihood procedure. Linear combinations of those coefficients define variance components for the additive genetic variance at given points of the trajectory. Therefore, ASE of these components and heritabilities derived from them can be calculated. In our example, the ASE were larger near the ends of the trajectory.
1. INTRODUCTION
maximum likelihood (REML) methods. In contrast, Meyer [9] published
confidence intervals of genetic parameter estimates derived from Bayesian analyses
using Gibbs sampling. With REML estimation by the average information
algorithm, approximate variances of variance components are obtained from the
inverse of the information matrix. Variance components as well as
heritabilities at given trajectory points can be calculated from variances of random
regression coefficients and therefore approximate standard errors (ASE) of these
derived parameters can also be obtained. The aims of this note are to describe
how to calculate ASE for genetic parameter estimates derived from RR models
and to apply the method to a field data set.
2. MATERIALS AND METHODS
2.1. Random regression model
Consider a variance-covariance (VCV) matrix $G_0$ of rank $t$ for repeated
measurements of weight at $t$ given trajectory points (e.g. ages). Under the
covariance function (CF) approach defined by Kirkpatrick et al. [5], $G_0$ is
modelled with a reduced number of parameters. The genetic CF of order $k$, where
$k \le t$, can be estimated from $G_0$ such that:

$$\hat{G} = \Phi \hat{K} \Phi' \qquad (1)$$

where $\hat{G}$ is an approximation of $G_0$. Meyer [8] showed that $K$ can be
estimated directly from data using RR. The matrix $K$ of order $k$ contains the
variance components for the RR coefficients in the model. The matrix $\Phi$ of order
$t \times k$ contains orthogonal polynomial coefficients evaluated at $t$ standardised
trajectory points (ages) with elements $\phi_{ij} = \phi_j(x_i)$, being the $j$th polynomial
coefficient for the $i$th point $x_i$ [6]. The covariance structure for the
environmental effects is fitted as an unstructured $t \times t$ covariance matrix. This yields the
model:

$$y_i = X_i b + Z_i \alpha_i + e_i \qquad (2)$$

where $y_i$ is the vector of $t_i$ observations measured on animal $i$, $b$ is a vector of
fixed effects, $\alpha_i$ a vector of additive genetic RR coefficients and $e_i$ a vector of
residual errors pertaining to $y_i$. $X_i$ and $Z_i$ are design matrices relating $b$ and
$\alpha_i$ to $y_i$, where $Z_i$ contains the elements pertaining to ages in the data.
Extending the model to $n$ individuals, the corresponding variances are defined as
$\mathrm{var}(\alpha) = K \otimes A$, where $K$ contains the additive genetic variances and
covariances for the RR coefficients, $A$ is the numerator relationship matrix among
individuals and the symbol $\otimes$ denotes the direct product. The solution for $K$ can
be used as in equation (1) to calculate the variances and covariances among
defined trajectory points.
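As an illustration of equation (1), the following minimal sketch builds a basis matrix and the implied (co)variance matrix among trajectory points. It assumes normalised Legendre polynomials as the orthogonal basis and uses NumPy; the ages, the values in `K` and the helper name `legendre_basis` are hypothetical and are not taken from the analysis described here.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_basis(ages, k):
    """Return a t x k matrix of normalised Legendre polynomials (assumed basis)."""
    ages = np.asarray(ages, dtype=float)
    # Standardise ages to the interval [-1, 1].
    x = -1.0 + 2.0 * (ages - ages.min()) / (ages.max() - ages.min())
    # Columns are the Legendre polynomials P_0(x), ..., P_{k-1}(x).
    P = legendre.legvander(x, k - 1)
    # Scale column j by sqrt((2j + 1) / 2), an assumed normalisation.
    norm = np.sqrt((2.0 * np.arange(k) + 1.0) / 2.0)
    return P * norm


ages = np.linspace(50.0, 450.0, 5)   # hypothetical ages in days
K = np.array([[4.0, 1.0, 0.2],       # hypothetical RR (co)variances
              [1.0, 2.0, 0.1],
              [0.2, 0.1, 0.5]])

Phi = legendre_basis(ages, K.shape[0])
G = Phi @ K @ Phi.T                  # equation (1): t x t matrix of rank k
```

The diagonal of `G` holds the variances at the chosen ages and the off-diagonal elements the covariances between ages.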
2.2. Calculation of standard error of parameters derived from RR
coefficients
Consider a genetic variance-covariance matrix, $\hat{G}$, derived from
equation (1), $\hat{G} = \Phi \hat{K} \Phi'$, where $\Phi$ has dimension $t \times k$, $\hat{K}$ has dimension $k \times k$
and $\hat{G}$ is $t \times t$. We can write the elements of $\hat{G}$ in vector form, such that the
variances and covariances of these parameters can be summarized in a matrix.
Hence, equation (1) can also be written as

$$\mathrm{vec}(\hat{G}) = (\Phi \otimes \Phi)\,\mathrm{vec}(\hat{K}) \qquad (3)$$

where $\Phi \otimes \Phi$ has dimension $(t \times t) \times (k \times k)$, $\mathrm{vec}(\hat{K})$ is the vector form of $\hat{K}$
of dimensions $(k \times k) \times 1$ achieved by stacking the columns of $\hat{K}$ below one
another, and similarly $\mathrm{vec}(\hat{G})$ is the vector form of $\hat{G}$ of dimensions $(t \times t) \times 1$.
It can be checked for a small example that equations (1) and (3) are equivalent,
written in matrix and vector form respectively. The variance of estimates in $\hat{G}$
can be calculated in a similar manner whereby

$$\mathrm{var}(\mathrm{vec}(\hat{G})) = (\Phi \otimes \Phi)\,\mathrm{var}(\mathrm{vec}(\hat{K}))\,(\Phi \otimes \Phi)' \qquad (4)$$

where $\mathrm{var}(\mathrm{vec}(\hat{K}))$ has dimensions $(k \times k) \times (k \times k)$ and $\mathrm{var}(\mathrm{vec}(\hat{G}))$ is a $(t \times t)$ by
$(t \times t)$ matrix. $\mathrm{var}(\mathrm{vec}(\hat{K}))$ can be approximated from the appropriate elements
of the inverse of the average information matrix in a REML procedure (e.g. as
given in the *.vvp file in ASReml) [3]. The same principles apply to other
random effects in the RR model, and the covariance between variances of different
random effects. Hence, this methodology can be extended to the matrices
estimated for other random regression effects and the covariance between random
effects. Subsequently these matrices are summed as in equation (5) to build
a matrix containing estimates of variance of phenotypic (co)variance
components, $\mathrm{var}(\mathrm{vec}(\hat{P}))$, which also has dimension $(t \times t) \times (t \times t)$.
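
$$\mathrm{var}(\mathrm{vec}(\hat{P})) = \mathrm{var}(\mathrm{vec}(\hat{G})) + \mathrm{var}(\mathrm{vec}(\hat{E})) + 2\,\mathrm{cov}(\mathrm{vec}(\hat{G}), \mathrm{vec}(\hat{E})) \qquad (5)$$
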
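The vector form and the propagation of sampling variances can be checked numerically on a small case. The sketch below is illustrative only and uses NumPy. It assumes that the sampling (co)variances are available for the unique elements of the coefficient matrix (lower triangle, column by column) and expands them to the full vec form with a duplication matrix; that ordering, all numeric values and the helper names are hypothetical, and the element ordering of any software output would need to be checked separately.

```python
import numpy as np


def vec(M):
    """Stack the columns of M below one another."""
    return np.asarray(M).reshape(-1, order="F")


def duplication_matrix(k):
    """Return D such that vec(K) = D @ vech(K) for a symmetric k x k matrix K."""
    D = np.zeros((k * k, k * (k + 1) // 2))
    col = 0
    for j in range(k):
        for i in range(j, k):
            D[j * k + i, col] = 1.0  # position of K[i, j] in vec(K)
            D[i * k + j, col] = 1.0  # position of K[j, i] in vec(K)
            col += 1
    return D


def propagate_variance(Phi, var_vec_K):
    """Variance of vec(G) from the variance of vec(K), via the Kronecker product."""
    PP = np.kron(Phi, Phi)           # dimension (t*t) x (k*k)
    return PP @ var_vec_K @ PP.T


rng = np.random.default_rng(0)
t, k = 4, 2
Phi = rng.normal(size=(t, k))            # hypothetical basis matrix
K = np.array([[3.0, 0.5], [0.5, 1.0]])   # hypothetical RR (co)variances

# Hypothetical sampling (co)variances of the unique elements K11, K21, K22.
V_h = np.array([[0.40, 0.05, 0.02],
                [0.05, 0.10, 0.01],
                [0.02, 0.01, 0.08]])
D = duplication_matrix(k)
var_vec_K = D @ V_h @ D.T                # (k*k) x (k*k)

var_vec_G = propagate_variance(Phi, var_vec_K)   # (t*t) x (t*t)

# The variances g_ii sit at positions i*t + i of vec(G).
ase_g = np.sqrt(np.diag(var_vec_G)[:: t + 1])
```

Here `ase_g` holds one approximate standard error per trajectory point. The same function can be applied to the matrices of other random effects before summing them.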
For functions of variance components (such as heritabilities) a Taylor series
expansion can be used to approximate the variance of a variance ratio as
detailed by Lynch and Walsh [7]. For the ratio of genetic to phenotypic variance,

$$\mathrm{var}(g_{i,i}/p_{i,i}) = \mathrm{var}(h_i^2) \approx \frac{p_{i,i}^2\,\mathrm{var}(g_{i,i}) + g_{i,i}^2\,\mathrm{var}(p_{i,i}) - 2\,g_{i,i}\,p_{i,i}\,\mathrm{cov}(g_{i,i}, p_{i,i})}{p_{i,i}^4} \qquad (6)$$

where $g_{i,i}$ and $p_{i,i}$ are elements of $\hat{G}$ and $\hat{P}$, and $\mathrm{var}(g_{i,i})$, $\mathrm{var}(p_{i,i})$ and $\mathrm{cov}(g_{i,i}, p_{i,i})$
represent the variance and covariance of the genetic and phenotypic variance at time $i$.
The ASE for the heritability estimate at time $i$ (for univariate and RR estimates)
is obtained by taking the square root of equation (6).
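The ratio approximation can be written as a short function. This is a minimal sketch with hypothetical input values and function names; it assumes a phenotypic variance made up of one genetic and one remaining component, so that var(p) = var(g) + var(e) + 2 cov(g, e) and cov(g, p) = var(g) + cov(g, e).

```python
import math


def phenotypic_terms(var_g, var_e, cov_ge):
    """Sampling variance of p = g + e and its covariance with g."""
    var_p = var_g + var_e + 2.0 * cov_ge
    cov_gp = var_g + cov_ge
    return var_p, cov_gp


def heritability_ase(g, p, var_g, var_p, cov_gp):
    """Return h2 = g / p and its approximate standard error (Taylor series)."""
    h2 = g / p
    var_h2 = (p**2 * var_g + g**2 * var_p - 2.0 * g * p * cov_gp) / p**4
    return h2, math.sqrt(var_h2)


# Hypothetical estimates at one trajectory point.
g, p = 4.0, 16.0
var_p, cov_gp = phenotypic_terms(var_g=1.0, var_e=0.5, cov_ge=-0.25)
h2, ase = heritability_ase(g, p, var_g=1.0, var_p=var_p, cov_gp=cov_gp)
```

For a full trajectory, the inputs would be read from the diagonal positions of the variance matrices of the vectorised genetic and phenotypic (co)variance matrices, one call per age.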
2.3. Example of application of method to RR coefficients estimated
from field data
[Figure 1. Number of records (y-axis) at different ages.]
A VCV matrix for additive genetic and phenotypic effects for weight over
a 450-day trajectory was derived based on the analysis performed by Fischer
et al. [2]. Data for this analysis originated from the LAMBPLAN database and
consisted of 16 826 weight records on 5 420 Poll Dorset sheep. The number of
records at different ages is represented in Figure 1.
Fischer et al. [2] used RR to estimate CF coefficients for direct and
maternal genetic and environmental effects. The model also included
heterogeneous residual variance across ages of measurement. ASReml [3] was used
for this analysis. Based on a third order CF for additive genetic effects, a
VCV matrix ($\hat{G}$) was constructed for weights at 10 equidistant ages (i.e.
defining $\Phi$). Similarly, VCV matrices were derived for the other random effects.
Furthermore, adding the resultant variance matrices together resulted in a
phenotypic VCV matrix ($\hat{P}$) with (co)variance components for weights at the
10 equidistant ages.
We then obtained the variance of $\mathrm{vec}(\hat{G})$ as in equation (4) and similarly
for the two types of maternal effects, which in this case were all matrices of
dimensions $100 \times 100$. Following equations (4), (5) and (6) we obtained the
ASE of the heritability estimate for each age. Results from this example are
shown in Figure 2.
In addition, a series of piecewise estimates at specific ages were obtained
using the equivalent univariate model (direct and matern (...truncated)