Genotyping errors in a calibrated DNA register: implications for identification of individuals
Haaland et al. BMC Genetics 2011, 12:36
http://www.biomedcentral.com/1471-2156/12/36
RESEARCH ARTICLE
Open Access
Øystein A Haaland1, Kevin A Glover2, Bjørghild B Seliussen2 and Hans J Skaug1,2*
Abstract
Background: The use of DNA methods for the identification and management of natural resources is gaining
importance. In the future, it is likely that DNA registers will play an increasing role in this development.
Microsatellite markers have been the primary tool in ecological, medical and forensic genetics for the past two
decades. However, these markers are prone to genotyping errors and are difficult to calibrate
between laboratories and genotyping platforms. The Norwegian minke whale DNA register (NMDR) contains
individual genetic profiles at ten microsatellite loci for 6737 individuals captured in the period 1997-2008. These
analyses have been conducted in four separate laboratories for nearly a decade, and offer a unique opportunity to
examine genotyping errors and their consequences in an individual based DNA register. We re-genotyped 240
samples, and, for the first time, applied a mixed regression model to look at potentially confounding effects on
genotyping errors.
Results: The average genotyping error rate for the whole dataset was 0.013 per locus and 0.008 per allele. Errors
were, however, not evenly distributed. A decreasing trend across time was apparent, along with a strong
within-sample correlation, suggesting that error rates heavily depend on sample quality. In addition, some loci were more
error prone than others. False allele size constituted 18 of 31 observed errors, and the remaining errors were ten
false homozygotes (i.e., the true genotype was a heterozygote) and three false heterozygotes (i.e., the true
genotype was a homozygote).
Conclusions: To our knowledge, this study represents the first investigation of genotyping error rates in a wildlife
DNA register, and the first application of mixed models to examine multiple effects of different factors influencing
the genotyping quality. It was demonstrated that DNA registers accumulating data over time have the ability to
maintain calibration and genotyping consistency, despite analyses being conducted on different genotyping
platforms and in different laboratories. Although errors were detected, they will have a minimal effect on the
database's primary purpose, i.e., individual identification, provided that individual samples can be
re-genotyped.
Keywords: Calibration, DNA register, genotyping error, microsatellite, minke whale, mixed logistic regression,
wildlife
Background
Microsatellites, also known as short tandem repeats
(STRs), are repeating sequences of DNA where the
repeat motif includes 1-6 bases [1,2]. Variation in the
number of repetitions within the sequence forms the
basis of the alleles. Since their discovery in the 1980s,
microsatellite DNA markers have been a prominent tool
in ecological, medical and forensic genetics, owing in part
to their high levels of variability, co-dominant inheritance,
and abundance in most organisms [3-5].
* Correspondence:
1 Department of Mathematics, University of Bergen, Johannes Brunsgate 12, 5008 Bergen, Norway
Full list of author information is available at the end of the article
Microsatellites are almost exclusively genotyped by
amplifying the DNA sequence via the polymerase
chain reaction; the product is then subjected to
electrophoresis and sized (i.e., length of repeat) in relation to
known DNA fragments (i.e., the size standard). The
migration of the microsatellite fragment relative to the
DNA size standard is influenced by a range
of factors and depends on the conditions under
which the electrophoresis is performed [6]. In part due
to the way in which microsatellites are genotyped, this
class of markers is prone to genotyping errors [7], which
occur when the observed genotype does not correspond
to the real genotype [8]. Genotyping errors in microsatellites
cannot be avoided completely, and have a range
of origins including scoring mistakes, contaminated
multiplex assays, biochemical anomalies, and degraded
DNA samples [9]. Error rates in the range 0.005-0.01
per locus have frequently been reported in the literature [9].
Furthermore, even error rates as low as 0.002 per
locus are non-negligible, and may lead to false conclusions
about, for example, confidence in assigned paternities [10].
© 2011 Haaland et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons
Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in
any medium, provided the original work is properly cited.
The implementation of DNA-based methods for the
identification and management of wildlife resources
represents a broad and rapidly growing field. In the
future, DNA registers are likely to become an
increasingly important component of this
development. For example, DNA registers may hold
animal pedigrees in living gene banks
for the conservation of endangered species, or be used to
monitor trade in wildlife [11,12]. DNA registers may be built
upon a multitude of approaches and genetic markers,
relying, for example, upon allele frequencies for population
identification [13], exact genotype profiles for individual
identification [12], or sequence
recognition for species identification in DNA barcoding
[14]. Irrespective of their primary purpose, all DNA
registers accumulate data over time.
This creates special challenges for data
acquisition and quality, not least because developments
in genotyping platforms and technology may
cause calibration and continuity issues. Even with
similar genotyping equipment, different laboratories may
still produce deviating allelic values for microsatellites
at the same locus [6,15-17].
DNA-registers should be annotated with estimates of
genotyping error rates from blinded experiments. For
the purpose of matching profiles against a DNA-register,
it is the across-profile error rate that is of importance,
not the per-locus rates. If loci can be assumed independent, the former is given as
$$p_{\mathrm{profile}} = 1 - \prod_{l=1}^{L} (1 - p_l), \tag{1}$$
when there are $L$ loci, and $p_l$ is the error rate at locus
$l$. However, when loci are positively correlated, i.e., the
fact that an error occurs at one locus increases the error
rate on other loci, the profile-wise error will exceed the
value given by (1). We propose to account for this using
a mixed regression model.
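As a rough illustration of equation (1), the short Python sketch below computes the profile-wise error rate under the independence assumption. The function name `profile_error_rate` is hypothetical, and applying the average per-locus rate of 0.013 uniformly to all ten loci is an assumption made only for this example; the per-locus rates are reported to differ between loci, and the sketch does not model correlation between loci.

```python
import math


def profile_error_rate(per_locus_rates):
    """Equation (1): probability of at least one genotyping error
    in a profile, assuming errors at different loci are independent."""
    for p in per_locus_rates:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"error rate must lie in [0, 1], got {p}")
    # Probability that every locus is error-free, then take the complement.
    return 1.0 - math.prod(1.0 - p for p in per_locus_rates)


# Illustrative assumption: the average per-locus rate (0.013)
# is applied to each of the ten loci in the register.
rates = [0.013] * 10
print(f"{profile_error_rate(rates):.3f}")  # prints 0.123
```

Under these assumptions, roughly one profile in eight would contain at least one error, which shows how quickly small per-locus rates accumulate across a multi-locus profile.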
The Norwegian minke whale (Balaenoptera acutorostrata) DNA-register (NMDR) consists of individual DNA
pro (...truncated)