Genotyping errors in a calibrated DNA register: implications for identification of individuals

Haaland et al. BMC Genetics 2011, 12:36
http://www.biomedcentral.com/1471-2156/12/36

RESEARCH ARTICLE, Open Access

Øystein A Haaland1, Kevin A Glover2, Bjørghild B Seliussen2 and Hans J Skaug1,2*

Abstract

Background: The use of DNA methods for the identification and management of natural resources is gaining importance, and DNA registers are likely to play an increasing role in this development. Microsatellite markers have been the primary tool in ecological, medical and forensic genetics for the past two decades. However, these markers are prone to genotyping errors and are difficult to calibrate between laboratories and genotyping platforms. The Norwegian minke whale DNA register (NMDR) contains individual genetic profiles at ten microsatellite loci for 6737 individuals captured in the period 1997-2008. The genotyping has been conducted in four separate laboratories over nearly a decade, and offers a unique opportunity to examine genotyping errors and their consequences in an individual-based DNA register. We re-genotyped 240 samples and, for the first time, applied a mixed regression model to examine potentially confounding effects on genotyping errors.

Results: The average genotyping error rate for the whole dataset was 0.013 per locus and 0.008 per allele. Errors were, however, not evenly distributed. A decreasing trend across time was apparent, along with a strong within-sample correlation, suggesting that error rates depend heavily on sample quality. In addition, some loci were more error prone than others. False allele sizes accounted for 18 of the 31 observed errors; the remaining errors were ten false homozygotes (i.e., the true genotype was a heterozygote) and three false heterozygotes (i.e., the true genotype was a homozygote).
Conclusions: To our knowledge, this study represents the first investigation of genotyping error rates in a wildlife DNA register, and the first application of mixed models to examine the effects of multiple factors on genotyping quality. We demonstrate that DNA registers accumulating data over time can maintain calibration and genotyping consistency, despite analyses being conducted on different genotyping platforms and in different laboratories. Although errors were detected, they will have a minimal effect on the database's primary purpose, i.e., individual identification, provided that individual samples can be re-genotyped.

Keywords: Calibration, DNA register, genotyping error, microsatellite, minke whale, mixed logistic regression, wildlife

* Correspondence: 1 Department of Mathematics, University of Bergen, Johannes Brunsgate 12, 5008 Bergen, Norway. Full list of author information is available at the end of the article.

© 2011 Haaland et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

Microsatellites, also known as short tandem repeats (STRs), are repeating sequences of DNA in which the repeat motif is 1-6 bases long [1,2]. Variation in the number of repetitions within the sequence forms the basis of the alleles. Since their discovery in the 1980s, microsatellite DNA markers have been a prominent tool in ecological, medical and forensic genetics, owing in part to their high levels of variability, co-dominant inheritance, and abundance in most organisms [3-5]. Microsatellites are almost exclusively genotyped by amplifying the DNA sequence via the polymerase chain reaction; the amplified fragment is then subjected to electrophoresis and sized (i.e., length of repeat) in relation to known DNA fragments (i.e., the size standard). The migration of the microsatellite fragment relative to the DNA size standard is influenced by a range of factors and depends on the conditions under which the electrophoresis is performed [6].

In part due to the way in which microsatellites are genotyped, this class of markers is prone to genotyping errors [7], which occur when the observed genotype does not correspond to the real genotype [8]. Genotyping errors in microsatellites cannot be avoided completely, and have a range of origins including scoring mistakes, contaminated multiplex assays, biochemical anomalies, and degenerated DNA samples [9]. Error rates in the range 0.005-0.01 per locus have frequently been reported in the literature [9]. Furthermore, error rates as low as 0.002 per locus are non-negligible, and may lead to false conclusions about, for example, confidence in assigned paternities [10].

The implementation of DNA-based methods for the identification and management of wildlife resources represents a broad and rapidly growing field, and DNA registers are likely to become an increasingly important component of this development. For example, DNA registers may contain information about animal pedigrees in living gene banks for conservation of endangered species, and may be used to monitor trade in wildlife [11,12]. DNA registers may be built upon a multitude of approaches and genetic markers, for example, relying upon allele frequencies for population identification [13], exact genotype profiles for individual identification [12], or sequence recognition for species identification in DNA barcoding [14].
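To give a sense of scale for the per-locus error rates quoted above, the following minimal sketch computes the probability that a multi-locus profile contains at least one genotyping error. It rests on a simplifying assumption that is ours for illustration: errors at different loci are independent, so the profile-level rate is one minus the product of the per-locus probabilities of a correct genotype. The function name `profile_error_rate` and the choice of ten loci (the number of loci in the NMDR) are illustrative and not taken from the authors' analysis.

```python
from math import prod


def profile_error_rate(per_locus_rates):
    """Probability of at least one error in a profile.

    Assumes (for illustration only) that errors at different loci
    are independent of each other.
    """
    for p in per_locus_rates:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"error rate must lie in [0, 1], got {p}")
    # P(no error at any locus) is the product of (1 - p_l) over loci.
    return 1.0 - prod(1.0 - p for p in per_locus_rates)


if __name__ == "__main__":
    n_loci = 10  # hypothetical profile size, matching the ten NMDR loci
    for rate in (0.002, 0.005, 0.01):
        result = profile_error_rate([rate] * n_loci)
        print(f"per-locus rate {rate}: profile-level rate {result:.4f}")
```

Under this independence assumption, even small per-locus rates compound noticeably across a ten-locus profile, which is why the profile-level rate is the relevant quantity when matching against a register.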
Irrespective of primary purpose, a common feature of all DNA registers is that they accumulate data over time. This generates special challenges for data acquisition and quality, not least because developments in genotyping platforms and technology over time may cause calibration and continuity issues. Despite having similar genotyping equipment, different laboratories may still produce deviating allelic values for microsatellites at the same locus [6,15-17].

DNA registers should be annotated with estimates of genotyping error rates from blinded experiments. For the purpose of matching profiles against a DNA register, it is the across-profile error rate that is of importance, not the per-locus rates. If loci can be assumed independent, the former is given as

$$p_{\text{profile}} = 1 - \prod_{l=1}^{L} (1 - p_l), \qquad (1)$$

when there are $L$ loci and $p_l$ is the error rate at locus $l$. However, when loci are positively correlated, i.e., when an error at one locus increases the error rate at other loci, the profile-wise error will exceed the value given by (1). We propose to account for this using a mixed regression model.

The Norwegian minke whale (Balaenoptera acutorostrata) DNA register (NMDR) consists of individual DNA pro (...truncated)