Does the DNA barcoding gap exist? – a case study in blue butterflies (Lepidoptera: Lycaenidae)


Frontiers in Zoology

Martin Wiemers* and Konrad Fiedler

Address: Department of Population Ecology, Faculty of Life Sciences, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria

Background: DNA barcoding, i.e. the use of a 648 bp section of the mitochondrial gene cytochrome c oxidase I, has recently been promoted as useful for the rapid identification and discovery of species. Its success is dependent either on the strength of the claim that interspecific variation exceeds intraspecific variation by one order of magnitude, thus establishing a "barcoding gap", or on the reciprocal monophyly of species.

Results: We present an analysis of intra- and interspecific variation in the butterfly family Lycaenidae which includes a well-sampled clade (genus Agrodiaetus) with a peculiar characteristic: most of its members are karyologically differentiated from each other, which facilitates the recognition of species as reproductively isolated units even in allopatric populations. The analysis shows that there is an 18% overlap in the range of intra- and interspecific COI sequence divergence due to low interspecific divergence between many closely related species. In a Neighbour-Joining tree profile approach which does not depend on a barcoding gap, but on comprehensive sampling of taxa and the reciprocal monophyly of species, at least 16% of specimens with conspecific sequences in the profile were misidentified. This is due to paraphyly or polyphyly of conspecific DNA sequences, probably caused by incomplete lineage sorting.

Conclusion: Our results indicate that the "barcoding gap" is an artifact of insufficient sampling across taxa. Although DNA barcodes can help to identify and distinguish species, we advocate using them in combination with other data, since otherwise there would be a high probability that sequences are misidentified.
Although high differences in DNA sequences can help to identify cryptic species, a high percentage of well-differentiated species has similar or even identical COI sequences and would be overlooked in an isolated DNA barcoding approach.

Background

Molecular tools have provided a plethora of new opportunities to study questions in evolutionary biology (e.g. speciation processes) and in phylogenetic systematics. Only recently, however, have claims been made that the sequencing of a small (648 bp) fragment at the 5' end of the gene cytochrome c oxidase subunit 1 (COI or cox1) from the mitochondrial genome would be sufficient in most Metazoa to identify them to the species level [1,2]. This approach, called "DNA barcoding", has gained momentum, and the "Consortium for the Bar Code of Life (CBOL)", founded in September 2004, intends to create a global biodiversity barcode database in order to facilitate automated species identifications. Right from the start, however, this approach received opposition, especially from the taxonomists' community [3-8]. Some arguments in this debate are political in nature; others have a scientific basis.

Concerning the latter, one of the most essential arguments focuses on the so-called "barcoding gap". Advocates of barcoding claim that interspecific genetic variation exceeds intraspecific variation to such an extent that a clear gap exists which enables the assignment of unidentified individuals to their species with a negligible error rate [1,9,10]. The errors are attributed to a small number of incipient species pairs with incomplete lineage sorting (e.g. [11]). As a consequence, establishing the degree of sequence divergence between two samples above a given threshold (proposed to be at least 10 times greater than within species [10]) would indicate specific distinctness, whereas divergence below such a threshold would indicate taxonomic identity among the samples.
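The threshold rule described above can be illustrated with a minimal Python sketch. It is not the method used in any of the cited studies: the function names (`p_distance`, `exceeds_threshold`) are hypothetical, the distance is a simple uncorrected proportion of differing sites rather than a model-corrected distance, and the default factor of 10 merely mirrors the "10 times" proposal cited in the text.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences.

    Assumption: both sequences come from the same alignment and therefore have
    equal length. Sites where either sequence has a gap or an ambiguity code
    (anything other than A, C, G, T) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def exceeds_threshold(divergence: float,
                      mean_intraspecific: float,
                      factor: float = 10.0) -> bool:
    """Return True if a pairwise divergence lies above the threshold.

    The threshold is `factor` times the mean intraspecific divergence; the
    default factor of 10 is taken from the proposal cited in the text, and
    the mean intraspecific value must be supplied by the caller.
    """
    return divergence > factor * mean_intraspecific


if __name__ == "__main__":
    # Hypothetical toy sequences, far shorter than a real 648 bp barcode.
    d = p_distance("ACGTACGTAC", "ACGTACGTAT")
    print(d, exceeds_threshold(d, mean_intraspecific=0.004))
```

Under this rule a pair of samples would be treated as different species only when `exceeds_threshold` returns `True`. The rule is only meaningful if intra- and interspecific divergences do not overlap.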
Furthermore, the existence of a barcoding gap would even enable the identification of previously undescribed species ([11-13] but see [14]).

Possible errors of this approach include false positives and false negatives. False positives occur if populations within one species are genetically quite distinct, e.g. in distant populations with limited gene flow or in allopatric populations with interrupted gene flow. In the latter case it must be noted that, depending on the amount of morphological differentiation and the species concept to be applied, such populations may also qualify as 'cryptic species' in the view of some scientists. False negatives, in contrast, occur when little or no sequence variation in the barcoding fragment is found between different biospecies (= reproductively isolated population groups sensu Mayr [15]). Hence, false negatives are more critical for the barcoding approach, because the existence of such cases would reveal examples where the barcoding approach is less powerful than the use of other and more holistic approaches to delimit species boundaries.

Initial studies on birds [10] and arthropods [9,16] appeared to corroborate the existence of a distinct barcoding gap, but two recent studies on gastropods [17] and flies [18] challenge its existence. The reasons for these discrepancies are not entirely clear. Although levels of COI sequence divergence differ between higher taxa (e.g. an exceptionally low mean COI sequence divergence of only 1.0% was found in congeneric species pairs of Cnidaria compared to 9.6-15.7% in other animal phyla [2]), Mollusca (with 11.1% mean sequence divergence between species) and Diptera (9.3%) are not peculiar in this respect. Meyer & Paulay [17] assume that insufficient sampling on both the interspecific and intraspecific level creates the artifact of a barcode gap. Proponents of barcoding might argue, however, that the main reason for this overlap is the poor taxonomy of these groups, e.g. cryptic species may have been overlooked which are differentiated genetically but very similar or even identical in morphology.

If the barcode gap does not exist, then the threshold approach in barcoding becomes inapplicable. Although more sophisticated techniques (e.g. using coalescence theory and statistical population genetic methods [19-21]) can sometimes help to delimit species with overlapping genetic divergences, these approaches require additional assumptions (e.g. about the choice of population genetic models or clustering algorithms) and are only feasible in well-sampled clades.

Barcoding holds promise nonetheless, especially in the identification of arthropods, the most species-rich animal phylum in terrestrial ecosystems. Identification of arthropods is often extremely time-consuming and generally requires taxonomic specialists for any given group. Moreover, the fraction of undescribed species is particularly high, as opposed to vertebrates. Hence, there is substantial demand for improved (and rapid) identification tools by scientists who seek identification of large arthropod samples from complex (...truncated)