Single mitochondrial gene barcodes reliably identify sister-species in diverse clades of birds
BMC Evolutionary Biology
Erika S Tavares 1
Allan J Baker 0 1
0 Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Canada
1 Department of Natural History, Royal Ontario Museum, 100 Queen's Park, Toronto, Canada
Background: DNA barcoding of life using a standardized COI sequence was proposed as a species identification system, and as a method for detecting putative new species. Previous tests in birds showed that individuals can be correctly assigned to species in ~94% of the cases and suggested a threshold of 10× mean intraspecific difference to detect potential new species. However, these tests were criticized because they were based on a single maternally inherited gene rather than multiple nuclear genes, did not compare phylogenetically identified sister species, and thus likely overestimated the efficacy of DNA barcodes in identifying species.
Results: To test the efficacy of DNA barcodes we compared ~650 bp of COI in 60 sister-species pairs identified in multigene phylogenies from 10 orders of birds. In all pairs, individuals of each species were monophyletic in a neighbor-joining (NJ) tree, and each species possessed fixed mutational differences distinguishing it from its sister species. Consequently, individuals were correctly assigned to species using a statistical coalescent framework. A coalescent test of taxonomic distinctiveness based on chance occurrence of reciprocal monophyly in two lineages was verified in known sister species, and used to identify recently separated lineages that represent putative species. This approach avoids the use of a universal distance cutoff, which is invalidated by variation in times to common ancestry of sister species and in rates of evolution.
Conclusion: Closely related sister species of birds can be identified reliably by barcodes of fixed diagnostic substitutions in COI sequences, verifying coalescent-based statistical tests of reciprocal monophyly for taxonomic distinctiveness. Contrary to recent criticisms, a single DNA barcode is a rapid way to discover monophyletic lineages within a metapopulation that might represent undiscovered cryptic species, as envisaged in the unified species concept. This identifies a smaller set of lineages that can also be tested independently for species status with multiple nuclear gene approaches and other phenotypic characters.
Background
Large-scale sequencing of a predefined region of
approximately 650 base pairs (bp) of the mitochondrial gene
COI, known as DNA barcoding, has two main goals: 1) to
develop a species identification system that allows
unknown individuals to be assigned to species; and 2) to
enhance the discovery of new species [1-3]. Although
DNA barcoding has proved effective in achieving both
goals in several large groups of animals [4-11], the efficacy
of the tests has been questioned [12-16].
A major test performed on 643 previously recognized
species of birds of North America demonstrated the
effectiveness of DNA barcoding because 94% possessed unique
monophyletic COI clusters [10,11]. The remaining 6% of
the species did not have unique DNA barcodes, indicating
that they were either (a) wrongly identified in the past as
separate species, (b) closely related species that hybridize
regularly, or (c) species losing identity by secondary
contact [11]. These groups may be in the indeterminate zone
between differentiated populations and distinct species
[10,11]. Critics of DNA barcoding claim that in spite of
the impressive number of bird species sampled [11], the
precision of the method was compromised by
insufficient intraspecific sampling, and because comparisons
among species were not exclusively from sister-species
pairs [12,15,17], where taxonomic uncertainty,
interspecific hybridization, and incomplete lineage sorting could
decrease the effectiveness of the test [12]. The suggested
threshold of 10 times the mean intraspecific variation (10×
rule) to screen for splits referred to as 'putative' species
[11] has also been criticized. Moritz and Cicero [12]
reported significantly lower average mitochondrial DNA
distances between sister species of birds than levels
reported in the barcoding tests of birds [10,11], although
the distances from these sister-species comparisons came
from a variety of methods and genes [7]. Meyer and
Paulay [13] tested different threshold methods in COI
barcodes of cowries and found extensive overlap of
overall intraspecific distances with interspecific distances,
resulting in minimum error rates of ~17% when screening for
putative new species. Additionally, a simulation study
using the neutral coalescent and the
Bateson-Dobzhansky-Muller (BDM) model of speciation
suggested that mtDNA barcodes will have error rates lower
than 10% in assigning individuals to species only when
populations have been isolated for more than 4 million
generations [15]. A universal distance cutoff is therefore
not an objective criterion to delineate species limits [18].
Hickerson et al. [15] also argued that reciprocal
monophyly of mtDNA sequences and the 10× threshold
will likely underestimate species diversity. Tree-based
approaches with genetic distances that use reciprocal
monophyly for species delimitation can be problematic
because aggregations of haplotypes in phylogenetic trees,
even when highly supported, do not necessarily imply
that they belong to a distinctive taxonomic unit [19]. To
address these issues, Rosenberg [19] proposed a statistical
test of whether monophyletic groups in a phylogenetic tree
are more likely to represent distinctive taxonomic
entities, or are just random branches of lineages within a
species. This approach also suggests minimal sample sizes
required for inferences to be made about taxonomic
distinctiveness from observations of monophyly [19].
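The reasoning behind a chance-monophyly test of this kind can be illustrated with a short calculation. The sketch below is not taken from [19]. It assumes a null model in which all sampled lineages belong to a single group and pairs of lineages join at random, and it computes the probability that a samples labelled A and b samples labelled B nevertheless form two reciprocally monophyletic clades. The function names are hypothetical, and the closed form is offered only as an expression that agrees with the recursion for the sample sizes checked here.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def chance_reciprocal_monophyly(a: int, b: int) -> Fraction:
    """Probability that a A-lineages and b B-lineages are reciprocally
    monophyletic when lineages are joined in random pairs (assumed null model).

    Going back in time, every joining event must occur within A or within B
    until exactly one A-lineage and one B-lineage remain.
    """
    if a < 1 or b < 1:
        raise ValueError("each group needs at least one sampled lineage")
    if a == 1 and b == 1:
        return Fraction(1)
    total_pairs = comb(a + b, 2)
    prob = Fraction(0)
    if a > 1:  # next join is between two A-lineages
        prob += Fraction(comb(a, 2), total_pairs) * chance_reciprocal_monophyly(a - 1, b)
    if b > 1:  # next join is between two B-lineages
        prob += Fraction(comb(b, 2), total_pairs) * chance_reciprocal_monophyly(a, b - 1)
    return prob


def closed_form(a: int, b: int) -> Fraction:
    """Closed-form expression that matches the recursion above (checked for
    small samples only; treat as an illustration, not a citation of [19])."""
    return Fraction(2, (a + b - 1) * comb(a + b, a))


if __name__ == "__main__":
    for a, b in [(2, 2), (3, 3), (5, 5)]:
        print(a, b, chance_reciprocal_monophyly(a, b))
```

Under these assumptions, two samples per group are reciprocally monophyletic by chance with probability 1/9, whereas five samples per group give 1/1134. This illustrates why minimal sample sizes matter when monophyly is used as evidence of taxonomic distinctiveness.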
Some of the advantages of using a single mtDNA barcode
to identify species are that mtDNA has a higher rate of evolution
(and thus more mutations) than nuclear genes, and that matrilineal
lineages sort into reciprocally monophyletic clades much
faster than nuclear gene lineages [20]. This reduces the incidence
of incompletely sorted lineages relative to that expected
with nuclear genes. However, recent simulations with
multiple nuclear genes indicate that very recently derived
species can be identified well before the time to reciprocal
monophyly [21]. Additionally, species were correctly
delimited in <50% of replicates simulating mtDNA
sequences, suggesting that the single gene barcode
approach was insufficient to delimit recently diverged
species.
In response to the above criticisms we initiated a more
comprehensive study of 60 sister-species pairs of birds
defined rigorously with multigene phylogenies to
determine whether mtDNA barcodes can reliably distinguish
closely related sister species. Instead of the much criticized
10× rule, which may not apply in rec (...truncated)