Fine-Scale Phylogenetic Discordance across the House Mouse Genome
Citation: White MA, Ane C, Dewey CN, Larget BR, Payseur BA (
Michael A. White
Cécile Ané
Colin N. Dewey
Bret R. Larget
Bret A. Payseur
Mikkel H. Schierup, University of Aarhus, Denmark
1 Laboratory of Genetics, University of Wisconsin, Madison, Wisconsin, United States of America, 2 Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America, 3 Department of Botany, University of Wisconsin, Madison, Wisconsin, United States of America, 4 Department of Biostatistics, University of Wisconsin, Madison, Wisconsin, United States of America, 5 Department of Medical Informatics, University of Wisconsin, Madison, Wisconsin, United States of America, 6 Department of Computer Sciences, University of Wisconsin, Madison, Wisconsin, United States of America
Population genetic theory predicts discordance in the true phylogeny of different genomic regions when studying recently diverged species. Despite this expectation, genome-wide discordance in young species groups has rarely been statistically quantified. The house mouse subspecies group provides a model system for examining phylogenetic discordance. House mouse subspecies are recently derived, suggesting that even if there has been a simple tree-like population history, gene trees could disagree with the population history due to incomplete lineage sorting. Subspecies of house mice also hybridize in nature, raising the possibility that recent introgression might lead to additional phylogenetic discordance. Single-locus approaches have revealed support for conflicting topologies, resulting in a subspecies tree often summarized as a polytomy. To analyze phylogenetic histories on a genomic scale, we applied a recently developed method, Bayesian concordance analysis, to dense SNP data from three closely related subspecies of house mice: Mus musculus musculus, M. m. castaneus, and M. m. domesticus. We documented substantial variation in phylogenetic history across the genome. Although each of the three possible topologies was strongly supported by a large number of loci, there was statistical evidence for a primary phylogenetic history in which M. m. musculus and M. m. castaneus are sister subspecies. These results underscore the importance of measuring phylogenetic discordance in other recently diverged groups using methods such as Bayesian concordance analysis, which are designed for this purpose.
Funding: MAW was supported by an NLM graduate training grant (NLM 2T15LM007359) to the Computation and Informatics in Biology and Medicine Training
Program at the University of Wisconsin. The research was supported by an NSF grant (DEB 0918000) to BAP. The funders had no role in study design, data
collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
With the advent of new sequencing technologies, the
reconstruction of phylogenetic histories on the genomic scale has
become feasible. Genomic data offer the potential to resolve
phylogenies that have been difficult to reconstruct from a small
number of genes [1–5]. Although highly resolved phylogenies can
sometimes be recovered when data sets are concatenated, such
total evidence trees may depart from the history of population
branching, the species history [6,7]. The measurement and
incorporation of gene genealogical discordance into genomic
analyses is expected to improve inferences about species history,
particularly for recently derived groups [8].
Topological discordance among gene trees is expected under
several scenarios [9]. Population subdivision and asymmetric gene
flow among ancestral populations [10], as well as introgression
between diverged populations, can generate widespread
discordance. Ancestral polymorphisms can also segregate, causing some
gene trees to disagree with the population tree. The effects of this
incomplete lineage sorting are greatest when effective population
sizes are high and internodes of the population tree are of short
duration [11–16]. Consistent with these predictions, substantial
phylogenetic discordance has been documented on the genomic
scale in a few young species groups. Pollard et al. [17]
demonstrated significant variation among 9,405 genes in Drosophila
erecta, D. melanogaster, and D. yakuba. In addition, genomic
discordance has been repeatedly observed in analyses of humans,
chimpanzees, and gorillas, with a majority of gene trees supporting
a human/chimpanzee sister relationship [18–25]. Although it is
well established that closely related lineages will exhibit substantial
genealogical discordance, few studies have quantified phylogenetic
discordance across entire genomes (including non-coding regions).
Consequently, the extent of variation on this scale remains poorly
understood.
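The dependence of incomplete lineage sorting on effective population size and internode duration described above can be made concrete with the standard three-taxon result from coalescent theory. If the internode lasts t generations in a diploid population of effective size N_e, the gene tree matches the population tree with probability 1 - (2/3)exp(-t/(2N_e)), and each of the two discordant topologies has probability (1/3)exp(-t/(2N_e)). The Python sketch below evaluates these probabilities. It assumes a single panmictic ancestral population of constant size with no gene flow; the function name and the internode durations in the loop are hypothetical, chosen only for illustration, and are not estimates from this study.

```python
import math


def topology_probabilities(internode_generations, effective_size):
    """Three-taxon gene-tree probabilities under incomplete lineage sorting.

    Returns (concordant, discordant_each): the probability that a gene tree
    matches the population tree, and the probability of each of the two
    discordant topologies.
    """
    # Internode length in coalescent units (2 * N_e generations for diploids).
    t = internode_generations / (2.0 * effective_size)
    # Probability that the two sister lineages do NOT coalesce in the internode.
    p_no_coalescence = math.exp(-t)
    # Without coalescence, the three topologies are equally likely.
    discordant_each = p_no_coalescence / 3.0
    concordant = 1.0 - 2.0 * discordant_each
    return concordant, discordant_each


# Illustrative internode durations (hypothetical) with N_e of about 10^5.
for generations in (10_000, 100_000, 1_000_000):
    concordant, discordant_each = topology_probabilities(generations, 1e5)
    print(f"internode = {generations:>9,} generations: "
          f"P(concordant) = {concordant:.3f}, "
          f"P(each discordant) = {discordant_each:.3f}")
```

As the internode shrinks relative to 2N_e, the three topologies approach equal frequencies of one third each, which is why short internodes combined with large populations are expected to yield many loci supporting each alternative tree.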
The house mouse subspecies group (Mus musculus musculus, M. m.
castaneus, and M. m. domesticus) provides an excellent system for
exploring genome-wide patterns of phylogenetic discordance
because (i) sources of potential discordance (incomplete lineage
sorting and introgression) exist and (ii) almost complete genome
sequences are available. The earliest divergences in the house
mouse subspecies group occurred only 500,000 generations ago
(assuming 1 generation per year) [26–30] and house mice are
estimated to have large effective population sizes (approximately
10^5) [30,31], suggesting an important role for incomplete lineage
sorting. In addition, the extent of interspecific gene flow varies
across the genome and among the three subspecies [30,32–38].
The phylogenetic history of individual genes can differ
strongly from the species history if taxa are recently
derived, making inferences of a species history from only a
handful of genes especially difficult in these cases.
Genome-scale data sets now allow phylogenetic histories
to be reconstructed from a large number of genes.
Although data sets of this size are becoming more
common, few studies have characterized variation in
phylogenetic history across whole genomes. We
summarize fine-scale variation in phylogenetic history across the
genome of house mice, a recently derived group of
subspecies, using a method that combines phylogenetic
uncertainty among gene trees. We document substantial
variation in phylogenetic history among 14,081 loci and
describe a primary history in the face of this variation.
These results support the use of genome-scale datasets
and methods that accommodate phylogenetic
discordance in attempts to reconstruct the history of closely
related groups.
Two of the subspecies (M. m. domesticus and M. m. musculus) meet in
a stable hybrid zone, in which dramatic variation in introgression
among genomic regions has been documented [34,36,38,39]. The
other two subspecies pairs (M. m. castaneus/M. m. domesticus and M.
m. musculus/M. m. castaneus) also exchange genes in (...truncated)