Chromosome 1 Sequence Analysis of C57BL/6J-Chr1KM Mouse Strain
Hindawi
International Journal of Genomics
Volume 2017, Article ID 1712530, 9 pages
https://doi.org/10.1155/2017/1712530
Research Article
Fuyi Xu,1 Tianzhu Chao,1 Yiyin Zhang,1 Shixian Hu,1 Yuxun Zhou,1 Hongyan Xu,2
Junhua Xiao,1 and Kai Li1
1 College of Chemistry, Chemical Engineering, and Biotechnology, Donghua University, Shanghai, China
2 Department of Biostatistics and Epidemiology, Medical College of Georgia, Augusta University, Augusta, GA, USA
Correspondence should be addressed to Junhua Xiao and Kai Li.
Received 15 December 2016; Revised 9 February 2017; Accepted 15 February 2017; Published 9 April 2017
Academic Editor: Leng Han
Copyright © 2017 Fuyi Xu et al. This is an open access article distributed under the Creative Commons Attribution License, which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The Chinese Kunming (KM) mouse is a widely used outbred mouse stock in China. However, its genetic structure remains unclear.
In this study, we sequenced the genome of the C57BL/6J-Chr1KM (B6-Chr1KM) strain, the chromosome 1 (Chr 1) of which was
derived from one KM mouse. With 36.6× average coverage of the entire genome, 0.48 million single nucleotide polymorphisms
(SNPs) and 96,679 indels were detected on Chr 1 through comparison with reference strain C57BL/6J. Moreover, 46,590 of
them were classified as novel mutations. Further functional annotation identified 155 genes harboring potentially functional
variants, among which 27 genes have been associated with human diseases. We then performed sequence similarity and Bayesian
concordance analysis using the SNPs identified on Chr 1 and their counterparts in three subspecies, Mus musculus domesticus,
M. m. musculus, and M. m. castaneus. Both analyses suggested that the Chr 1 sequence of B6-Chr1KM was predominantly
derived from M. m. domesticus while 9.7% of the sequence was found to be from M. m. musculus. In conclusion, our analysis
provided a detailed description of the genetic variations on Chr 1 of B6-Chr1KM and a new perspective on the subspecies origin
of the KM mouse, which can be used to guide further genetic studies with this mouse strain.
1. Introduction
The Chinese Kunming (KM) mouse colony, the largest
outbred mouse stock maintained by commercial dealers
nationwide in China, has been widely used in pharmaceutical
and genetic studies [1]. Unlike other outbred mice, the KM
mouse has a complex evolutionary history. In 1944, during
World War II, Swiss mice were initially introduced into
Kunming, Yunnan Province, China, from the Indian Haffkine
Institute by Professor Feifan Tang via the Hump route with the
help of the American Volunteer Group [2]. These mice were
named KM mice after their initial location in China. Because
most other mouse strains were lost and mouse facilities
were damaged during World War II, the KM mouse became
the only laboratory mouse available afterwards. KM mice were
gradually distributed throughout most of the country
for medical studies. However, despite the importance
of this outbred mouse, its underlying genetic structure
remains unclear.
According to the Mouse Genome Informatics database (http://www.informatics.jax.org/),
over one thousand quantitative
trait loci (QTLs) have been mapped on mouse chromosome
1 (hereafter referred to as Chr 1), including a large number of
QTLs related to metabolic disorders. However, very few
candidate genes have been identified, partly because of
the large QTL intervals. To fine map the metabolic
disorder QTLs on Chr 1 and identify the candidate genes, we
established a population of Chr 1 substitution mouse
strains, in which C57BL/6J (B6) was the host strain, and
one KM mouse, five inbred strains, and twenty-four wild
mice captured from various locations in China were
selected as the Chr 1 donors [3]. To dissect the
genetic structure and variations of this population and better
serve further genetic studies, we have resequenced 18 strains
of this population including C57BL/6J-Chr1KM (B6-Chr1KM)
with next-generation sequencing technology [4].
In this study, we analyzed the genome sequence data
from B6-Chr1KM strain and identified 0.48 million single
nucleotide polymorphisms (SNPs) and 96,679 indels on
Chr 1, of which 6.4% of the SNPs and 16.3% of the indels were considered
to be novel. Functional annotation suggested that 474 variants had deleterious effects on gene function. In addition,
we explored the KM mouse genetic structure by performing
sequence similarity and Bayesian concordance analysis
(BCA) on Chr 1. The results suggested that the KM mouse
originated predominantly from Mus musculus domesticus
and that part of the sequence was from M. m. musculus.
2. Materials and Methods
2.1. Animals. B6 and KM mice were purchased from Shanghai
SLAC Laboratory Animal Co., Ltd., China. One male KM
mouse was mated with a female B6 mouse to produce F1 hybrids,
followed by 8 generations of backcrossing with B6 using
marker-assisted selection and then brother × sister mating to
create the B6-Chr1KM Chr 1 substitution strain [3]. All mice
were maintained under specific pathogen-free (SPF)
conditions according to the People’s Republic of China
Laboratory Animal Regulations, and the study was conducted in accordance with the recommendations of and
was approved by the Laboratory Animal Committee of
Donghua University.
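As a rough illustration of why repeated backcrossing yields a B6 background, the expected donor (KM) share of the genome at loci unlinked to Chr 1 halves with each backcross generation. The sketch below is a textbook approximation, not a calculation from this study: it ignores linkage drag and any additional effect of marker-assisted selection, and the function name is hypothetical.

```python
def expected_donor_fraction(backcross_generations: int) -> float:
    """Expected donor-genome fraction at loci unlinked to the selected region.

    Assumption (not from the paper): simple Mendelian halving per generation,
    starting from an F1 that carries 50% donor genome.
    """
    if backcross_generations < 0:
        raise ValueError("backcross_generations must be non-negative")
    # F1 (0 backcrosses) is 1/2 donor; each backcross to the host halves it.
    return 0.5 ** (backcross_generations + 1)


if __name__ == "__main__":
    for n in (0, 4, 8):
        frac = expected_donor_fraction(n)
        print(f"after {n} backcrosses: {frac:.4%} expected donor background")
```

Under these simplifying assumptions, 8 backcross generations leave roughly 0.2% donor background outside the selected chromosome, before the final brother × sister matings fix the genotype.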
2.2. DNA Sequencing. B6-Chr1KM genomic DNA was extracted
from tail tissue of a male mouse using an AxyPrep™ Multisource Genomic DNA Miniprep Kit (Axygen, Hangzhou,
China) according to the manufacturer’s protocol.
Purified genomic DNA was sheared and size selected
(300–500 bp). Paired-end sequencing (2 × 125 bp) was carried
out with an Illumina HiSeq 2500 instrument (Illumina
Inc., San Diego, CA, USA) on two lanes by WuXi AppTec
(Shanghai, China) according to the manufacturer’s protocol.
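The relationship between paired-end read yield and average coverage can be sketched as follows. This is an idealized estimate only: the genome size of 2.7 Gb is an assumed round figure for the mouse reference, not a value stated in this study, and the calculation ignores reads lost to quality filtering, duplication, or failure to map.

```python
import math

# Assumed approximate mouse genome size in bases (illustrative, not from the paper).
ASSUMED_GENOME_SIZE = 2_700_000_000


def mean_coverage(read_pairs: int, read_length: int, genome_size: int) -> float:
    """Idealized mean depth: total sequenced bases divided by genome size."""
    return 2 * read_pairs * read_length / genome_size


def read_pairs_for_coverage(target: float, read_length: int, genome_size: int) -> int:
    """Smallest number of read pairs whose raw yield reaches the target depth."""
    return math.ceil(target * genome_size / (2 * read_length))


if __name__ == "__main__":
    # 2 x 125 bp reads and 36.6x coverage are the values reported in the text.
    pairs = read_pairs_for_coverage(36.6, 125, ASSUMED_GENOME_SIZE)
    depth = mean_coverage(pairs, 125, ASSUMED_GENOME_SIZE)
    print(f"about {pairs / 1e6:.0f} million read pairs give {depth:.1f}x")
```

With these assumptions, a depth of 36.6× from 2 × 125 bp reads corresponds to a raw yield on the order of 400 million read pairs.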
2.3. Read Alignment. Raw reads were filtered using NGS QC
toolkit v2.3 [5] to remove reads in which more than 30% of
bases were of low quality (Phred score below 20). Filtered reads were aligned to the
C57BL/6J reference genome (December 2011 release of the
mouse reference genome (mm10) from Ensembl) using
BWA (version 0.7.10-r789) with 12 threads [6]. The resulting
SAM file was converted to a binary format and sorted with
SAMtools v1.1 [7], followed by the marking of duplicate reads
using picard-tools v1.119 (http://picard.sourceforge.net). To
improve SNP and indel calling, indel realignment was conducted with the Genome Analysis Toolkit (GATK v3.3) [8].
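The read-filtering rule at the start of this subsection can be sketched in a few lines. This is a minimal illustration of the stated criterion (discard a read when more than 30% of its bases fall below Q20), not the NGS QC Toolkit's own code. It assumes Phred+33 quality encoding and treats each read independently, whereas a paired-end filter would typically drop both mates together. All function names are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# A FASTQ record reduced to (header, sequence, quality string).
FastqRecord = Tuple[str, str, str]


def low_quality_fraction(quality: str, min_q: int = 20, offset: int = 33) -> float:
    """Fraction of bases whose Phred score is below min_q (Phred+33 assumed)."""
    if not quality:
        return 1.0  # treat an empty read as entirely low quality
    low = sum(1 for ch in quality if ord(ch) - offset < min_q)
    return low / len(quality)


def filter_reads(
    records: Iterable[FastqRecord], max_low_fraction: float = 0.30
) -> Iterator[FastqRecord]:
    """Yield only reads with at most max_low_fraction low-quality bases."""
    for header, seq, qual in records:
        if low_quality_fraction(qual) <= max_low_fraction:
            yield header, seq, qual


if __name__ == "__main__":
    reads = [
        ("@r1", "ACGTACGTAC", "IIIIIII###"),  # 3 of 10 bases below Q20: kept
        ("@r2", "ACGTACGTAC", "IIIIII####"),  # 4 of 10 bases below Q20: removed
    ]
    for header, _, _ in filter_reads(reads):
        print(header)
```

In Phred+33 encoding, `I` corresponds to Q40 and `#` to Q2, so the first example read sits exactly at the 30% threshold and is retained.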
2.4. SNP/Indel Identification and Annotation. SNPs and
indels were called using SAMtools mpileup and BCFtools call
functions [7], with the '-uf' and '-cv' parameters, respectively.
To identify a high-quality variant data set, variants were
filtered using the BCFtools filter and VCFtools varFilter
function [9]. The following parameters were used: for
BCFtools filter, '-g 10 -G 3 -i 'QUAL>10 && MIN(MQ)>25
&& MIN(DP)>6 && MAX(DP)<199 && (DP4[2]+DP4[3])
> 2 (...truncated)