MouseIndelDB: a database integrating genomic indel polymorphisms that distinguish mouse strains

Nucleic Acids Research, Jan 2010

MouseIndelDB is an integrated database resource containing thousands of previously unreported mouse genomic indel (insertion and deletion) polymorphisms ranging from ∼100 nt to 10 Kb in size. The database currently includes polymorphisms identified from our alignment of 26 million whole-genome shotgun sequence traces from four laboratory mouse strains mapped against the reference C57BL/6J genome using GMAP. They can be queried on a local level by chromosomal coordinates, nearby gene names or other genomic feature identifiers, or in bulk format using categories including mouse strain(s), class of polymorphism(s) and chromosome number. The results of such queries are presented either as a custom track on the UCSC mouse genome browser or in tabular format. We anticipate that the MouseIndelDB database will be widely useful for research in mammalian genetics, genomics, and evolutionary biology. Access to the MouseIndelDB database is freely available at: http://variation.osu.edu/.
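The two query modes and the custom-track output described above can be illustrated with a minimal sketch. It does not reflect the actual MouseIndelDB schema or web interface: the `Indel` record, its field names, the strain labels and the functions `query_local`, `query_bulk` and `to_bed_track` are all hypothetical, chosen only for illustration. The output uses the standard BED layout (0-based, half-open coordinates) that the UCSC genome browser accepts for custom tracks.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Indel:
    """Hypothetical record; not the real MouseIndelDB schema."""
    chrom: str   # e.g. "chr1"
    start: int   # 0-based, inclusive (BED convention)
    end: int     # 0-based, exclusive; equals start for an insertion site
    strain: str  # placeholder strain label
    kind: str    # "insertion" or "deletion" relative to the reference


def query_local(records: Iterable[Indel], chrom: str,
                start: int, end: int) -> List[Indel]:
    """Local query: indels overlapping the half-open interval [start, end)."""
    hits = []
    for r in records:
        # Treat zero-length insertion sites as 1-bp features so they can overlap.
        r_end = max(r.end, r.start + 1)
        if r.chrom == chrom and r.start < end and r_end > start:
            hits.append(r)
    return hits


def query_bulk(records: Iterable[Indel], strain: Optional[str] = None,
               kind: Optional[str] = None,
               chrom: Optional[str] = None) -> List[Indel]:
    """Bulk query: filter by any combination of strain, class and chromosome."""
    return [r for r in records
            if (strain is None or r.strain == strain)
            and (kind is None or r.kind == kind)
            and (chrom is None or r.chrom == chrom)]


def to_bed_track(records: Iterable[Indel], name: str = "indels_demo") -> str:
    """Render records as BED text with a UCSC custom-track header line."""
    lines = [f'track name="{name}" description="Illustrative indel track"']
    # Lexical sort by chromosome name, then by start position.
    for r in sorted(records, key=lambda r: (r.chrom, r.start)):
        lines.append(f"{r.chrom}\t{r.start}\t{r.end}\t{r.strain}_{r.kind}")
    return "\n".join(lines)
```

Keeping the filters as plain functions over an in-memory list keeps the sketch self-contained; a real deployment would presumably push the same predicates into a database query.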

Nucleic Acids Research, 2010, Vol. 38, Database issue, D600–D606
doi:10.1093/nar/gkp1046
Published online 20 November 2009

Keiko Akagi1,2, Robert M. Stephens3, Jingfeng Li2,4, Evgenji Evdokimov5, Michael R. Kuehn5, Natalia Volfovsky3 and David E. Symer2,4,6,7,8,9,*

1Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, MD 21702, 2Department of Molecular Virology, Immunology and Medical Genetics, The Ohio State University Comprehensive Cancer Center, Columbus, OH 43210, 3Advanced Biomedical Computing Center, Information Systems Program, SAIC-Frederick, Inc., NCI-Frederick, Frederick, MD 21702, 4Basic Research Laboratory, 5Laboratory of Protein Dynamics and Signaling, 6Laboratory of Biochemistry and Molecular Biology, Center for Cancer Research, National Cancer Institute, Frederick, MD 21702, 7Human Cancer Genetics Program, 8Departments of Internal Medicine and 9Biomedical Informatics, The Ohio State University Comprehensive Cancer Center, Columbus, OH 43210, USA

Received August 5, 2009; Revised October 23, 2009; Accepted October 27, 2009

INTRODUCTION

An ultimate goal of genetics research is to link phenotypic differences with different genomic variants, and vice versa. Hundreds of distinct mouse strains are characterized by wide-ranging functional differences. This extensive phenotypic variation has helped to make the mouse a premier model organism, mimicking many aspects of human diversity and diseases. Understanding the genomic differences that distinguish different mouse strains and species will improve the usefulness of different mouse lineages as model organisms, facilitate further evolutionary analysis of ancestral relationships for mouse species and strains and shed new light on the genetic basis for variation among human individuals and in human diseases (1,2).

Recently, much attention has been given to the types of variation that exist within or between mammalian species (3–5), particularly short variations such as single nucleotide polymorphisms (SNPs) (6,7). Identification and analysis of such variants have been accomplished by many groups, as exemplified by the HapMap project compiling human data (8). These studies have helped to facilitate the recent discovery of genes associated with certain diseases by genome-wide linkage analyses.

In addition to SNPs, insertion/deletion (indel) polymorphisms are another important form of variation (9–15). Indels are composed of blocks of nucleotides that are present in one individual, strain or lineage, but absent at the orthologous locus in another. In addition to being useful in genotyping studies, indel polymorphisms can have direct functional consequences.
As they are longer than SNPs, and may introduce or alter promoters, terminators, alternative splice sites and/or other determinants of transcriptional variation (16–19), indel polymorphisms could contribute significantly both to differences in gene structure and expression, and to various disease processes. In addition to indel polymorphisms, other important forms of structural variation, including copy number variants and polymorphic segmental duplications, also have been studied extensively (5,20–22).

A rich potential source of information about genomic variation exists in unassembled, conventional whole-genome shotgun (WGS) sequence traces obtained from different individuals within or between species. Recently, such traces have been used to identify human SNPs (23,24), simple tandem repeat (STR) polymorphisms and short indel polymorphisms (10,11,25), as tools to identify such polymorphisms from sequence traces have been developed (26).

*To whom correspondence should be addressed. Tel: +1 614 292 0885; Fax: +1 614 292 6108.
Correspondence may also be addressed to Robert M. Stephens. Tel: +1 301 846 5787; Fax: +1 301 846 5762.
Present address: Evgenji Evdokimov, Food and Drug Administration, Department of Health and Human Services, Bethesda, MD, USA.
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
© The Author(s) 2009. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
To identify intermediate-length (101–10 000 nt) indels distinguishing between mouse lineages, we recently aligned 26 million WGS traces from four mouse strains with unassembled genomes to the C57BL/6J reference genome assembly (19,27). Most mouse indels in this intermediate length range are made up of repetitive elements. An overwhelming majority of such polymorphisms appear to have resulted from endogenous retrotransposon integration events (19), a pattern clearly distinct from that of human indels (12,25,28,29).

There are now several genome browsers and databases available that provide data on SNPs, STRs and other forms of variation (23,24,30–33). These browsers are mostly focused on human variants, although browsers for other species, including mouse, have been developed (34). Other databases tabulate forms of structural variation that distinguish human individuals or populations, including polymorphic transposon integrants and other indels in humans, but in some cases lack contextual information about neighboring genomic features (25,35,36). By contrast, MouseIndelDB is an integrated searchable database that presents high-resolution information about indel polymorphisms that distinguish inbred mouse strains. Through their presentation as a custom track on the UCSC mouse genome browser, these mouse indel data now can be visualized easily in the context of many other important and regular (...truncated)
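As a rough illustration of how an intermediate-length indel might be inferred from a trace aligned to the reference, the sketch below scans consecutive aligned blocks of one trace and reports gaps within the 101–10 000 nt range stated above. This is not the authors' pipeline and does not use GMAP output directly: the `Block` tuple, the `slack` tolerance and the `candidate_indels` function are assumptions made purely for illustration.

```python
from typing import List, NamedTuple, Tuple

# Intermediate length range taken from the text (101-10 000 nt).
MIN_LEN, MAX_LEN = 101, 10_000


class Block(NamedTuple):
    """One ungapped aligned segment of a trace (hypothetical layout)."""
    ref_start: int   # 0-based start on the reference
    ref_end: int     # exclusive end on the reference
    read_start: int  # 0-based start on the trace
    read_end: int    # exclusive end on the trace


def candidate_indels(blocks: List[Block],
                     slack: int = 10) -> List[Tuple[str, int, int]]:
    """Return (kind, reference_position, length) for each qualifying gap.

    A gap on the reference with (nearly) contiguous trace coordinates is
    reported as a deletion in the trace's strain relative to the reference;
    a gap on the trace with (nearly) contiguous reference coordinates is
    reported as an insertion. `slack` is an assumed tolerance in nt.
    """
    calls = []
    ordered = sorted(blocks, key=lambda b: b.read_start)
    for left, right in zip(ordered, ordered[1:]):
        ref_gap = right.ref_start - left.ref_end
        read_gap = right.read_start - left.read_end
        if abs(read_gap) <= slack and MIN_LEN <= ref_gap <= MAX_LEN:
            calls.append(("deletion", left.ref_end, ref_gap))
        elif abs(ref_gap) <= slack and MIN_LEN <= read_gap <= MAX_LEN:
            calls.append(("insertion", left.ref_end, read_gap))
    return calls
```

Because a single trace is far shorter than 10 000 nt, a sketch like this can only see insertions that fit inside one trace, whereas deletions relative to the reference can span the full range.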


Article PDF: https://academic.oup.com/nar/article-pdf/38/suppl_1/D600/16771255/gkp1046.pdf
Article home page: https://academic.oup.com/nar/article/38/suppl_1/D600/3112177

Akagi, Keiko, Stephens, Robert M., Li, Jingfeng, Evdokimov, Evgenji, Kuehn, Michael R., Volfovsky, Natalia, Symer, David E. MouseIndelDB: a database integrating genomic indel polymorphisms that distinguish mouse strains, Nucleic Acids Research, 2010, pp. D600-D606, Volume 38, Issue suppl_1, DOI: 10.1093/nar/gkp1046