The Mouse Genome Database: enhancements and updates

Nucleic Acids Research, Jan 2010

The Mouse Genome Database (MGD) is a major component of the Mouse Genome Informatics (MGI, http://www.informatics.jax.org/) database resource and serves as the primary community model organism database for the laboratory mouse. MGD is the authoritative source for mouse gene, allele and strain nomenclature and for phenotype and functional annotations of mouse genes. MGD contains comprehensive data and information related to mouse genes and their functions, standardized descriptions of mouse phenotypes, extensive integration of DNA and protein sequence data, and normalized representation of genome and genome variant information, including comparative data on mammalian genes. Data for MGD are obtained from diverse sources including manual curation of the biomedical literature and direct contributions from individual investigators' laboratories and major informatics resource centers, such as Ensembl, UniProt and NCBI. MGD collaborates with the bioinformatics community on the development and use of biomedical ontologies such as the Gene Ontology and the Mammalian Phenotype Ontology. Recent improvements in MGD described here include integration of mouse gene trap allele and sequence data, integration of gene targeting information from the International Knockout Mouse Consortium, deployment of an MGI Biomart, and enhancements to our batch query capability for customized data access and retrieval.


Nucleic Acids Research, 2010, Vol. 38, Database issue, D586–D592
doi:10.1093/nar/gkp880
Published online 27 October 2009

Carol J. Bult*, James A. Kadin, Joel E. Richardson, Judith A. Blake and Janan T. Eppig and the Mouse Genome Database Group†

The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609 USA

Received September 15, 2009; Accepted October 1, 2009

*To whom correspondence should be addressed. Tel: +1 207 288 6248; Fax: +1 207 288 6132; Email:

†The Mouse Genome Database Group: M. T. Airey, A. Anagnostopoulos, R. Babiuk, R. M. Baldarelli, M. Baya, J. S. Beal, S. M. Bello, D. W. Bradt, D. L. Burkart, N. E. Butler, J. Campbell, L. E. Corbani, S. L. Cousins, D. J. Dahmen, H. Dene, A. D. Diehl, M. E. Dolan, K. L. Forthofer, K. S. Frazer, P. Frost, D. E. Geel, M. Hall, M. Knowlton, J. R. Lewis, L. J. Maltais, M. McAndrews-Hill, S. McClatchy, M. J. McCrossin, J. Mason, T. F. Meehan, D. B. Miers, L. A. Miller, L. Ni, H. Onda, J. E. Ormsby, D. J. Reed, B. Richards-Smith, D. R. Shaw, R. Sinclair, D. Sitnikov, C. L. Smith, P. Szauter, M. Tomczuk, L. L. Washburn, I. T. Witham, Y. Zhu.

© The Author(s) 2009. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

INTRODUCTION

The Mouse Genome Database (MGD) is an integrated database of genetic, genomic and phenotypic data for the laboratory mouse (1–3). MGD is a central component of the Mouse Genome Informatics (MGI) database resource (http://www.informatics.jax.org), the community model organism database for the laboratory mouse. Other MGI data resources integrated with MGD include the Gene Expression Database (GXD) (4), the Mouse Tumor Biology Database (MTB) (5), the Gene Ontology (GO) project (6) and the MouseCyc database of biochemical pathways (7).

Data in MGD are updated daily. There are typically four to six major software releases per year to support access and display of new data types. The primary data types maintained in MGD include mouse genes and other genome features along with their function and phenotype annotations, associations of genome features with nucleotide and protein sequences, genetic and physical maps, gene families, mutant phenotypes, SNPs and other polymorphisms, animal models of human disease, and mammalian homology. A recent summary of MGD content is shown in Table 1.

Table 1. Summary of MGD data content (10 September 2009)

MGD data statistics                             10 September 2009
Genes with nucleotide sequence data             28 891
Genes with protein sequence data                26 255
Genes (including uncloned mutations)            36 323
Genes with GO annotations                       18 167
Mouse/human orthologs                           17 787
Mouse/rat orthologs                             16 768
Genes with one or more mutant alleles (a)       17 227
Genes with one or more phenotypic alleles (b)   8363
Total mutant alleles (a)                        524 527
Phenotypic alleles (b)                          22 666
Targeted alleles                                13 721
Gene trapped alleles                            501 232
Human diseases with one or more mouse models    964
QTLs                                            4248
Number of references                            146 597
Mouse RefSNPs                                   10 089 692

(a) Mutant alleles include those occurring in mice and/or in ES cell lines.
(b) Phenotypic alleles include only those mutant alleles present in mice.

MGD is the authoritative source for mouse gene, allele and strain nomenclature, Gene Ontology annotations for mouse gene function, and Mammalian Phenotype (MP) Ontology (8) annotations for phenotype associations. MGD is the most comprehensive source of mouse phenotype information and of associations between human diseases and mouse models.

MGI curatorial staff acquire data by direct data loads from other databases, from direct submission from researchers and from published literature. To facilitate data integration, MGI employs recognized standards for genetic nomenclature and functional annotation to describe mouse sequence data, genes, strains, expression data, alleles and phenotypes. All data associations in MGD are supported with evidence and citations.

Researchers can query MGD using keyword searches, vocabulary browsers and advanced web-based query forms. Keyword search supports the use of the wildcard character (i.e. *) for broad searches and the use of quotation marks for specific phrase searches.
MGD also provides vocabulary browsers for GO annotations, MP annotations and Human Disease Term annotations to support browsing of the database content. The web-based query forms in MGD allow users to construct queries of differing degrees of specificity. For example, using the Genes and Markers Query form in MGD, a researcher can query broadly for all genes on mouse Chromosome 3 or specifically for genes on Chromosome 3 that are associated with specific phenotypes and/or functions (e.g. show me all genes on mouse Chromosome 3 that are associated with respiratory distress and that have been annotated functionally as being enzymes). The MGI MouseBLAST server allows users to interrogate the MGI database using nucleotide and/or protein sequences. Access to data in MGD is also facilitated by summary data files that are updated nightly and available for download via FTP, and through direct SQL access (Structured Query Language; a user account is required). The staff of MGD collaborates with members of other large genome inform (...truncated)
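The broad versus specific Chromosome 3 query described above can be illustrated with a minimal sketch. This is not MGD's query form, schema or API: the `Gene` record, the `query_genes` helper, the gene symbols and the annotation terms are all hypothetical placeholders, used only to show how optional phenotype and function criteria narrow a chromosome-level search.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Gene:
    """A toy gene record; the field names are illustrative, not MGD's schema."""
    symbol: str
    chromosome: str
    phenotypes: FrozenSet[str] = frozenset()
    functions: FrozenSet[str] = frozenset()


def query_genes(genes: Iterable[Gene], chromosome: str,
                phenotype: Optional[str] = None,
                function: Optional[str] = None) -> List[Gene]:
    """Return genes on `chromosome`, optionally narrowed by annotation terms."""
    hits = []
    for gene in genes:
        if gene.chromosome != chromosome:
            continue
        # Each optional criterion filters only when the caller supplies it.
        if phenotype is not None and phenotype not in gene.phenotypes:
            continue
        if function is not None and function not in gene.functions:
            continue
        hits.append(gene)
    return hits


# Made-up placeholder records, not real MGD annotations.
GENES = [
    Gene("GeneA", "3", frozenset({"respiratory distress"}), frozenset({"enzyme"})),
    Gene("GeneB", "3", frozenset({"abnormal gait"}), frozenset({"transcription factor"})),
    Gene("GeneC", "7", frozenset({"respiratory distress"}), frozenset({"enzyme"})),
]

# Broad query: every gene on Chromosome 3.
broad = query_genes(GENES, "3")
# Specific query: Chromosome 3 genes with a given phenotype and function.
narrow = query_genes(GENES, "3", phenotype="respiratory distress", function="enzyme")

print([g.symbol for g in broad])   # ['GeneA', 'GeneB']
print([g.symbol for g in narrow])  # ['GeneA']
```

Leaving `phenotype` and `function` unset reproduces the broad search, while supplying both reproduces the narrower example from the text.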


This is a preview of a remote PDF: https://academic.oup.com/nar/article-pdf/38/suppl_1/D586/11217843/gkp880.pdf
Article home page: https://academic.oup.com/nar/article/38/suppl_1/D586/3112235

Bult, Carol J., Kadin, James A., Richardson, Joel E., Blake, Judith A., Eppig, Janan T. The Mouse Genome Database: enhancements and updates, Nucleic Acids Research, 2010, pp. D586-D592, Volume 38, Issue suppl_1, DOI: 10.1093/nar/gkp880