MGD: the Mouse Genome Database
Judith A. Blake, Joel E. Richardson, Carol J. Bult, Jim A. Kadin, Janan T. Eppig and The Mouse Genome Database Group
Current members of the Mouse Genome Database Group are R. M. Baldarelli,
J. S. Beal, D. W. Bradt, D. L. Burkart, N. E. Butler, J. Campbell, T. Chu,
L. E. Corbani, S. Cousins, H. J. Drabkin, D. Dahmen, K. Frazer,
D. M. Garippa, C. W. Goldsmith, P. L. Grant, M. Lennon-Pierce, J. Lewis,
I. Lu, C. M. Lutz, L. J. Maltais, P. Mani, L. M. McKenzie, L. Ni,
J. E. Ormsby, A. Planchart, S. Ramachandran, D. J. Reed, D. R. Shaw,
C. L. Smith, P. Szauter, P. Vanden Borre, L. Washburn and J. Winslow
The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
The Mouse Genome Database (MGD) (http://www.informatics.jax.org) is one component of a community database resource for the laboratory mouse, a key model organism for interpreting the human genome and for understanding human biology. MGD strives to provide an extensively integrated information resource with experimental details annotated from both literature and on-line genomic data sources. MGD curates and presents the consensus representation of genotype (sequence) to phenotype information, including highly detailed information about genes and gene products. Primary foci of integration are through representations of relationships between genes, sequences and phenotypes. MGD collaborates with other bioinformatics groups to curate a definitive set of information about the laboratory mouse. Recent developments include a general implementation of database structures for controlled vocabularies and the integration of a phenotype classification system.
The Mouse Genome Database (MGD) provides an integrated
view of genetic and genomic information for the laboratory
mouse (1). MGD contains information on mouse genes,
genetic markers and genomic features as well as information
on molecular segments (probes, primers, cDNA clones,
BACs and YACs), mutant phenotypes, comparative mapping
data, graphical displays of linkage, cytogenetic and physical
maps, experimental mapping data, as well as strain
distribution patterns for recombinant inbred strains (RIs) and cross
haplotypes. MGD is updated daily (Table 1). Since it first
became available on the WWW, MGD has continued to
evolve, expanding its data coverage, improving data handling,
and providing several new data manipulation and display
tools.
Table 1. MGD data statistics
Number of references
Number of genes
Number of markers (including genes)
Number of genes with sequence data
Number of markers mapped
Number of mouse/human curated orthologies
Number of genes with links to SWISS-PROT
Number of genes with GO annotations
Number of genes with annotated alleles
Number of annotated alleles
Number of mouse nucleotide sequences curated and integrated in the MGI system (includes ESTs)
MGD is one component of the Mouse Genome Informatics
(MGI) database resource (http://www.informatics.jax.org)
located at The Jackson Laboratory (http://www.jax.org).
Other projects and resources that contribute to MGI include
the Gene Expression Database (GXD) (2), the Mouse Genome
Sequencing (MGS) (3) project and the Mouse Tumor Biology
Database (MTB; http://www.informatics.jax.org/mtb) (4). The
MGI consortium group participates actively in the
development and implementation of the Gene Ontologies (GO)
(www.geneontology.org) (5). MGI curators also collaborate
extensively with SWISS-PROT (6) and with the LocusLink
project at NCBI (7) to evaluate associations between genes and
sequences for the mouse.
IMPROVEMENTS DURING 2002
Implementation of phenotype classifications
A broad, high-level set of phenotype terms has been
developed and employed to classify phenotype data in
MGD. This defined vocabulary of 105 terms can be used
to search, group, compare and analyze phenotypes. These
phenotype classification terms appear on the Alleles and
Phenotypes Query Form (Fig. 1), and on the Genes and
Markers Query Form. The complete list of terms and their
accession IDs is also available by FTP. On each form, there is
a link to the phenotype classification terms, complete with
definitions and examples. Users of the MGI database can
select one or more terms from the list to search for records
associated with a particular phenotype, in combination
with many other parameters on the forms. In addition,
text-based searches for more specific phenotypic terms remain
available.
A more comprehensive phenotype vocabulary continues to
be developed by MGD staff and currently (September 2002)
contains over 1800 concepts. These controlled terms are
used to annotate mouse mutant phenotypes and can be
viewed on allele detail pages, but there is currently only
limited access to the full phenotype vocabulary as a query or
analysis tool.
Improvements to the MGI : GO browser
The MGI GO Browser (http://www.informatics.jax.org/searches/GO_form.shtml)
allows database users to access genes in MGI
using functional annotation terms from the GO. This Browser
was developed in conjunction with the GXD. A general
database implementation within MGI for structured, controlled
vocabularies enhances the search and recovery capabilities of
this browser. The GO Browser can be accessed from gene
detail or query pages as well as directly from the MGI menus.
A GO Browser query returns a graph reflecting both parents
and children of the query term and a link to all MGI
associations with that term or any of the subterms.
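The behaviour described above, where a query on one term also recovers genes annotated to any of its subterms, can be illustrated with a small sketch. This is not the MGI implementation; it assumes the vocabulary is held as a simple parent-to-children map, and the function names, term names and gene symbols are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict


def invert(children):
    """Build a child -> parents map from a parent -> children map."""
    parents = defaultdict(set)
    for parent, kids in children.items():
        for kid in kids:
            parents[kid].add(parent)
    return parents


def reachable(term, edges):
    """Return every term reachable from `term` by following `edges`.

    The starting term itself is not included (assuming an acyclic graph).
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for nxt in edges.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def genes_for_term(term, children, annotations):
    """Collect genes annotated to `term` or to any of its subterms."""
    genes = set(annotations.get(term, ()))
    for sub in reachable(term, children):
        genes |= set(annotations.get(sub, ()))
    return genes


# Hypothetical toy vocabulary and annotations (not real MGI data).
children = {
    "binding": {"nucleic acid binding", "protein binding"},
    "nucleic acid binding": {"DNA binding"},
}
annotations = {
    "binding": {"GeneC"},
    "DNA binding": {"GeneA"},
    "protein binding": {"GeneB"},
}

subterms = reachable("binding", children)
ancestors = reachable("DNA binding", invert(children))
genes = genes_for_term("binding", children, annotations)
```

Here `ancestors` and `subterms` together correspond to the parent and child context shown around a query term, and `genes` corresponds to the associations with the term or any of its subterms.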
Availability of MGI : GO files in various formats
MGI gene-to-GO annotations are updated daily. Various files
of MGI genes/markers with their GO associations are
publicly available. These files are updated each time MGI
submits a new gene association file to the GO web site
(http://www.geneontology.org) and can be accessed on the MGI FTP
server (ftp://www.informatics.jax.org/pub/informatics/reports/gene_association.mgi).
A file of all the GO terms used by
MGI in the annotation of genes and gene products is also
available. MGI also provides a file to the GO database of MGI
Gene : SWISS-PROT associations. This information is
incorporated into the GO database and thus enables users to recover
mouse sequence data as a result of a semantic search against
the GO database (http://www.godatabase.org/cgi-bin/go.cgi).
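As an illustration of how a downloaded gene association file might be used, the sketch below groups GO IDs by gene symbol. It assumes the tab-delimited GO gene association layout, in which comment lines begin with `!`, column 3 holds the gene symbol, column 4 an optional qualifier such as NOT, and column 5 the GO ID; the source text does not describe the layout, so it should be checked against the file header. The function name and all identifiers in the sample rows are hypothetical placeholders, not real MGI records.

```python
import io
from collections import defaultdict


def parse_gene_associations(handle):
    """Map each gene symbol to the set of GO IDs it is annotated with.

    Assumes tab-delimited rows with the symbol in column 3, the
    qualifier in column 4 and the GO ID in column 5. Comment lines
    (starting with '!'), blank lines, short rows and NOT-qualified
    rows are skipped.
    """
    by_gene = defaultdict(set)
    for line in handle:
        if line.startswith("!") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        symbol, qualifier, go_id = fields[2], fields[3], fields[4]
        if "NOT" in qualifier.split("|"):
            continue
        by_gene[symbol].add(go_id)
    return dict(by_gene)


# Hypothetical sample rows; every identifier here is a placeholder.
SAMPLE = "\n".join([
    "!hypothetical example rows, not real MGI data",
    "\t".join(["MGI", "MGI:0000001", "GeneA", "", "GO:9999991",
               "MGI:MGI:0000010", "IEA", "", "F"]),
    "\t".join(["MGI", "MGI:0000001", "GeneA", "", "GO:9999992",
               "MGI:MGI:0000011", "IDA", "", "P"]),
    "\t".join(["MGI", "MGI:0000002", "GeneB", "", "GO:9999991",
               "MGI:MGI:0000012", "ISS", "", "F"]),
    "\t".join(["MGI", "MGI:0000002", "GeneB", "NOT", "GO:9999993",
               "MGI:MGI:0000013", "IDA", "", "C"]),
])

result = parse_gene_associations(io.StringIO(SAMPLE))
```

In practice the `io.StringIO` object would be replaced by an open file handle on the downloaded report.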
OTHER INFORMATION
MGD encourages user input into its gene and allele annotation
efforts. On each gene detail and allele detail page, a clickable
button (Your Input Welcome) brings the user to a web-based
form for submitting updates to the information being viewed.
Mouse gene nomenclature
The MGD gene annotation group assigns unique symbols and
names to mouse genes under the guidelines set by the
International Committee on Standardized Genetic Nomenclature
for Mice (http://www.informatics.jax.org/mgihome/nomen/index.shtml#mnrg)
(8). Scientists can reserve symbols prior
to publication using the electronic nomenclature submission
for (...truncated)