The Mouse Gene Expression Database (GXD)
Martin Ringwald, Janan T. Eppig, Dale A. Begley, John P. Corradi, Ingeborg J. McCright, Terry F. Hayamizu, David P. Hill, James A. Kadin and Joel E. Richardson
The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
The Gene Expression Database (GXD) is a community resource of gene expression information for the laboratory mouse. By combining the different types of expression data, GXD aims to provide increasingly complete information about the expression profiles of genes in different mouse strains and mutants, thus enabling valuable insights into the molecular networks that underlie normal development and disease. GXD is integrated with the Mouse Genome Database (MGD). Extensive interconnections with sequence databases and with databases from other species, and the development and use of shared controlled vocabularies extend GXD's utility for the analysis of gene expression information. GXD is accessible through the Mouse Genome Informatics web site at http://www.informatics.jax.org/ or directly at http://www.informatics.jax.org/menus/expression_menu.shtml.
The laboratory mouse has become a pivotal animal model in
biomedical research because it is closely related to the human
and readily amenable to genetic and molecular analysis.
Tissues from many different mouse strains and mutants, and
from all developmental and adult stages, are easily accessible
for expression analysis. The different methods used to detect
gene products vary in sensitivity and spatial resolution and
contribute distinct and complementary expression information.
The Gene Expression Database (GXD) has been designed as
an open-ended system that can integrate many different types
of expression data, such as RNA in situ hybridization,
immunohistochemistry, northern blot, western blot, RT–PCR,
cDNA source and array data (1–5). Thus, as data accumulate,
GXD can provide increasingly complete information about
what transcripts and proteins are produced by what genes;
where, when and in what amounts these gene products are
expressed; and how their expression varies in different mouse
strains and mutants. Expression patterns reported from assays
with differing spatial resolution are described in standardized
and integrated form using an extensive, hierarchically
structured dictionary of anatomical terms for mouse development
built in collaboration with our Edinburgh colleagues (6).
Digitized images of original expression data are linked to the
respective expression records. GXD is integrated with the
Mouse Genome Database (MGD) (7) to enable a combined
analysis of genotype, expression and phenotype information,
and has comprehensive links to other resources, such as
sequence databases (8–12), OMIM, MEDLINE and databases
from other species. Such information places gene expression
data in the larger biological and analytical context.
GXD is implemented in the Sybase relational database
management system. It has been available online since July
1998 and has been updated on a daily basis. Access to data is
provided primarily via Web-based query forms. Users
interested in direct SQL access may arrange for an SQL account by
contacting MGI User Support (see below). GXD and its WWW
query interface have been described in more detail previously
(3,4). Here, we illustrate recent enhancements of GXD from
the user's perspective, by taking the different query forms
provided by GXD as an entry point.
The Gene Expression Data index
Using the GXD index, one can rapidly identify publications
that contain endogenous developmental gene expression
information for specific genes, for particular days of mouse
development, from specific assay types or for any combination
of these parameters. Additional query fields are provided for
bibliographic information (authors, journals, year) and for
words (text strings) that occur in the abstract of the respective
articles. We continue to keep the GXD index up-to-date. All
pertinent journal articles from 1993 to the present and articles
from major developmental journals from 1990 to the present
are indexed. As of September 28, 2000, the index includes
17 401 entries covering 5635 references and expression
information for 3837 genes.
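The way the index combines its search parameters can be illustrated with a minimal sketch. It is not GXD's implementation: the record layout, the `search_index` function, the reference keys and all sample entries below are hypothetical, chosen only to show that each supplied parameter narrows the set of matching publications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """One hypothetical index entry: a gene/age/assay combination in a paper."""
    gene: str         # gene symbol
    age_dpc: float    # day of mouse development
    assay: str        # assay type
    reference: str    # made-up citation key, not a real GXD identifier


# Made-up records for illustration only.
INDEX = [
    IndexEntry("Pax6", 10.5, "RNA in situ", "ref-A"),
    IndexEntry("Pax6", 12.5, "northern blot", "ref-B"),
    IndexEntry("Shh", 10.5, "RNA in situ", "ref-A"),
    IndexEntry("Shh", 11.5, "immunohistochemistry", "ref-C"),
]


def search_index(entries, gene=None, age_dpc=None, assay=None):
    """Return sorted references whose entries match every supplied parameter.

    A parameter left as None places no restriction on the search.
    """
    hits = set()
    for entry in entries:
        if gene is not None and entry.gene != gene:
            continue
        if age_dpc is not None and entry.age_dpc != age_dpc:
            continue
        if assay is not None and entry.assay != assay:
            continue
        hits.add(entry.reference)
    return sorted(hits)


print(search_index(INDEX, gene="Pax6"))
print(search_index(INDEX, age_dpc=10.5, assay="RNA in situ"))
```

In this sketch, a search on the gene alone returns both papers that mention it, while combining day and assay type narrows the result to a single reference.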
The Gene Expression Data query form
The Gene Expression Data query form provides access to data
from RNA in situ hybridization, immunohistochemistry,
northern blot, western blot, RT–PCR and RNase protection
experiments. Using combinations of search parameters, one
can pose increasingly complex expression queries, such as "In
what anatomical structures and/or at what developmental
stages has a specified gene or a specified set of genes been
detected/not detected?" or "What genes have been detected/not
detected in a given tissue and/or at a particular time of
development using a specified set of expression assays?" Due
to the hierarchical structure of the anatomical dictionary,
spatial queries can include anatomical substructures or
superstructures. Further, it is possible to correlate gene expression with
chromosomal location, a query particularly relevant for hunting
candidate genes. The query capabilities for chromosomal
location have been refined. It is now possible to search for
genes located on a particular chromosome, between specified
loci or within a specified distance from a genetic locus. The most
significant enhancement of the Gene Expression Data query form
is that one can now also search for genes whose products perform
a particular molecular function (e.g. transcription factor or
DNA helicase); are involved in a specified biological
process (e.g. apoptosis or purine metabolism); or belong to
a defined cellular component (e.g. nucleus or origin
recognition complex). As a member of the Gene Ontology Project
(13), we participate in building shared controlled vocabularies
for these three categories and in assigning pertinent terms to
genes in our database. The addition of these search parameters
enables important new queries such as "What transcription
factors are expressed in the diencephalon from day 11.5–15 of
mouse development?" or "What genes involved in apoptosis
have been detected in the limb?" and enhances the utility of the
expression data stored in GXD.
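A query of this kind can be sketched in a few lines to show how a hierarchical anatomy vocabulary and controlled-vocabulary annotations work together. This is an illustrative toy under stated assumptions, not GXD's schema or query engine: the hierarchy fragment, the gene names (`GeneA` to `GeneE`), the annotations and the function names are all hypothetical.

```python
# Toy fragment of an anatomy hierarchy: parent -> direct substructures.
# The structure names are illustrative, not the actual dictionary.
CHILDREN = {
    "brain": ["forebrain", "midbrain"],
    "forebrain": ["diencephalon", "telencephalon"],
    "diencephalon": ["thalamus"],
}

# Made-up expression results: (gene, structure, age in days, detected?).
RESULTS = [
    ("GeneA", "thalamus", 12.5, True),
    ("GeneB", "telencephalon", 12.5, True),
    ("GeneC", "diencephalon", 16.0, True),
    ("GeneD", "diencephalon", 11.5, False),
    ("GeneE", "thalamus", 13.0, True),
]

# Made-up annotations: gene -> set of controlled-vocabulary terms.
ANNOTATIONS = {
    "GeneA": {"transcription factor"},
    "GeneB": {"transcription factor"},
    "GeneC": {"transcription factor"},
    "GeneD": {"transcription factor"},
    "GeneE": {"apoptosis"},
}


def with_substructures(structure):
    """Return the structure together with all of its descendants."""
    found = {structure}
    stack = [structure]
    while stack:
        for child in CHILDREN.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def genes_detected(structure, min_age, max_age, term=None):
    """Genes detected in a structure (or its substructures) within an age
    range, optionally restricted to genes annotated with a given term."""
    targets = with_substructures(structure)
    hits = set()
    for gene, where, age, detected in RESULTS:
        if not detected or where not in targets:
            continue
        if not (min_age <= age <= max_age):
            continue
        if term is not None and term not in ANNOTATIONS.get(gene, set()):
            continue
        hits.add(gene)
    return sorted(hits)


print(genes_detected("diencephalon", 11.5, 15, term="transcription factor"))
```

Because the search expands "diencephalon" to include its substructures, a result recorded for the thalamus is returned even though the query named only the parent structure. Negative results, ages outside the range and genes lacking the requested term are filtered out.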
The Mouse Anatomical Dictionary Browser
The Mouse Anatomical Dictionary Browser has been added as a
new tool to navigate through the extensive dictionary hierarchies for
the different developmental stages, to locate specific anatomical
structures in the hierarchies and to look up expression results
associated with those structures (Fig. 1). Thus, while the Gene
Expression Data query form described above enables powerful
combinatorial queries (including anatomical structures), the
anatomy browser lets users view gene expression data directly
from the developmental anatomy perspective. It must be noted
that Theiler stages 23–26 have been added only recently and
are still under development. So far, only limited expression
data have been entered for these stages of mouse developm (...truncated)