The mouse Gene Expression Database (GXD): 2011 update
Jacqueline H. Finger, Constance M. Smith, Terry F. Hayamizu, Ingeborg J. McCright, Janan T. Eppig, James A. Kadin, Joel E. Richardson, Martin Ringwald
The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
The Gene Expression Database (GXD) is a community resource of mouse developmental expression information. GXD integrates different types of expression data at the transcript and protein level and captures expression information from many different mouse strains and mutants. GXD places these data in the larger biological context through integration with other Mouse Genome Informatics (MGI) resources and interconnections with many other databases. Web-based query forms support simple or complex searches that take advantage of all these integrated data. The data in GXD are obtained from the literature, from individual laboratories, and from large-scale data providers. All data are annotated and reviewed by GXD curators. Since the last report, the GXD data content has increased significantly, the interface and data displays have been improved, new querying capabilities were implemented, and links to other expression resources were added. GXD is available through the MGI web site (www.informatics.jax.org), or directly at www.informatics.jax.org/expression.shtml.
As a primary mammalian model of human disease, the
mouse is used extensively for expression studies to
determine the role of genes that function in molecular pathways
during developmental and disease processes. With a focus
on endogenous gene expression during development, the
Gene Expression Database (GXD) collects data from the
scientific literature, from individual laboratories, and from
large-scale data providers. It makes these data readily
available to the research community in a highly curated
and integrated format that allows for a large variety of
database queries. GXD captures a broad spectrum of
assay types, including RNA in situ hybridization,
immunohistochemistry, knock-in reporter assays,
northern blot, western blot, RT-PCR, RNase protection
and S1 nuclease assays. It covers all developmental stages
and tissues and includes data from many different mouse
strains and mutants, giving researchers a tool to examine
the effects of mutations on gene expression. GXD forms
an important and integral component of the larger Mouse
Genome Informatics (MGI) resource. Therefore, the
expression data are fully integrated with mouse genetic,
sequence, functional and phenotypic information (1–4).
MGI maintains further links to many other resources
such as GenBank, gene model resources, Entrez Gene,
UniProt, InterPro, Online Mendelian Inheritance in
Man (OMIM) and the International Knockout Mouse
Consortium (IKMC) among others (5–14). This robust
integration puts the expression data in GXD into a
much larger biological and analytical context.
Other databases that store mouse expression
information have been developed in recent years. They store data
from one or two specific assay types and/or focus on
specific developmental stages; they are often dedicated to
specific data generation projects (15–22). As will be
evident from this article, GXD is working with those
resources that are complementary to GXD, adding value
through data integration and the implementation of new
interconnections. Due to its broad scope, its extensive data
curation and integration efforts, and the resulting
querying capabilities, GXD continues to provide a
unique resource to the biomedical research community.
GXD is updated daily. GXD and its query interfaces
have been described earlier (23–27). Here, we report on
our recent progress in terms of data acquisition, and on
the implementation of new query and display features.
NEW GXD HOMEPAGE
To present the objectives of GXD more clearly and to
make the database more intuitive to use, we redesigned
the GXD homepage (www.informatics.jax.org/expression.shtml).
The new layout provides clear access to the
various query forms, with short descriptors for each
form. The Frequently Asked Questions (FAQs) section
provides links to brief on-line tutorials demonstrating how
one can search for different types of data in GXD. The
GXD Includes section provides information about the
current data content in GXD, such as the number of
genes with annotated expression data, the number of
expression results and the number of images in the database.
The Gene Expression News section informs users when
new features, capabilities and data sets become available.
A series of tabs at the bottom of the home page provides
access to help documentation and data policies,
information about GXD and its collaborators, and links to
guidelines and tools that help researchers to submit data
electronically. That GXD is also an integral component
of MGI is made transparent through the use of a central
Quick Search (see below), a common navigation bar and
common drop-down menus and tab choices that direct users to
various data sets, search forms, tools and other resources.
Large icons on the MGI homepage provide visual cues to
the various core areas, including expression (GXD).
In the Literature Summary, GXD provides users with a
way to quickly determine what mouse developmental
expression data are available in the scientific literature. The
staff of GXD searches the scientific literature for
publications that present endogenous gene expression
experiments during mouse development. In a first annotation
step for each publication, the genes analyzed, the ages of
mice used in the experiment, and the type of experimental
assay performed for each gene are recorded and entered
into the database. These data are easily searched using the
Gene Expression Literature Query Form. These queries
can also include citation (author, journal, year) and
abstract information. However, this tool takes users
further than a PubMed search because the data in the
GXD Literature Summary are based on the curation of
the full text of the paper, including supplemental
information, and annotations are standardized with regard to
gene, age and assay type information. The Literature
Summary is comprehensive and up-to-date. It includes
all journal articles containing expression data during
mouse development from 1993 to the present and all
articles from major developmental journals since 1990.
Currently, the GXD Literature Summary has 108 604
records covering data for 13 619 genes from 17 521
references.
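
The first annotation step described above can be pictured as a simple index in which each record ties one reference to a gene, an age and an assay type. The following Python sketch is only an illustration of that idea, not GXD's actual schema or software: the class name, field names, the `search` helper, the gene symbols and the reference identifiers are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class LiteratureRecord:
    """One index entry: a gene analyzed at an age by an assay type in a reference."""
    reference_id: str  # hypothetical identifier, not a real accession number
    gene: str          # standardized gene symbol (placeholder values below)
    age: str           # standardized age, e.g. "E10.5" for embryonic day 10.5
    assay_type: str    # standardized assay type name


def search(
    records: Iterable[LiteratureRecord],
    gene: Optional[str] = None,
    age: Optional[str] = None,
    assay_type: Optional[str] = None,
) -> List[LiteratureRecord]:
    """Return records matching every criterion that was supplied (None = any)."""
    def matches(rec: LiteratureRecord) -> bool:
        return (
            (gene is None or rec.gene == gene)
            and (age is None or rec.age == age)
            and (assay_type is None or rec.assay_type == assay_type)
        )
    return [rec for rec in records if matches(rec)]


# Made-up entries for illustration only; they do not describe real publications.
INDEX = [
    LiteratureRecord("REF:0001", "GeneA", "E10.5", "RNA in situ"),
    LiteratureRecord("REF:0001", "GeneA", "E12.5", "Northern blot"),
    LiteratureRecord("REF:0002", "GeneB", "E10.5", "RNA in situ"),
    LiteratureRecord("REF:0003", "GeneA", "E10.5", "RNA in situ"),
]

# Which references report RNA in situ data for GeneA?
hits = search(INDEX, gene="GeneA", assay_type="RNA in situ")
print(sorted({rec.reference_id for rec in hits}))
```

Because gene, age and assay type are stored as standardized values rather than free text, exact-match filtering of this kind can retrieve papers that a keyword search over titles and abstracts might miss.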
Gene expression data
Beyond summaries, GXD also provides detailed records
of experimental expression results. GXD assay records
contain the authors' description of the tissue pattern and
strength of expression, translated into standard
terminology (see below), the probe or antibody information
available, as well as the specimen age, genetic background
and preparation (Figure 1). The expression information is
recorded in standardized formats by making extensive use
of controlled vocabularies and ontologies, thus enabling
data integration and co (...truncated)