GXD: a community resource of mouse Gene Expression Data
Constance M. Smith 0
Jacqueline H. Finger 0
Terry F. Hayamizu 0
Ingeborg J. McCright 0
Jingxia Xu 0
Janan T. Eppig 0
James A. Kadin 0
Joel E. Richardson 0
Martin Ringwald 0
0 The Jackson Laboratory, Bar Harbor, ME 04609, USA
The Gene Expression Database (GXD) is an extensive, easily searchable, and freely available database of mouse gene expression information (www.informatics.jax.org/expression.shtml). GXD was developed to foster progress toward understanding the molecular basis of human development and disease. GXD contains information about when and where genes are expressed in different tissues in the mouse, especially during the embryonic period. GXD collects different types of expression data from wild-type and mutant mice, including RNA in situ hybridization, immunohistochemistry, RT-PCR, and northern and western blot results. The GXD curators read the scientific literature and enter the expression data from those papers into the database. GXD also acquires expression data directly from researchers, including groups doing large-scale expression studies. GXD currently contains nearly 1.5 million expression results for over 13,900 genes. In addition, it has over 265,000 images of expression data, allowing users to retrieve the primary data and interpret it themselves. By being an integral part of the larger Mouse Genome Informatics (MGI) resource, GXD's expression data are combined with other genetic, functional, phenotypic, and disease-oriented data. This allows GXD to provide tools for researchers to evaluate expression data in the larger context, search by a wide variety of biologically and biomedically relevant parameters, and discover new data connections to help in the design of new experiments. Thus, GXD can provide researchers with critical insights into the functions of genes and the molecular mechanisms of development, differentiation, and disease.
Recent technological advances have made it possible to
rapidly determine the sequences of individual human
genomes and to correlate genetic mutations with human
diseases. Evolutionarily closely related to humans, the
mouse is a pivotal model system for determining the
molecular mechanisms that lead from specific mutations to
developmental defects and disease phenotypes. In mouse,
specific constitutive and conditional mutants can be easily
generated, and tissues from many different strains and
mutants, as well as all developmental stages, can be
obtained for gene expression analyses. These expression data
can then be correlated with phenotypic and disease data to
gain insights into the function of genes and the molecular
mechanisms that underlie human development,
differentiation, and disease.
The objective of the Gene Expression Database (GXD) is
to support and facilitate the studies of the molecular
mechanisms that underlie developmental and disease
processes. GXD systematically collects and integrates different
types of expression data from wild-type and mutant mice
through curation of the published literature and by
collaboration with large-scale projects and makes them
available to researchers in an extensive and easily searchable
database (Finger et al. 2011; Smith et al. 2014a). Further, as
an integral component of the larger Mouse Genome
Informatics (MGI) resource, GXD combines its data with all the
other genetic, genomic, functional, phenotypic, and
disease-related information in MGI, thus placing these expression
data in context and making them readily accessible to many
types of biologically and biomedically relevant database
searches (Eppig et al. 2015; Smith et al. 2014b).
The importance of recording and integrating mouse
expression data and placing them in a larger biological
context cannot be overstated. It is impossible for any single
individual to keep abreast of all the biomedical research
data that are generated yearly, let alone to memorize all
these data and their connections. The ability to find results
of previous experiments quickly can save investigators
months of research time, both in the library and in the
laboratory. In addition, GXD and MGI enable researchers
to discover new data connections, thus allowing them
to develop scientific hypotheses and to design new
experiments.
In the following paragraphs, we will discuss: the
contents of GXD; how and why expression data are recorded
in standardized ways; the integration of expression data
with other data in MGI; and the tools provided by GXD to
explore these data.
GXD collects endogenous gene expression information
derived from wild-type and mutant mice. It includes data
from all stages of development, including postnatal
development, although the main emphasis is gene expression
during the embryonic period. GXD provides researchers a
comprehensive survey of the embryonic expression
literature, detailed expression data, and tools to examine
these data. Because different types of expression assays
yield different information about gene products at the RNA
and protein level, GXD has been designed as a system that
can integrate multiple types of expression data (Ringwald
et al. 1994). GXD's emphasis has been, and continues to
be, on data from RNA in situ, immunohistochemistry,
knock-in reporter, RT-PCR, northern blot, and western blot
experiments. Links to array and high-throughput
sequencing expression data at NCBI GEO (Barrett et al. 2013) and
the Expression Atlas at EMBL-EBI (Petryszak et al. 2014)
are provided as well, and closer integration of these data
within GXD is planned for the future.
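As an illustration of what integrating multiple assay types in one system can mean in practice, the following minimal Python sketch stores results from different assay types in a single record type so that they can be queried together. The assay type names are taken from the text above; the class names, field names, and example records (including the placeholder gene symbol "GeneA") are hypothetical and do not reflect GXD's actual schema or data.

```python
from dataclasses import dataclass
from enum import Enum


class AssayType(Enum):
    # Assay types named in the text above.
    RNA_IN_SITU = "RNA in situ"
    IMMUNOHISTOCHEMISTRY = "immunohistochemistry"
    KNOCK_IN_REPORTER = "knock-in reporter"
    RT_PCR = "RT-PCR"
    NORTHERN_BLOT = "northern blot"
    WESTERN_BLOT = "western blot"


@dataclass(frozen=True)
class ExpressionResult:
    """One expression result (hypothetical fields, not GXD's schema)."""
    gene: str              # gene symbol
    assay_type: AssayType  # how expression was assayed
    structure: str         # anatomical structure examined
    stage: str             # developmental stage or age
    detected: bool         # whether expression was detected


def assay_types_by_gene(results):
    """Group results by gene and collect the assay types seen for each."""
    summary = {}
    for result in results:
        summary.setdefault(result.gene, set()).add(result.assay_type)
    return summary


# Placeholder records for illustration only; not real GXD data.
example_results = [
    ExpressionResult("GeneA", AssayType.RNA_IN_SITU, "heart", "E10.5", True),
    ExpressionResult("GeneA", AssayType.WESTERN_BLOT, "heart", "E10.5", True),
    ExpressionResult("GeneB", AssayType.RT_PCR, "liver", "E14.5", False),
]

example_summary = assay_types_by_gene(example_results)
```

Because every record shares the same fields regardless of assay type, RNA-level and protein-level evidence for the same gene can be retrieved with a single query in this sketch.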
GXD's data content and acquisition efforts are unique,
integrating heterogeneous expression data from disparate
sources. GXD is the only database that systematically
curates mouse developmental expression data from the
literature. The GXD curators have read and entered the
results from thousands of published papers into GXD.
Additional data are acquired via electronic data
submissions and through collaborations with large-scale data
providers. The large-scale projects whose data are in GXD
include: GenePaint (Visel et al. 2004), Eurexpress
(Diez-Roux et al. 2011), the Brain Gene Expression Map
(BGEM; Magdaleno et al. 2006), and the GenitoUrinary
Development Molecular Anatomy Project (GUDMAP;
Harding et al. 2011). Thus, the data in GXD represent the
results of research performed by small- and large-scale
laboratories worldwide. GXD currently contains detailed
data from almost 70,000 experiments, amounting to
nearly 1.5 million expression results for
approximately 13,900 genes. This includes
data from more than 2100 mouse mutants. In addition, it
has over 265,000 images of the original data, allowing
researchers to view and interpret the experiments
themselves. Eighty-two percent of the data are from RNA in situ
hybridization studies and 10% from RT-PCR experiments,
reflecting the detailed spatial resolution and sensitivity
required in developme (...truncated)