The mouse Gene Expression Database (GXD): 2007 update

Nucleic Acids Research, Jan 2007

The Gene Expression Database (GXD) provides the scientific community with an extensive and easily searchable database of gene expression information about the mouse. Its primary emphasis is on developmental studies. By integrating different types of expression data, GXD aims to provide comprehensive information about expression patterns of transcripts and proteins in wild-type and mutant mice. Integration with the other Mouse Genome Informatics (MGI) databases places the gene expression information in the context of genetic, sequence, functional and phenotypic information, enabling valuable insights into the molecular biology that underlies developmental and disease processes. In recent years the utility of GXD has been greatly enhanced by a large increase in data content, obtained from the literature and provided by researchers doing large-scale in situ and cDNA screens. In addition, we have continued to refine our query and display features to make it easier for users to interrogate the data. GXD is available through the MGI web site at http://www.informatics.jax.org/ or directly at http://www.informatics.jax.org/menus/expression_menu.shtml.

Constance M. Smith, Jacqueline H. Finger, Terry F. Hayamizu, Ingeborg J. McCright, Janan T. Eppig, James A. Kadin, Joel E. Richardson, Martin Ringwald

The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA

The laboratory mouse serves as a premier animal model in studying the complex molecular networks that underlie the processes of human development, differentiation and disease. To gain insights into these networks, it is essential to know where, when and in what amounts transcripts and proteins are expressed, and how their expression varies in different mouse strains and mutants. The Gene Expression Database (GXD) addresses this objective in a uniquely comprehensive way.
GXD is the only resource that acquires mouse expression data from the literature in a systematic manner, as well as acquiring data directly from conventional and large-scale providers via electronic data submission and bulk data downloads. GXD integrates various types of mRNA and protein expression information, collects data from all tissues and developmental stages and includes data from many different mouse strains and mutants. Annotations in GXD make extensive use of controlled vocabularies and ontologies to provide the standardization of data that enables complex queries. In addition, GXD is fully integrated with the other databases of the Mouse Genome Informatics (MGI) resource, including the Mouse Genome Database (MGD) (1,2) and the MGI part of the Gene Ontology Project (GO) (3). MGI also maintains comprehensive links to external resources such as sequence databases, Entrez Gene, UniProt, InterPro, Online Mendelian Inheritance in Man (OMIM), PubMed and other mammalian databases (4–15). This robust integration puts the expression data annotated in GXD into a much larger biological and analytical context. Thus, users are able to query using extensive genetic, sequence, functional, expression and phenotypic information.

Other public and laboratory databases have been developed in recent years to store mouse expression data (16–26). They store data from one or two specific assay types and/or focus on specific tissues or developmental stages; they are often dedicated to specific data generation projects. These databases are complementary to the GXD effort. However, owing to its broad scope, its thorough approach and its data integration and querying capabilities, GXD provides a unique resource to the biomedical research community. New data are entered and made publicly available on a daily basis.

GXD and its query interfaces have been described previously (27–30). Here we focus on recent progress in terms of data acquisition and querying capabilities.
The Gene Expression Literature Index

GXD curators survey journals to find all published papers that describe endogenous gene expression and knock-in reporter studies done in the embryonic mouse. In a first annotation step, the curators record the genes and ages analyzed and the expression assay types used in these publications. GXD combines these data with information obtained from PubMed and makes them available for searching via the Gene Expression Literature Index. Users can therefore query for specific types of expression information in combination with bibliographic information (author, journal, year) or specific words in the title or abstract of publications. The Literature Index is comprehensive and up-to-date; it contains all pertinent journal articles from 1993 to the present and articles from major developmental journals from 1990 to the present. Currently, the index contains >56 500 entries covering nearly 12 300 references analyzing nearly 8700 genes. Thus, it provides a powerful tool to quickly locate expression information in the literature.

Gene expression data

GXD currently collects detailed expression data from the following assay types: RNA in situ hybridization, immunohistochemistry, in situ reporter (knock-in), northern blot, RT-PCR, western blot, RNase protection and nuclease S1 protection studies. Work is underway to incorporate microarray data as well. As illustrated in Figure 1, expression records in GXD are detailed. Each entry contains a description of the assay type and the molecular probe used in the assay, the genetic origin of the sample and the experimental conditions used. The time and tissue of expression, the authors' description of pattern and strength of expression, the number and sizes of detected bands and sequence information are also recorded.
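As a rough illustration of the fields listed above, an expression record could be modelled as a small data structure. This is a minimal sketch and not GXD's actual schema: the class names, field names, age format and example values (`ExpressionAssay`, `ExpressionResult`, `HypotheticalGene1` and so on) are all hypothetical and chosen only to mirror the description in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assay types named in the text; the constant itself is illustrative.
ASSAY_TYPES = {
    "RNA in situ hybridization",
    "immunohistochemistry",
    "in situ reporter (knock-in)",
    "northern blot",
    "RT-PCR",
    "western blot",
    "RNase protection",
    "nuclease S1 protection",
}


@dataclass
class ExpressionResult:
    """One observation within an assay (hypothetical field names)."""
    age: str                          # e.g. "E10.5"; the format is an assumption
    structure: str                    # anatomical term
    detected: bool
    strength: Optional[str] = None    # authors' description, free text
    pattern: Optional[str] = None     # authors' description, free text
    band_sizes_kb: List[float] = field(default_factory=list)  # blot assays only


@dataclass
class ExpressionAssay:
    """One assay with its probe, sample origin and results (hypothetical)."""
    gene: str
    assay_type: str
    probe: str
    genetic_background: str
    conditions: str = ""
    results: List[ExpressionResult] = field(default_factory=list)

    def __post_init__(self):
        # Reject assay types outside the controlled list.
        if self.assay_type not in ASSAY_TYPES:
            raise ValueError(f"unknown assay type: {self.assay_type!r}")


def structures_with_expression(assays, gene, age):
    """Return the sorted structures in which `gene` was detected at `age`."""
    found = set()
    for assay in assays:
        if assay.gene != gene:
            continue
        for result in assay.results:
            if result.age == age and result.detected:
                found.add(result.structure)
    return sorted(found)


# Made-up records, for illustration only.
EXAMPLE_ASSAYS = [
    ExpressionAssay(
        gene="HypotheticalGene1",
        assay_type="RNA in situ hybridization",
        probe="hypothetical riboprobe",
        genetic_background="wild type",
        results=[
            ExpressionResult("E10.5", "heart", True, strength="strong"),
            ExpressionResult("E10.5", "limb bud", False),
        ],
    ),
    ExpressionAssay(
        gene="HypotheticalGene1",
        assay_type="northern blot",
        probe="hypothetical cDNA probe",
        genetic_background="wild type",
        results=[ExpressionResult("E10.5", "brain", True, band_sizes_kb=[2.4])],
    ),
]
```

Keeping results from different assay types in one record shape is what allows a single query, such as `structures_with_expression(EXAMPLE_ASSAYS, "HypotheticalGene1", "E10.5")`, to draw on in situ and blot data together.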
Expression patterns are described using an extensive dictionary of standardized anatomical terms that lists the anatomical structures for each developmental stage in a hierarchical fashion, thus enabling the recording of expression results from assays with different spatial resolution in a consistent manner. The embryonic part of the anatomical dictionary was developed by our collaborators from the Edinburgh Mouse Atlas and Gene Expression Database (EMAGE) project (31); the adult part was developed by the GXD project (32). As well as enabling complex querying capabilities, these detailed annotations make it easier to interpret and compare expression data.

GXD's data content has increased significantly in recent years (Figure 2). Currently, GXD contains data from >24 600 assays that provide >260 000 detailed expression results for nearly 7700 genes, including expression data from almost 1000 different mouse mutants. Two-thirds of these data are linked to images of the primary expression data; GXD currently contains >43 000 images of expression data. This rapid growth in data content was made possible by the daily annotation of expression data from the literature and through the incorporation of large sets of expression data from large-scale RNA in situ hybridization and RT-PCR screens. Recently acquired large dat (...truncated)
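The benefit of a hierarchical anatomical dictionary can be illustrated with a toy example: a result annotated at fine resolution (for instance, a substructure of the heart) should also be returned by a query for the enclosing structure. The sketch below assumes a tiny, made-up part-of hierarchy for a single developmental stage and considers only results where expression was detected. The terms, the `PARENT` mapping, the gene names and the function names are hypothetical and do not reproduce the actual anatomical dictionary or GXD's query implementation.

```python
# Hypothetical child -> parent "part of" links for a single stage.
PARENT = {
    "cardiovascular system": "embryo",
    "heart": "cardiovascular system",
    "atrium": "heart",
    "ventricle": "heart",
    "limb bud": "embryo",
}


def ancestors(term):
    """Yield the term itself, then each enclosing structure up to the root."""
    while term is not None:
        yield term
        term = PARENT.get(term)


def genes_expressed_in(annotations, query_term):
    """Return sorted genes annotated to `query_term` or any of its substructures.

    `annotations` is an iterable of (gene, structure) pairs in which
    expression was detected.
    """
    return sorted({
        gene
        for gene, structure in annotations
        if query_term in ancestors(structure)
    })


# Made-up annotations at different spatial resolutions.
ANNOTATIONS = [
    ("GeneA", "ventricle"),  # fine resolution, e.g. a section in situ
    ("GeneB", "heart"),      # coarse resolution, e.g. a blot on whole heart
    ("GeneC", "limb bud"),
]
```

With these annotations, `genes_expressed_in(ANNOTATIONS, "heart")` returns both `GeneA` and `GeneB`, whereas a query for `"ventricle"` returns only `GeneA`. Walking from each annotated structure up to the root keeps the lookup simple; a larger dictionary would more likely precompute the set of descendants for each term.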


This is a preview of a remote PDF: https://nar.oxfordjournals.org/content/35/suppl_1/D618.full.pdf
Article home page: http://nar.oxfordjournals.org/content/35/suppl_1/D618.abstract

Constance M. Smith, Jacqueline H. Finger, Terry F. Hayamizu, Ingeborg J. McCright, Janan T. Eppig, James A. Kadin, Joel E. Richardson, Martin Ringwald. The mouse Gene Expression Database (GXD): 2007 update, Nucleic Acids Research, 2007, pp. D618-D623, 35/suppl 1, DOI: 10.1093/nar/gkl1003