Integrating mouse anatomy and pathology ontologies into a phenotyping database: Tools for data capture and training
John P. Sundberg, Beth A. Sundberg, Paul Schofield
P. Schofield: Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3DY, UK
J. P. Sundberg (corresponding author), B. A. Sundberg: The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609-1500, USA
The Mouse Disease Information System (MoDIS) is a data capture system for pathology data from laboratory mice, designed to support phenotyping studies. The system integrates the mouse anatomy (MA) and mouse pathology (MPATH) ontologies into a Microsoft Access database, facilitating the coding of organ, tissue, and disease process to recognized semantic standards. Grading of disease severity provides scores for all lesions that can then be used for quantitative trait locus (QTL) analyses and haplotype association gene mapping. Direct linkage to the Pathbase online database provides reference definitions for disease terms and access to photomicrographic images of similar diagnoses in other mutant mice. MoDIS is open source and freely available (http://research.jax.org/faculty/sundberg/index.html). It provides a valuable tool for setting up a mouse pathology phenotyping program.
The relationship between genetic variation and phenotype
is at the heart of the model organism approach to the study
of human disease. In recent years the mouse has become
the model organism of choice for the study of human
disease, partly as a consequence of its physiologic and
genomic similarities, but also because of the developments
in mouse genetics that now provide powerful tools for the
manipulation of the mouse genome (Rosenthal and Brown
2007). The last five years have also seen rapid advances in
the instrumentation and technology available for detailed
phenotyping, and these factors together provide enormous
potential for the advancement of our understanding of gene
function in health and disease.
The torrent of phenotype data currently being generated
from both gene-driven and phenotype-driven experimental
approaches to functional genomics will accelerate over
the next few years. With the accumulation of data now
emerging from the large ethyl-nitrosourea (ENU)
mutagenesis projects (Auwerx et al. 2004) and the ambitious
whole mouse genome mutagenesis projects represented by
the International Knockout Mouse Consortium (Collins
et al. 2007), there is the risk that this will overwhelm our
ability to retain, share, and exploit the resulting
information. The challenges presented by the collection and
analysis of this volume of phenotype data are
unprecedented, not only because of the quantity but also because of the range
and depth of the information. This requires specifically
tailored approaches to the capture and representation of
radically different types of data, for example, craniofacial
morphology and blood chemistry (Brown et al. 2006;
Gkoutos et al. 2005). The dominant approach to this set of
problems is exemplified by that adopted by the
EUMORPHIA consortium using EmPRESS (Green et al. 2005),
where phenotype is represented by a standard assay, which
then defines a set of measurements or descriptions derived
from formal description frameworks and ontologies
(Mallon et al. 2008). The power of this approach is that it allows
for high-resolution data to be captured on individual mice
for one or more assays and then combined to provide data
that can be compared with that from background or control
strains. Relating this accumulated variant phenotype data
to genetic information is then a matter for new
computational tools and resources, many of which are newly
available or under development (Chen et al. 2007; Groth
et al. 2007; Swertz et al. 2004).
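The comparison of assay data against a background or control strain can be illustrated with a minimal sketch. This is not the EmPRESS or EuroPhenome implementation; the function name and the choice of a standardized difference are assumptions made purely for illustration, and real analyses would also need to account for sex, age, and batch effects.

```python
from statistics import mean, stdev


def deviation_from_control(mutant_values, control_values):
    """Express the mutant mean as a number of control standard deviations
    away from the control mean (an illustrative summary, not a formal test).

    Both arguments are sequences of numeric measurements from the same
    assay, one value per mouse.
    """
    if len(control_values) < 2:
        raise ValueError("need at least two control measurements")
    control_sd = stdev(control_values)
    if control_sd == 0:
        raise ValueError("control measurements show no variation")
    return (mean(mutant_values) - mean(control_values)) / control_sd


# Hypothetical values for a single assay measurement, one per mouse.
control = [1.0, 2.0, 3.0]
mutant = [4.0, 4.0]
score = deviation_from_control(mutant, control)
```

A single number of this kind is only a first-pass screen; it is the standardization of the assay itself that makes values from different mice and laboratories comparable in the first place.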
Crucial to the utility of this data is that it is presented in a
formalized way to facilitate data sharing, which requires that
databases use standard data structures and semantics.
Currently, two databases present raw data for individual mouse
strains: the Mouse Phenome Database (Bogue et al. 2007)
(http://www.jax.org/phenome) and the EuroPhenome
Database (http://www.europhenome.org) (Mallon et al. 2008).
Pathology is an essential aspect of phenotyping that
requires labor-intensive workup and detailed knowledge of
laboratory mouse anatomy, physiology, and genetics to be
fully effective. There are two major problems with
recording this aspect of phenotype: standardization of
pathology data, and the availability of pathology expertise
to derive and interpret that data. The latter is a
well-recognized problem: The importance of pathology in
mouse phenotyping cannot be underestimated. However,
the laborious nature of pathology analysis and the
dependence on a small cadre of experts continues to represent a
significant stumbling block to unraveling the mouse
phenome (Brown et al. 2006). Such expertise is not easy to
find (Barthold et al. 2007; Cardiff et al. 2008; Valli et al.
2007) and the perils of DIY pathology are well
illustrated in the article by Cardiff et al. (2008). The gold
standard is represented in the systematic pathology
segment of the German Mouse Clinic's phenotyping process
where there is standardized morphologic phenotyping of
potential mouse models (Mossbrugger et al. 2007).
The depth of data captured, data structure, and description
semantics are not yet fully standardized and require not only
community agreement on the minimal information needed to
record a phenotype but also data capture tools that allow for
rapid and accurate recording of data in a form in which it may
be uploaded to central databases (Mouse Phenotype
Database Integration Consortium 2007). The terminology for
lesions in widespread use is a mixture of veterinary and
human diagnostic names that do not always correspond,
although recent recommendations by the Mouse Models of
Human Cancer Consortium (MMHCC) have gone some way
toward standardization of nomenclature for neoplastic
diseases (Cardiff et al. 2000; Kogan et al. 2002; Nikitin et al.
2004a, b; Shappell et al. 2004). Unfortunately, adoption of
these recommendations has been slow among pathologists
working in different environments and traditions.
Much-needed resources are being developed to provide standard
reference vocabularies for mouse anatomy at the gross level
(mouse anatomy ontology) and disease processes (mouse
pathology ontology) useful at both the gross and microscopic
levels. Integration of these with annotated and labeled line
drawings, gross photographs or photomicrographs, and
literature references provides tools that can be rapidly used
for reference and for training the next generation of mouse
specialist pathologists. These are adjuncts to, not
replacements for, traditional training and mentorship approaches
(Barthold et al. 2007; Sundberg et al. 2004). Unfortunately,
these types of resources are spread across many institutions
worldwide and, if online, are often unlinked.
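To make the idea of coding a diagnosis against reference vocabularies concrete, the following minimal sketch shows one way a lesion record could pair an anatomy term with a pathology term and a severity grade. It is not the MoDIS schema: the class and field names, the placeholder term identifiers, and the 0 to 4 severity scale are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class LesionRecord:
    """One coded diagnosis for one mouse (hypothetical structure)."""
    mouse_id: str
    strain: str
    anatomy_term: str    # identifier from the MA ontology (placeholder here)
    pathology_term: str  # identifier from the MPATH ontology (placeholder here)
    severity: int        # assumed scale: 0 (normal) to 4 (severe)

    def __post_init__(self):
        if not 0 <= self.severity <= 4:
            raise ValueError("severity must be between 0 and 4")


def mean_severity_by_strain(records, anatomy_term, pathology_term):
    """Average the severity scores of one coded lesion for each strain."""
    scores = defaultdict(list)
    for record in records:
        if (record.anatomy_term == anatomy_term
                and record.pathology_term == pathology_term):
            scores[record.strain].append(record.severity)
    return {strain: mean(values) for strain, values in scores.items()}


# Placeholder identifiers, not real ontology terms.
SKIN = "MA:PLACEHOLDER_SKIN"
INFLAMMATION = "MPATH:PLACEHOLDER_INFLAMMATION"

records = [
    LesionRecord("m1", "StrainA", SKIN, INFLAMMATION, 2),
    LesionRecord("m2", "StrainA", SKIN, INFLAMMATION, 3),
    LesionRecord("m3", "StrainB", SKIN, INFLAMMATION, 0),
]
per_strain = mean_severity_by_strain(records, SKIN, INFLAMMATION)
```

Because every record carries the same controlled identifiers, scores from different pathologists can be pooled per strain without reconciling free-text diagnoses. In this sketch, unaffected mice must be recorded explicitly with a severity of 0 to be counted in the strain average.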
Diagnostic laboratories face record-keeping problems
that can be overwhelming. Using traditional approaches to
d (...truncated)