Integrating mouse anatomy and pathology ontologies into a phenotyping database: Tools for data capture and training

https://link.springer.com/content/pdf/10.1007%2Fs00335-008-9123-z.pdf

John P. Sundberg, Beth A. Sundberg, Paul Schofield

J. P. Sundberg (corresponding author), B. A. Sundberg: The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609-1500, USA
P. Schofield: Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge CB2 3DY, UK

The Mouse Disease Information System (MoDIS) is a data capture system for pathology data from laboratory mice designed to support phenotyping studies. The system integrates the mouse anatomy (MA) and mouse pathology (MPATH) ontologies into a Microsoft Access database, facilitating the coding of organ, tissue, and disease process to recognized semantic standards. Grading of disease severity provides scores for all lesions that can then be used for quantitative trait locus (QTL) analyses and haplotype association gene mapping. Direct linkage to the Pathbase online database provides reference definitions for disease terms and access to photomicrographic images of similar diagnoses in other mutant mice. MoDIS is an open-source, freely available program (http://research.jax.org/faculty/sundberg/index.html). It provides a valuable tool for setting up a mouse pathology phenotyping program.

The relationship between genetic variation and phenotype is at the heart of the model organism approach to the study of human disease. In recent years the mouse has become the model organism of choice for the study of human disease, partly as a consequence of its physiologic and genomic similarities, but also because of developments in mouse genetics that now provide powerful tools for the manipulation of the mouse genome (Rosenthal and Brown 2007). The last five years have also seen rapid advances in the instrumentation and technology available for detailed phenotyping, and these factors together provide enormous potential for the advancement of our understanding of gene function in health and disease.
The torrent of phenotype data currently being generated from both gene-driven and phenotype-driven experimental approaches to functional genomics will accelerate over the next few years. With the accumulation of data now emerging from the large ethyl-nitrosourea (ENU) mutagenesis projects (Auwerx et al. 2004) and the ambitious whole mouse genome mutagenesis projects represented by the International Knockout Mouse Consortium (Collins et al. 2007), there is the risk that this will overwhelm our ability to retain, share, and exploit the resulting information. The challenges presented by the collection and analysis of this volume of phenotype data are unprecedented, not only because of the quantity, but also the range and depth of the information. This requires specifically tailored approaches to the capture and representation of radically different types of data, for example, craniofacial morphology and blood chemistry (Brown et al. 2006; Gkoutos et al. 2005). The dominant approach to this set of problems is exemplified by that adopted by the EUMORPHIA consortium using EmPRESS (Green et al. 2005), where phenotype is represented by a standard assay, which then defines a set of measurements or descriptions derived from formal description frameworks and ontologies (Mallon et al. 2008). The power of this approach is that it allows for high-resolution data to be captured on individual mice for one or more assays and then combined to provide data that can be compared with that from background or control strains. Relating this accumulated variant phenotype data to genetic information is then a matter for new computational tools and resources, many of which are newly available or under development (Chen et al. 2007; Groth et al. 2007; Swertz et al. 2004). Crucial to the utility of this data is that it is presented in a formalized way to facilitate data sharing, which requires that databases use standard data structures and semantics. 
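The assay-based representation described above can be illustrated with a minimal sketch. It is not the EmPRESS or EuroPhenome schema: the `Measurement` fields, the assay and parameter names, and the numeric values are all hypothetical, chosen only to show how per-mouse measurements from a standard assay might be pooled by strain and compared with a control strain.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Measurement:
    """One value recorded for one mouse (all field names are illustrative)."""
    mouse_id: str
    strain: str
    assay: str       # hypothetical name of a standard assay
    parameter: str   # hypothetical name of a measured parameter
    value: float


def strain_means(measurements):
    """Average each (strain, assay, parameter) group across individual mice."""
    grouped = defaultdict(list)
    for m in measurements:
        grouped[(m.strain, m.assay, m.parameter)].append(m.value)
    return {key: mean(values) for key, values in grouped.items()}


def difference_from_control(measurements, control_strain):
    """Return strain mean minus control mean for every shared parameter."""
    means = strain_means(measurements)
    diffs = {}
    for (strain, assay, parameter), value in means.items():
        if strain == control_strain:
            continue
        control = means.get((control_strain, assay, parameter))
        if control is not None:
            diffs[(strain, assay, parameter)] = value - control
    return diffs


# Made-up example values, not data from any study.
records = [
    Measurement("m1", "control", "blood_chemistry", "glucose", 8.0),
    Measurement("m2", "control", "blood_chemistry", "glucose", 10.0),
    Measurement("m3", "mutant", "blood_chemistry", "glucose", 12.0),
    Measurement("m4", "mutant", "blood_chemistry", "glucose", 14.0),
]
diffs = difference_from_control(records, "control")
```

A real analysis would add a statistical test and control for sex, age, and batch; the sketch only shows why a shared record structure makes cross-strain comparison straightforward.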
Currently, two databases present raw data for individual mouse strains: the Mouse Phenome Database (Bogue et al. 2007) (http://www.jax.org/phenome) and the EuroPhenome Database (http://www.europhenome.org) (Mallon et al. 2008). Pathology is an essential aspect of phenotyping that requires labor-intensive workup and detailed knowledge of laboratory mouse anatomy, physiology, and genetics to be fully effective. There are two major problems with recording this aspect of phenotype: standardization of pathology data, and the availability of pathology expertise to derive and interpret that data. The latter is a well-recognized problem: "The importance of pathology in mouse phenotyping cannot be underestimated. However, the laborious nature of pathology analysis and the dependence on a small cadre of experts continues to represent a significant stumbling block to unraveling the mouse phenome" (Brown et al. 2006). Such expertise is not easy to find (Barthold et al. 2007; Cardiff et al. 2008; Valli et al. 2007), and the perils of DIY pathology are well illustrated in the article by Cardiff et al. (2008). The gold standard is represented by the systematic pathology segment of the German Mouse Clinic's phenotyping process, where there is standardized morphologic phenotyping of potential mouse models (Mossbrugger et al. 2007). The depth of data captured, data structure, and description semantics are not yet fully standardized and require not only community agreement on the minimal information needed to record a phenotype but also data capture tools that allow for rapid and accurate recording of data in a form in which it may be uploaded to central databases (Mouse Phenotype Database Integration Consortium 2007).
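As a rough illustration of what such a data capture record might look like, the sketch below codes a single lesion with an anatomy term, a pathology term, and a severity grade, then serializes it for upload. It is not the MoDIS schema (MoDIS is a Microsoft Access database): the `LesionRecord` fields, the 0 to 4 severity scale, and the placeholder term IDs `MA:0000000` and `MPATH:0` are assumptions made for the example, and only the `MA:` and `MPATH:` prefixes are checked.

```python
import json
from dataclasses import asdict, dataclass

MAX_SEVERITY = 4  # assumed grading scale of 0..4; not taken from the source


@dataclass(frozen=True)
class LesionRecord:
    """One coded lesion for one mouse (hypothetical structure)."""
    mouse_id: str
    strain: str
    anatomy_term: str    # mouse anatomy (MA) ontology ID
    pathology_term: str  # mouse pathology (MPATH) ontology ID
    severity: int

    def __post_init__(self):
        # Reject free-text entries so every record uses ontology identifiers.
        if not self.anatomy_term.startswith("MA:"):
            raise ValueError(f"not an MA term: {self.anatomy_term!r}")
        if not self.pathology_term.startswith("MPATH:"):
            raise ValueError(f"not an MPATH term: {self.pathology_term!r}")
        if not 0 <= self.severity <= MAX_SEVERITY:
            raise ValueError(f"severity out of range: {self.severity}")


def to_upload_json(records):
    """Serialize records to a JSON array suitable for exchange."""
    return json.dumps([asdict(r) for r in records], sort_keys=True)


# Placeholder IDs only; real records would use actual ontology terms.
example = LesionRecord("mouse-001", "example-strain", "MA:0000000", "MPATH:0", 2)
payload = to_upload_json([example])
```

Validating at entry time keeps free-text diagnoses out of the record, and the numeric severity field is what makes the data usable as a quantitative trait in later mapping analyses.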
The terminology for lesions in widespread use is a mixture of veterinary and human diagnostic names that do not always correspond, although recent recommendations by the Mouse Models of Human Cancer Consortium (MMHCC) have gone some way toward standardization of nomenclature for neoplastic diseases (Cardiff et al. 2000; Kogan et al. 2002; Nikitin et al. 2004a, b; Shappell et al. 2004). Unfortunately, adoption of these recommendations has been slow among pathologists working in different environments and traditions. Much-needed resources are being developed to provide standard reference vocabularies for mouse anatomy at the gross level (mouse anatomy ontology) and for disease processes (mouse pathology ontology) at both the gross and microscopic levels. Integration of these with annotated and labeled line drawings, gross photographs or photomicrographs, and literature references provides tools that can be rapidly used for reference and for training the next generation of mouse specialist pathologists. These are adjuncts to, not replacements for, traditional training and mentorship approaches (Barthold et al. 2007; Sundberg et al. 2004). Unfortunately, these types of resources are spread across many institutions worldwide and, if online, are often unlinked. Diagnostic laboratories face record-keeping problems that can be overwhelming. Using traditional approaches to d (...truncated)