Toward a Checklist for Exchange and Interpretation of Data from a Toxicology Study
Jennifer M. Fostel (3), Lyle Burgoon (0), Craig Zwickl (2), Peter Lord (5),
J. Christopher Corton (4), Pierre R. Bushel (j), Michael Cunningham (k),
Liju Fan (j, k), Stephen W. Edwards (4), Susan Hester (4), James Stevens (2),
Weida Tong (1), Michael Waters (3), ChiHae Yang (0), Raymond Tennant (k)
0 Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824
1 National Center for Toxicological Research, Jefferson, Arkansas 72079
2 Lilly Research Laboratory, Greenfield, Indiana 46140
3 NIEHS, LMIT ITSS Contract, Research Triangle Park, North Carolina 27709-2233
4 National Health and Environmental Effects Research Laboratory, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711
5 Johnson and Johnson PRD, Raritan, New Jersey 08869
Data from toxicology and toxicogenomics studies are valuable,
and can be combined for meta-analysis using public data
repositories such as Chemical Effects in Biological Systems
Knowledgebase, ArrayExpress, and Gene Expression Omnibus. In order to
fully utilize the data for secondary analysis, it is necessary to have
a description of the study and good annotation of the accompanying
data. This study annotation permits sophisticated cross-study
comparison and analysis, and allows data from comparable subjects
to be identified and fully understood. The Minimal Information
About a Microarray Experiment Standard was proposed to permit
deposition and sharing of microarray data. We propose the first step
toward an analogous standard for a toxicogenomics/toxicology
study, by describing a checklist of information that best practices
would suggest be included with the study data. When the
information in this checklist is deposited together with the study data,
it helps the public explore the study data in the context of time,
identify data from similarly treated subjects, and identify
potential sources of experimental variability.
The proposed checklist summarizes useful information to include
when sharing study data for publication, deposition into a database,
or electronic exchange with collaborators. It is not a description of
how to carry out an experiment, but a definition of how to describe
an experiment. We anticipate that once a toxicology checklist
is accepted and put into use, toxicology databases can be
configured to require and output these fields, making it
straightforward to annotate data for interpretation by others.
This manuscript has been reviewed and approved for publication by the
Environmental Protection Agency but does not necessarily reflect the views of
the Agency. Mention of trade names or commercial products does not constitute
endorsement or recommendations for use.
1 To whom correspondence should be addressed at NIEHS, MD F1-05, PO
Box 12233, 111 Alexander Drive, Research Triangle Park, NC, 27709-2233.
This article arises from the authors' experience with
databases, data exchange, and interpretation of cross-study data. It
does not describe how to do a study, but rather defines useful
information to include in describing the data from the study
when it is published or exchanged with collaborators. The
checklist herein reflects best practice, i.e., the ideal annotation
to include with data to permit interpretation in the context of
the study. This checklist focuses on the biological information
needed to interpret data from a study, and thus is a logical
complement to the technology-focused checklists under
development now. With community use, we anticipate that the
minimum information required for interpretation of a study
will emerge from this checklist.
A biological or biomedical investigation is viewed as a
self-contained unit of scientific enquiry. Investigations often
include studies of biological subjects, which are examined in situ
or in a laboratory, in observational or perturbational studies.
This proposal is to expand the data exchange checklist for
toxicology/toxicogenomics to reflect the fact that a
toxicological study can focus on any one of a number of different subject
types. Therefore, it makes more sense to create a checklist
tailored to the particular subject type and study design used
rather than to aim for a single toxicology study data
checklist. This is not intended to be a closed, complete list,
but rather to be an illustrative proposal and a living document,
so that as additional aspects of study and subject are found to
be critical to understanding a study, these pieces will be
included in this checklist, and in the data exchange associated
with it.
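
As a minimal sketch of this idea, a checklist tailored to subject type can be represented as a set of field names per subject type, with a check that reports which fields a study description still lacks. All field names and subject types below are hypothetical placeholders chosen for illustration; they are not the fields proposed in this article, and the functions are not part of CEBS or any other database.

```python
# Hypothetical field names for illustration only; they are NOT the
# checklist fields proposed in this article.
COMMON_FIELDS = {
    "study_title", "subject_type", "stressor", "dose", "time_of_observation",
}

# One checklist per subject type, built on the fields common to all studies.
CHECKLISTS = {
    "in_vivo": COMMON_FIELDS | {"species", "strain", "sex"},
    "cell_culture": COMMON_FIELDS | {"cell_line", "culture_medium"},
}


def missing_fields(record):
    """Return the sorted checklist fields that are absent or empty in a record."""
    subject_type = record.get("subject_type")
    if subject_type not in CHECKLISTS:
        raise ValueError(f"no checklist defined for subject type: {subject_type!r}")
    required = CHECKLISTS[subject_type]
    # A value of 0 (for example, a vehicle-control dose) counts as present.
    return sorted(f for f in required if record.get(f) in (None, ""))


def register_field(subject_type, field):
    """Add a field to a checklist, reflecting its role as a living document."""
    CHECKLISTS.setdefault(subject_type, set(COMMON_FIELDS)).add(field)
```

Under these assumptions, an in vivo record that supplies every common field and the species, but neither strain nor sex, would be reported as missing `['sex', 'strain']`. Calling `register_field("in_vivo", "diet")` would then make `diet` part of the in vivo checklist for all subsequent checks.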
Databases such as the Chemical Effects in Biological Systems (CEBS)
Knowledgebase (Waters et al., 2003), ArrayTrack
(Tong et al., 2003, 2004), and dbZach (Burgoon et al., 2006) all
collect and store study data. At the moment only CEBS is a
public data repository, but all three database initiatives will
benefit from consensus around the minimal information needed
to interpret a study, and therefore from a checklist, such as the
one proposed here, of the information important to include in a study
description. It is important to keep in mind that the data fields
included in the following tables are intended to be close to
the minimum data required for exchange and interpretation of
a biomedical study. The CEBS Data Dictionary (CEBS-DD;
Fostel et al., 2005) includes a longer checklist of additional
data which enriches interpretation of the study data, and
supports meta-analysis of data from multiple studies. The aim
of the CEBS-DD was to define the maximal set of data elements
that could be used to describe a study; this set is growing as
additional studies are deposited in CEBS. The current effort is
to identify the minimal set of data elements without which it is
difficult or impossible to interpret data from a study.
A number of standards and data exchange checklist
initiatives are currently underway. At the moment, these
initiatives are each focused on a specific technology or on a
particular scientific discipline, and do not fully represent the
accompanying biology. Technology-focused examples include
microarray/transcriptomics (the Minimal Information About
a Microarray Experiment [MIAME]; Brazma et al., 2001;
Microarray Gene Expression Data [MGED] Society
Transcriptomics Working Group,
http://fugo.sourceforge.net/community/community.php),
proteomics (Proteomics Standards Initiative [PSI],
http://psidev.sourceforge.net/), metabolomics/metabonomics
(Metabolomics Standards Initiative,
http://msiworkgroups.sourceforge.net/), and in situ hybridization
(Minimum Information Specification For In Situ Hybridization and
Immunohistochemistry Experiments,
http://scgap.systemsbiology.net/standards/misfishie/).
Discipline-focused examples include nutrigenomics (MIAME-Nut; see
http://www.mged.org/Workgroups/rsbi/rsbi.html) and environmental
work (MIAME-Env; http://nebc.nox.ac.uk/miame/miame_env.html).
An early effort also created a MIAME-Tox
(http://www.ebi.ac.uk/microarray/doc/standards.html).
The Minimum Information (...truncated)