The CellML Model Repository
Catherine M. Lloyd
James R. Lawson
Peter J. Hunter
Poul F. Nielsen
Associate Editor: Jonathan Wren
Auckland Bioengineering Institute, The University of Auckland, Auckland 1010, New Zealand
Summary: The CellML Model Repository provides free access to over 330 biological models. The vast majority of these models are derived from published, peer-reviewed papers. Model curation is an important and ongoing process to ensure the CellML model is able to accurately reproduce the published results. As the CellML community grows, and more people add their models to the repository, model annotation will become increasingly important to facilitate data searches and information retrieval.
Availability: The CellML Model Repository is publicly accessible at http://www.cellml.org/models
Contact:
The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email:
1 INTRODUCTION
High-throughput experimental techniques have led to the population
of web-accessible databases with vast amounts of biological data.
Mathematical models of biological systems are playing an essential
role in the interpretation of these data. The scientific community
now faces the challenge of the mathematical models themselves
becoming increasingly complex and numerous. There is a need
for centralized databases to store all these models in standard
formats to make them easily accessible and reusable by the research
community. Publishing the models in a standard format, concurrent
with the submission of a written paper, will eliminate many
of the errors introduced into the model during the publication
process. Here we introduce the CellML Model Repository
(http://www.cellml.org/models) and discuss it as a solution to these
challenges. The BioModels database (Le Novère et al., 2006)
is a similar effort, containing biochemical pathway models that
have been described in peer-reviewed publications, expressed in
SBML (Hucka et al., 2003). Similarly, JWS Online (Olivier and
Snoep, 2004) is a repository of kinetic models describing biological
systems, and ModelDB (Hines et al., 2004) is a database which
stores published models in the field of computational neuroscience.
CellML (Lloyd et al., 2004) and the CellML Model Repository are
part of the IUPS Physiome Project (Hunter and Nielsen, 2005) effort
to create a virtual physiological human. The explicit representation
of modularity, together with the flexible nature of the CellML
language which allows the description of a diverse range of cellular
and subcellular systems, are two essential features of CellML with
regard to its role in the Physiome Project.
The CellML Model Repository started out as a set of
examples to illustrate how the language could be applied to describe
various biological processes, and to test its features as the language
evolved. Later, once the CellML 1.0 specification was stabilized,
the CellML repository became a collection of CellML descriptions
of models drawn from peer-reviewed journal publications. The
CellML Model Repository has since undergone significant growth,
with over 330 freely available, quantitative models of biological
processes taken from the peer-reviewed literature. In contrast with
other databases, such as BioModels, JWS Online and ModelDB, which
focus on specific areas such as systems biology pathway models
or computational neuroscience, the CellML Model Repository
contains models describing a wide range of biological processes,
including signal transduction pathways, metabolic pathways,
electrophysiology, immunology, the cell cycle, muscle contraction,
and mechanical models and constitutive laws. This wide scope
exemplifies CellML's ability to describe much of the biochemistry,
electrophysiology and mechanics of the intracellular environment.
Lumped parameter models dealing with systems physiology (e.g.
blood pressure control, fluid retention, electrolyte balance, endocrine
function, etc.) are also within the scope of CellML.
2 MODEL CURATION
Currently, of the more than 330 models in the CellML Model Repository,
approximately half have been curated to some degree. A star system
signifies the curation status of a CellML model. No stars indicates
that the model has yet to be curated (level 0); one star denotes that the
CellML model is consistent with the published paper (level 1); two
stars indicate that the CellML model has been checked for typographical
errors, unit consistency, completeness (i.e. there are no missing
parameters or equations) and overconstraints and, arguably most
importantly, that it is capable of reproducing the published results
(level 2). Three stars indicate that the model is known to satisfy
physical constraints such as conservation of mass, momentum, charge,
etc.; at this level the curation is conducted by a domain expert
(level 3).
From experience, we have found that levels 1 and 2 can be
mutually exclusive: a CellML model that matches the published paper
exactly may not reproduce the published results. Frequently, errors
introduced into the model during the publication process require us
to correct minor typographical errors or unit inconsistencies, and/or
to contact the original model author to request missing parameter
values or equations.
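The star system can be summarized in a few lines of Python. This is only an illustrative sketch, not part of the repository software: the names CurationLevel, CurationChecks and star_level are hypothetical, and the rule that the highest satisfied criterion determines the star count (without requiring the lower levels to hold, since levels 1 and 2 can be mutually exclusive) is an assumption.

```python
from dataclasses import dataclass
from enum import IntEnum


class CurationLevel(IntEnum):
    """Star ratings as described in the text (names are hypothetical)."""
    UNCURATED = 0              # no stars: not yet curated
    MATCHES_PAPER = 1          # one star: consistent with the published paper
    REPRODUCES_RESULTS = 2     # two stars: checked and reproduces published results
    PHYSICALLY_CONSISTENT = 3  # three stars: satisfies physical constraints


@dataclass
class CurationChecks:
    """Outcome of the individual curation checks for one model."""
    matches_paper: bool = False
    reproduces_results: bool = False
    satisfies_physical_constraints: bool = False


def star_level(checks: CurationChecks) -> CurationLevel:
    # Assumption: the highest satisfied criterion wins, and lower levels
    # are not required, because levels 1 and 2 can be mutually exclusive.
    if checks.satisfies_physical_constraints:
        return CurationLevel.PHYSICALLY_CONSISTENT
    if checks.reproduces_results:
        return CurationLevel.REPRODUCES_RESULTS
    if checks.matches_paper:
        return CurationLevel.MATCHES_PAPER
    return CurationLevel.UNCURATED
```

Under this assumption, a model that was corrected so that it reproduces the published figures, and therefore no longer matches the printed equations exactly, would still be rated at level 2.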
The process of model curation involves the following sequence
of actions:
(1) The CellML model is loaded into an editing and simulation
environment such as the Physiome CellML Environment
(PCEnv) or Cellular Open Resource (COR). Any obvious
typographical errors and unit inconsistencies are corrected,
which is facilitated by a series of error messages and validation
prompts generated by the software, and the rendering of the
MathML equations in an easily readable format.
(2) Assuming the model is able to be run, we then compare the
simulation output with the results in the published paper;
this typically involves comparing the graphical results with the
published figures.
(3) If we cannot get the CellML model to run, or the simulation
output disagrees with the published results, we then attempt to
contact the original model author(s) and seek their advice and,
where possible, obtain the original model code, which may be
in a wide range of different programming languages.
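The decision made at the end of this sequence can be sketched as follows. This is a hypothetical illustration only: in practice the comparison is made by eye against the published figures, so the numerical comparison, the 5% tolerance, and the names traces_agree and next_step are all assumptions introduced here, as is the idea that values have been digitized from a published figure.

```python
import math
from typing import Optional, Sequence


def traces_agree(simulated: Sequence[float],
                 digitized: Sequence[float],
                 rel_tol: float = 0.05) -> bool:
    """Compare simulation output with values read off a published figure.

    Assumption: both traces are sampled at the same points, and a 5%
    relative tolerance stands in for a visual comparison of the curves.
    """
    if len(simulated) != len(digitized):
        return False
    return all(math.isclose(s, d, rel_tol=rel_tol, abs_tol=1e-9)
               for s, d in zip(simulated, digitized))


def next_step(simulated: Optional[Sequence[float]],
              digitized: Sequence[float]) -> str:
    """Return the next curation action for one model.

    `simulated` is None when the model could not be run at all.
    """
    if simulated is None:
        return "contact authors"  # the model does not run
    if traces_agree(simulated, digitized):
        return "level 2"          # reproduces the published results
    return "contact authors"      # output disagrees with the paper
```

For example, next_step(None, [1.0]) gives "contact authors", whereas a simulated trace that lies within the tolerance of the digitized values gives "level 2".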
We aim to complete the curation of all the models in the CellML
repository, ideally to the level that they replicate the results in the
published paper (level 2); however, we acknowledge this will not be
possible for all models. Given the dynamic, growing nature of
the CellML Model Repository, we have designed it with the concept
of community curation in mind, so that groups of expert modellers
with vested interests in particular models are able to collaborate on
their curation.
3 MODEL ANNOTATION
Metadata, the extra information associated with a model, are
embedded in CellML using the W3C-approved RDF standard. In
order for a CellML model to be committed to the repository, at
the very leas (...truncated)