ONTO-PERL: An API for supporting the development and analysis of bio-ontologies
Erick Antezana 1,2; Mikel Egaña 0; Bernard De Baets 3; Martin Kuiper 1,2; Vladimir Mironov 1,2
Associate Editor: Alex Bateman
0 University of Manchester, School of Computer Science, Oxford Road, M13 9PL Manchester, UK
1 Department of Molecular Genetics, Ghent University, Technologiepark 927, 9052 Gent, Belgium
2 Department of Plant Systems Biology, VIB
3 Department of Applied Mathematics, Biometrics and Process Control, Ghent University, Coupure links 653, 9000 Gent, Belgium
Motivation: Many biomedical ontologies use OBO or OWL as their knowledge representation language. The rapid increase in the number of such ontologies calls for adequate tools to facilitate their use. In particular, there is a pressing need to handle such ontologies programmatically in many applications, including data integration, text mining and semantic applications supporting translational research.
Results: We present an Application Programming Interface (API) called ONTO-PERL. This API significantly extends the repertoire of available tools supporting the development and analysis of bio-ontologies.
Availability: The source code as well as sample usage scripts can be found at: http://search.cpan.org/dist/ONTO-PERL/
Contact:
1 INTRODUCTION
Ontologies support consistent and unambiguous knowledge
sharing and provide a framework for knowledge integration.
More specifically, ontologies represent the agreed knowledge
about a domain of discourse. The knowledge is represented by
creating a single model with the terms of the domain as well as
the relationships between those terms (Stevens et al., 2007). The
relationships between terms effectively define what properties a
given term must have. Entities are also linked to
human-readable information like labels. Thus, an ontology links term
labels to their interpretations, i.e. specifications of their
meanings, defined as a set of properties. As such, ontologies
can be used to support automatic semantic interpretation of
textual information, thereby providing a basis for advanced
text mining (Doms et al., 2005; Müller et al., 2004). Moreover,
structured and integrated knowledge provides a basis for
advanced reasoning to validate hypotheses and generate new
knowledge (Blake et al., 2006; Myhre et al., 2006). Reasoning
services can be used to re-engineer the design of parts of
the whole ontology (such as classification) or to design entirely
new extensions that comply with the current knowledge
(Wolstencroft et al., 2007). All these scenarios and applications
need foundational tools to deal with ontologies.
OBO¹ and OWL² are becoming the de facto knowledge
representation languages in the biomedical domain. OBO is
human-readable and it has gained wide acceptance. Many
ontologies, such as GO (The Gene Ontology Consortium, 2000),
are expressed in OBO. However, OBO does not have an explicit
and well-defined semantics. In contrast, OWL is
computer-readable since it does have such a semantics, and, hence,
automated reasoning can be performed on OWL ontologies.
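To make the human readability of OBO concrete, the following is a minimal sketch in Python of how the tag-value stanzas of an OBO file can be read into a term table and how is_a links can be followed. It is not the ONTO-PERL API: the function names `parse_obo` and `ancestors` are hypothetical, the three-term fragment is a toy example loosely modeled on GO with simplified is_a links, and the parser ignores escape sequences, trailing modifiers and every stanza type other than [Term].

```python
from collections import defaultdict

# Toy OBO fragment (illustrative only; the is_a links are simplified).
OBO_TEXT = """\
format-version: 1.2

[Term]
id: GO:0008150
name: biological_process

[Term]
id: GO:0007049
name: cell cycle
is_a: GO:0008150 ! biological_process

[Term]
id: GO:0000278
name: mitotic cell cycle
is_a: GO:0007049 ! cell cycle
"""


def parse_obo(text):
    """Return {term_id: {tag: [values]}} for every [Term] stanza."""
    terms = {}
    stanza = None  # tag/value table of the stanza being read, if any
    for raw in text.splitlines():
        # Everything after "!" is a comment (escapes are not handled here).
        line = raw.split("!", 1)[0].strip()
        if not line:
            continue
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            stanza = defaultdict(list) if line == "[Term]" else None
            continue
        if stanza is None:
            continue  # header line or a stanza type we skip
        tag, _, value = line.partition(":")
        tag, value = tag.strip(), value.strip()
        stanza[tag].append(value)
        if tag == "id":
            terms[value] = stanza
    return terms


def ancestors(terms, term_id):
    """Collect all terms reachable from term_id through is_a links."""
    seen = set()
    stack = list(terms[term_id].get("is_a", []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(terms.get(parent, {}).get("is_a", []))
    return seen


terms = parse_obo(OBO_TEXT)
print(sorted(terms))
print(sorted(ancestors(terms, "GO:0000278")))
```

Following is_a links in this way is plain graph traversal over the asserted relationships; it is not the kind of automated reasoning that a formal semantics, as in OWL, makes possible.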
Several tools are currently available to manage and develop
OBO and OWL ontologies, either in the form of ontology
editors or APIs. Within the bio-ontology community,
OBO-Edit (Day-Richter et al., 2007; OBO-centered) and Protégé³
(OWL-centered) are the most frequently used ontology-building
environments. Protégé also has a plug-in for loading OBO
ontologies (Moreira et al., 2007). Both ontology editors offer
open Java APIs that can be used to build applications and
explore bio-ontologies. There also exist some independent APIs
(or API-like tools) in Java and PERL. In Java, OWL or OBO
ontologies can be loaded and managed with the OWL API.⁴
In PERL, go-perl,⁵ GO::Term::Finder (Boyle et al., 2004) and
Bio::Ontology⁶ are available. go-perl and GO::Term::Finder
are GO-specific, and therefore many bio-ontologies, such as
those under the OBO Foundry,⁷ cannot be handled easily
without tweaking the code. Bio::Ontology is not GO-specific
but it lacks important functionality, for instance, to intersect
two ontologies, unify ontologies or export to different formats
(OWL, XML, DOT, etc.). Moreover, it lacks modularity in
annotations (such as def, synonym and dbxref). Therefore, we
present ONTO-PERL, an OBO-centered PERL API that
provides a turnkey service to help bio-ontologists handle
ontologies, explore data and perform mining.
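The operations named above (intersecting two ontologies, unifying them, exporting to DOT) can be illustrated with a small, self-contained Python sketch. It makes several assumptions: an ontology is modeled as a plain dictionary of term names plus a set of (child, relationship, parent) triples, the `EX:` identifiers and the function names `intersect`, `unite` and `to_dot` are hypothetical, and the intersection is a naive set-based one. It does not reproduce how ONTO-PERL defines or implements these operations.

```python
def intersect(a, b):
    """Terms found in both ontologies, plus relationships both assert."""
    shared = {tid: name for tid, name in a["terms"].items() if tid in b["terms"]}
    rels = {
        (child, rel, parent)
        for (child, rel, parent) in a["rels"] & b["rels"]
        if child in shared and parent in shared
    }
    return {"terms": shared, "rels": rels}


def unite(a, b):
    """All terms and relationships of either ontology (names from a win)."""
    return {"terms": {**b["terms"], **a["terms"]}, "rels": a["rels"] | b["rels"]}


def to_dot(onto, name="ontology"):
    """Serialize an ontology as a Graphviz DOT digraph."""
    lines = ["digraph %s {" % name]
    for tid in sorted(onto["terms"]):
        label = onto["terms"][tid]
        lines.append('  "%s" [label="%s"];' % (tid, label))
    for child, rel, parent in sorted(onto["rels"]):
        lines.append('  "%s" -> "%s" [label="%s"];' % (child, parent, rel))
    lines.append("}")
    return "\n".join(lines)


# Two toy ontologies with hypothetical identifiers.
onto_a = {
    "terms": {"EX:0001": "process", "EX:0002": "cell cycle", "EX:0003": "mitosis"},
    "rels": {("EX:0002", "is_a", "EX:0001"), ("EX:0003", "part_of", "EX:0002")},
}
onto_b = {
    "terms": {"EX:0001": "process", "EX:0002": "cell cycle", "EX:0004": "meiosis"},
    "rels": {("EX:0002", "is_a", "EX:0001"), ("EX:0004", "is_a", "EX:0001")},
}

common = intersect(onto_a, onto_b)
merged = unite(onto_a, onto_b)
print(to_dot(common))
```

The DOT text can be passed to Graphviz for rendering. A production API would also need to decide how to treat relationships that hold only by inference in one of the ontologies, which this sketch deliberately leaves out.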
¹ http://www.geneontology.org/GO.format.obo-1_2.shtml
² http://www.w3.org/TR/owl-features/
³ http://protege.stanford.edu/
⁴ http://owlapi.sourceforge.net/
⁵ http://amigo.geneontology.org/dev/go-perl/doc/go-perl-doc.html
⁶ http://search.cpan.org/dist/bioperl/
⁷ http://obofoundry.org/
Fig. 1. Simplified object model of ONTO-PERL.
2 IMPLEMENTATION
ONTO-PERL comprises an extensible set of object-oriented
PERL modules that can be used for programmatically working
with ontologies. ONTO-PERL can be installed like any typical
CPAN module.⁸ A set of comprehensive test files is included in
the distributi (...truncated)