Aber-OWL: a framework for ontology-based data access in biology
Hoehndorf et al. BMC Bioinformatics
Robert Hoehndorf 0 1
Luke Slater 0 1 2
Paul N Schofield 3
Georgios V Gkoutos 2
0 Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology , 4700 KAUST, 23955-6900 Thuwal , Saudi Arabia
1 Computational Bioscience Research Center, King Abdullah University of Science and Technology , 4700 KAUST, 23955-6900 Thuwal , Saudi Arabia
2 Department of Computer Science, Aberystwyth University , Llandinam Building, SY23 3DB Aberystwyth , UK
3 Department of Physiology, Development & Neuroscience, University of Cambridge , Downing Street, CB2 3EG Cambridge , UK
Background: Many ontologies have been developed in biology and these ontologies increasingly contain large volumes of formalized knowledge commonly expressed in the Web Ontology Language (OWL). Computational access to the knowledge contained within these ontologies relies on the use of automated reasoning. Results: We have developed the Aber-OWL infrastructure that provides reasoning services for bio-ontologies. Aber-OWL consists of an ontology repository, a set of web services and web interfaces that enable ontology-based semantic access to biological data and literature. Aber-OWL is freely available at http://aber-owl.net. Conclusions: Aber-OWL provides a framework for automatically accessing information that is annotated with ontologies or contains terms used to label classes in ontologies. When using Aber-OWL, access to ontologies and data annotated with them is not merely based on class names or identifiers but rather on the knowledge the ontologies contain and the inferences that can be drawn from it.
Keywords: Ontology-based data access; Linked data; OWL
Background
A large number of ontologies have been developed for the
annotation of biological and biomedical data, commonly
expressed in the Web Ontology Language (OWL) [1] or
an OWL-compatible language such as the OBO Flatfile
Format [2]. Access to the full extent of knowledge
contained in ontologies is facilitated by automated reasoners
that can compute the ontologies' underlying taxonomy
and answer queries over the ontology content.
While ontology repositories, such as BioPortal [3] and
the Ontology Lookup Service (OLS) [4], provide web
services and interfaces to access ontologies, including their
metadata such as author names and licensing, the list of
classes and asserted structure, they do not enable
computational access to the semantic content of the ontologies
and the inferences that can be drawn from them. Access
to the semantic content of ontologies usually requires
further inferences to reveal the consequences of statements
(axioms) asserted in an ontology; these consequences may
be automatically derived using an automated reasoner. To
the best of our knowledge, no reasoning infrastructure
that supports semantically enabled access to biological
and biomedical ontologies currently exists.
Here, we present Aber-OWL, a reasoning infrastructure
over ontologies consisting of an ontology repository, web
services that facilitate semantic queries over ontologies
specified by a user or contained in Aber-OWL's
repository, and a user interface. Such an infrastructure can
not only enable access to knowledge contained in
ontologies, but crucially can also be used for semantic queries
over data annotated with ontologies, including the large
volumes of data that are increasingly becoming available
through public SPARQL endpoints [5]. Allowing access
to data through an ontology is known as the
ontology-based data access paradigm [6,7], and can exploit formal
information contained in ontologies to:
• identify possible inconsistencies and incoherent
  descriptions [8],
• enrich possibly incomplete data with background
  knowledge so as to obtain more complete answers to
  a query (e.g., if a data item referring to an organism
  has been characterized with multiple findings that
  together constitute a disease, then the data item can
  be returned when querying for the disease even in the
  absence of it being explicitly declared in a database)
  [6,9],
• enrich the data schema used to query data sources
  with additional information (e.g., by using a class in a
  query that is an inferred super-class of one or more
  classes that are used to annotate data items, but the
  class itself is never used to characterize data) [6], and
• provide a uniform view over multiple data sources
  with possibly heterogeneous, multi-modal data [6,7].
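The third use case above (querying with a super-class that is never itself used for annotation) can be illustrated with a minimal sketch. The taxonomy, the annotations, and the function names below are hypothetical and are not taken from Aber-OWL. The sketch only follows an already-computed subclass hierarchy, whereas an automated reasoner would derive that hierarchy from the axioms of the ontology.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: class -> list of direct super-classes.
SUBCLASS_OF = {
    "ventricular septal defect": ["heart malformation"],
    "atrial septal defect": ["heart malformation"],
    "heart malformation": ["cardiovascular abnormality"],
}

# Hypothetical annotations: data item -> class used to annotate it.
# Note that "heart malformation" is never used to annotate an item.
ANNOTATIONS = {
    "item-1": "ventricular septal defect",
    "item-2": "atrial septal defect",
    "item-3": "cardiovascular abnormality",
}


def descendants(cls, subclass_of):
    """Return cls together with all of its direct and indirect subclasses."""
    children = defaultdict(set)
    for child, parents in subclass_of.items():
        for parent in parents:
            children[parent].add(child)
    seen, stack = {cls}, [cls]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def items_for(cls, subclass_of, annotations):
    """Return data items annotated with cls or with any of its subclasses."""
    wanted = descendants(cls, subclass_of)
    return sorted(item for item, c in annotations.items() if c in wanted)
```

Querying for "heart malformation" in this toy setting returns item-1 and item-2, although neither item is annotated with that class directly.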
To demonstrate how Aber-OWL can be used for
ontology-based access to data, we provide a service that
performs a semantic search over PubMed and PubMed
Central articles using the results of an Aber-OWL query,
and a service that performs SPARQL query extension
so that the results of Aber-OWL queries can be used
to retrieve data accessible through public SPARQL
endpoints. In Aber-OWL, following the ontology-based data
access paradigm [6,7], we specify the features of the
relevant information at the ontology and knowledge level
[10], and retrieve named classes in ontologies satisfying
these conditions using an automated reasoner, i.e., a
software program that can identify whether a class in an
ontology satisfies certain conditions based on the axioms
specified in an ontology.
Subsequently, we embed the resulting information in
database, Linked Data or literature queries.
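One way to picture the embedding step for Linked Data queries is to paste the class IRIs returned by the reasoner into a SPARQL VALUES clause. The sketch below is an illustration under stated assumptions, not the query extension syntax used by Aber-OWL: the placeholder marker #CLASSES#, the predicate http://example.org/annotatedWith, and the function name extend_query are all hypothetical.

```python
# Hypothetical query template; "#CLASSES#" marks where the class list goes.
TEMPLATE = """SELECT ?item WHERE {
  #CLASSES#
  ?item <http://example.org/annotatedWith> ?cls .
}"""


def extend_query(template, variable, class_iris):
    """Replace the placeholder with a VALUES clause binding ?variable to class_iris."""
    if not class_iris:
        raise ValueError("the reasoner query returned no classes")
    for iri in class_iris:
        # Reject characters that would break out of the <...> IRI syntax.
        if any(ch in iri for ch in '<>" \n\t'):
            raise ValueError(f"not a usable IRI: {iri!r}")
    values = " ".join(f"<{iri}>" for iri in class_iris)
    clause = f"VALUES ?{variable} {{ {values} }}"
    return template.replace("#CLASSES#", clause)


if __name__ == "__main__":
    # Hypothetical IRIs standing in for the result of a reasoner query.
    print(extend_query(TEMPLATE, "cls",
                       ["http://example.org/A", "http://example.org/B"]))
```

The extended query can then be sent to a SPARQL endpoint unchanged, so the endpoint itself does not need to perform any reasoning.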
Aber-OWL can be accessed at http://aber-owl.net. The
Aber-OWL software is freely available at
https://github.com/reality/SparqOWL and can be installed locally by users
who want to provide semantic access to their own
ontologies and support the use of their ontologies in semantic
queries.
Methods
Aber-OWL
The Aber-OWL software can be configured with a list of
URIs that contain ontology documents (i.e., OWL files)
and employs the OWL API [11] to retrieve the
ontologies that are to be included in the repository. For each
ontology document included in the repository, the labels
and definitions of all classes contained within the ontology
(as well as within all the ontologies it imports) are identified
based on OBO Foundry standards and recommendations:
we use the rdfs:label annotation property to identify
class labels, and we employ the definition
(http://purl.obolibrary.org/obo/IAO_0000115) annotation
property, defined in the Information Artifact Ontology, to
identify the text definition of a class.
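The label and definition lookup can be sketched as follows. Aber-OWL itself relies on the OWL API; the sketch below instead parses a small RDF/XML fragment with Python's standard library, purely for illustration. The two-class ontology, the example.org IRIs, and the function name are hypothetical, and imported ontologies are not handled.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "obo": "http://purl.obolibrary.org/obo/",
}

# Hypothetical two-class ontology in RDF/XML, used only for illustration.
ONTOLOGY = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:obo="http://purl.obolibrary.org/obo/">
  <owl:Class rdf:about="http://example.org/onto#Heart">
    <rdfs:label>heart</rdfs:label>
    <obo:IAO_0000115>A hollow muscular organ that pumps blood.</obo:IAO_0000115>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/onto#Lung">
    <rdfs:label>lung</rdfs:label>
  </owl:Class>
</rdf:RDF>"""


def labels_and_definitions(rdf_xml):
    """Map each named class IRI to its rdfs:label and IAO_0000115 definition."""
    root = ET.fromstring(rdf_xml)
    about = "{%s}about" % NS["rdf"]
    result = {}
    for cls in root.findall("owl:Class", NS):
        iri = cls.get(about)
        if iri is None:  # skip anonymous class expressions
            continue
        result[iri] = {
            "label": cls.findtext("rdfs:label", default=None, namespaces=NS),
            "definition": cls.findtext("obo:IAO_0000115", default=None,
                                       namespaces=NS),
        }
    return result
```

A class without a text definition, such as the second class in the fragment, simply yields None for that field.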
Labels of the classes occurring in each ontology, as well
as in all the ontologies it imports, are stored in a trie
(prefix tree). The use of a trie ensures that class labels can
be searched efficiently, for example when providing term
completion recommendations.
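A minimal sketch of a label trie with prefix completion is given below. It is not the Aber-OWL implementation: the class and method names are hypothetical, and the case-insensitive matching and the result limit are assumptions made for the example.

```python
class _Node:
    """One trie node: child nodes by character, plus labels ending here."""
    __slots__ = ("children", "entries")

    def __init__(self):
        self.children = {}
        self.entries = []  # (label, iri) pairs whose label ends at this node


class LabelTrie:
    """Prefix tree over class labels (matching is case-insensitive here)."""

    def __init__(self):
        self._root = _Node()

    def insert(self, label, iri):
        node = self._root
        for ch in label.lower():
            node = node.children.setdefault(ch, _Node())
        node.entries.append((label, iri))

    def complete(self, prefix, limit=10):
        """Return up to limit (label, iri) pairs whose label starts with prefix."""
        node = self._root
        for ch in prefix.lower():
            node = node.children.get(ch)
            if node is None:
                return []
        results, stack = [], [node]
        while stack:  # collect every entry in the subtree below the prefix
            current = stack.pop()
            results.extend(current.entries)
            stack.extend(current.children.values())
        return sorted(results)[:limit]
```

Finding the node for a prefix takes time proportional to the length of the prefix, independent of how many labels are stored, which is what makes the structure suitable for term completion as a user types.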
Upon initiating the Aber-OWL web services, we
classify each ontology using the ELK reasoner [12], i.e., we
identify the most specific sub- and super-classes for each
class contained in the ontology using the axioms
contained within it. The (...truncated)