Aber-OWL: a framework for ontology-based data access in biology

Article PDF: http://www.biomedcentral.com/content/pdf/s12859-015-0456-9.pdf

Hoehndorf et al. BMC Bioinformatics

Robert Hoehndorf (0, 1), Luke Slater (0, 1, 2), Paul N Schofield (3), Georgios V Gkoutos (2)

0 Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, 23955-6900 Thuwal, Saudi Arabia
1 Computational Bioscience Research Center, King Abdullah University of Science and Technology, 4700 KAUST, 23955-6900 Thuwal, Saudi Arabia
2 Department of Computer Science, Aberystwyth University, Llandinam Building, SY23 3DB Aberystwyth, UK
3 Department of Physiology, Development & Neuroscience, University of Cambridge, Downing Street, CB2 3EG Cambridge, UK

Background: Many ontologies have been developed in biology and these ontologies increasingly contain large volumes of formalized knowledge commonly expressed in the Web Ontology Language (OWL). Computational access to the knowledge contained within these ontologies relies on the use of automated reasoning.

Results: We have developed the Aber-OWL infrastructure that provides reasoning services for bio-ontologies. Aber-OWL consists of an ontology repository, a set of web services and web interfaces that enable ontology-based semantic access to biological data and literature. Aber-OWL is freely available at http://aber-owl.net.

Conclusions: Aber-OWL provides a framework for automatically accessing information that is annotated with ontologies or contains terms used to label classes in ontologies. When using Aber-OWL, access to ontologies and data annotated with them is not merely based on class names or identifiers but rather on the knowledge the ontologies contain and the inferences that can be drawn from it.
Keywords: Ontology-based data access; Linked data; OWL

Background

A large number of ontologies have been developed for the annotation of biological and biomedical data, commonly expressed in the Web Ontology Language (OWL) [1] or an OWL-compatible language such as the OBO Flatfile Format [2]. Access to the full extent of knowledge contained in ontologies is facilitated by automated reasoners that can compute the ontologies' underlying taxonomy and answer queries over the ontology content. While ontology repositories, such as BioPortal [3] and the Ontology Lookup Service (OLS) [4], provide web services and interfaces to access ontologies, including their metadata such as author names and licensing, the list of classes and the asserted structure, they do not enable computational access to the semantic content of the ontologies and the inferences that can be drawn from them. Access to the semantic content of ontologies usually requires further inferences to reveal the consequences of statements (axioms) asserted in an ontology; these consequences may be automatically derived using an automated reasoner.

To the best of our knowledge, no reasoning infrastructure that supports semantically enabled access to biological and biomedical ontologies currently exists. Here, we present Aber-OWL, a reasoning infrastructure over ontologies consisting of an ontology repository, web services that facilitate semantic queries over ontologies specified by a user or contained in Aber-OWL's repository, and a user interface. Such an infrastructure can not only enable access to knowledge contained in ontologies, but crucially can also be used for semantic queries over data annotated with ontologies, including the large volumes of data that are increasingly becoming available through public SPARQL endpoints [5].
Allowing access to data through an ontology is known as the ontology-based data access paradigm [6,7], and can exploit formal information contained in ontologies to: identify possible inconsistencies and incoherent descriptions [8]; enrich possibly incomplete data with background knowledge so as to obtain more complete answers to a query (e.g., if a data item referring to an organism has been characterized with multiple findings that together constitute a disease, then the data item can be returned when querying for the disease even in the absence of it being explicitly declared in a database) [6,9]; enrich the data schema used to query data sources with additional information (e.g., by using a class in a query that is an inferred super-class of one or more classes that are used to annotate data items, but the class itself is never used to characterize data) [6]; and provide a uniform view over multiple data sources with possibly heterogeneous, multi-modal data [6,7].

To demonstrate how Aber-OWL can be used for ontology-based access to data, we provide a service that performs a semantic search over PubMed and PubMed Central articles using the results of an Aber-OWL query, and a service that performs SPARQL query extension so that the results of Aber-OWL queries can be used to retrieve data accessible through public SPARQL endpoints. In Aber-OWL, following the ontology-based data access paradigm [6,7], we specify the features of the relevant information at the ontology and knowledge level [10], and retrieve named classes in ontologies satisfying these conditions using an automated reasoner, i.e., a software program that can identify whether a class in an ontology satisfies certain conditions based on the axioms specified in an ontology. Subsequently, we embed the resulting information in database, Linked Data or literature queries.

Aber-OWL can be accessed at http://aber-owl.net. The Aber-OWL software is freely available at https://github.com/reality/SparqOWL and can be installed locally by users who want to provide semantic access to their own ontologies and support the use of their ontologies in semantic queries.

Methods

Aber-OWL

The Aber-OWL software can be configured with a list of URIs that contain ontology documents (i.e., OWL files) and employs the OWL API [11] to retrieve the ontologies that are to be included in the repository. For each ontology document included in the repository, the labels and definitions of all classes contained within the ontology (as well as of all the ontologies it imports) are identified based on OBO Foundry standards and recommendations: we use the rdfs:label annotation property to identify class labels, and we employ the definition (http://purl.obolibrary.org/obo/IAO_0000115) annotation property, defined in the Information Artifact Ontology, to identify the text definitions of a class. Labels of the classes occurring in each ontology, as well as of all the ontologies it imports, are stored in a trie (prefix tree). The use of a trie ensures that class labels can be searched efficiently, for example when providing term completion recommendations.

Upon initiating the Aber-OWL web services, we classify each ontology using the ELK reasoner [12], i.e., we identify the most specific sub- and super-classes for each class contained in the ontology using the axioms contained within it. The (...truncated)
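
The label index described in the Methods above, a trie over class labels used for term completion, can be illustrated with a minimal sketch. This is not the Aber-OWL implementation: the names TrieNode, LabelTrie, insert and complete, the lower-casing of labels, and the example.org IRIs are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    # Maps the next character of a label to the child node.
    children: dict = field(default_factory=dict)
    # IRIs of classes whose (lower-cased) label ends exactly at this node.
    iris: set = field(default_factory=set)


class LabelTrie:
    """Prefix tree over class labels (hypothetical helper, for illustration)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, label, iri):
        # Assumption: labels are matched case-insensitively.
        node = self.root
        for ch in label.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.iris.add(iri)

    def complete(self, prefix, limit=10):
        """Return up to `limit` (label, iri) pairs whose label starts with prefix."""
        prefix = prefix.lower()
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # no label starts with this prefix
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            for iri in sorted(current.iris):
                results.append((text, iri))
            # Push children in reverse order so they are popped alphabetically.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results[:limit]


if __name__ == "__main__":
    trie = LabelTrie()
    # Hypothetical labels and IRIs, not taken from any real ontology.
    trie.insert("Heart", "http://example.org/onto#Heart")
    trie.insert("Heart valve", "http://example.org/onto#HeartValve")
    trie.insert("Hepatocyte", "http://example.org/onto#Hepatocyte")
    for label, iri in trie.complete("hea"):
        print(label, iri)
```

Finding the node for a prefix takes time proportional to the prefix length, regardless of how many labels are stored, which is why a structure of this kind suits interactive term completion.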