Semantic Web meets Integrative Biology: a survey
Briefings in Bioinformatics. Vol 14. No 1. 109-125
Advance Access published on 6 April 2012
doi:10.1093/bib/bbs014
Huajun Chen, Tong Yu and Jake Y. Chen
Submitted: 27th December 2011; Received (in revised form): 18th February 2012
Abstract
Keywords: semantic web; integrative biology; web ontology; web of data
INTRODUCTION
Integrative Biology (IB) lies at the intersection of a
multitude of scientific and technological disciplines,
and focuses on bridging the gap between different
disciplines and the wedding of technological advances
to biological insight. In the 1980s it was recognized that
biology bounded by traditional disciplines no longer
reflected the best way to do science, which created new
Corresponding author. Huajun Chen, College of Computer Science, Zhejiang University, Hangzhou, 310027, P.R. China.
Tel: 86-571-87953703; Fax: 86-571-87953079; E-mail: ; Tong Yu, College of Computer Science,
Zhejiang University, Hangzhou, 310027, P.R. China. Tel: 86-571-87953703; Fax: 86-571-87953079; E-mail: ; Jake Y. Chen, Walker Plaza Building (WK), Suite #190, 719 N. Indiana Ave Indianapolis, IN 46202, USA (317) 2787604. E-mail:
Huajun Chen is an associate professor in the College of Computer Science, Zhejiang University. His major research interests include the
Semantic Web, Ontologies, Biomedical Informatics and Traditional Chinese Medicine Informatics. He is particularly active in
research on the applications of Semantic Web technologies in Life Sciences and Health Care. He was the chair or co-chair of the
WWW2007/WWW2008 workshops on Semantic Web for Health Care and Life Science. He was a guest editor for several relevant
special issues including the BMC Bioinformatics special issue on ‘Semantic e-Science for Biomedicine’ (2007), the Journal of Biomedical Informatics
special issue on ‘Semantic BioMed Mashup’ (2008) and the Current Bioinformatics special issue on ‘Semantic Web meets Current Bioinformatics’
(2012). He was an invited expert of W3C’s HCLS IG. He is an executive member of the council of the Information
Committee of the World Federation of Chinese Medicine Societies.
Tong Yu is a PhD candidate at Zhejiang University. His major interests include the Semantic Web, bioinformatics and integrative
biomedicine.
Jake Y. Chen is an associate professor of Informatics and Computer Science at the Indiana University School of Informatics and the Purdue
University Department of Computer & Information Science. He is the founding director of the Indiana Center for Systems Biology and
Personalized Medicine, and an advisory committee member of the IU School of Medicine Translational Genomics Core and the IU Center for
Environmental Health. He is the chair of the Engineering in Medicine & Biology Society, IEEE Central Indiana Section (since 2005),
a steering committee member and co-founder of the Indiana Biomedical Entrepreneur Network (since 2004), and systems biology chair and proteomics
chair of the Life Sciences Society (since 2005). He also serves as a board member and vice president of the Association of Chinese Bioinformaticians (since 2001). His primary research areas are Translational Bioinformatics, Computational Systems Biology, Scientific Data
Management and Data Mining, and Semantic Web and Ontologies.
© The Author 2012. Published by Oxford University Press. For Permissions, please email:
Integrative Biology (IB) uses experimental or computational quantitative technologies to characterize biological
systems at the molecular, cellular, tissue and population levels. IB typically involves the integration of data, knowledge and capabilities across disciplinary boundaries in order to solve complex problems. We identify a series of
bioinformatics problems posed by interdisciplinary integration: (i) data integration that interconnects structured
data across related biomedical domains; (ii) ontology integration that brings jargon, terminologies and taxonomies
from various disciplines into a unified network of ontologies; (iii) knowledge integration that integrates disparate knowledge elements from multiple sources; (iv) service integration that builds applications out of services provided by different vendors. We argue that IB can benefit significantly from the integration solutions enabled by Semantic Web (SW)
technologies. The SW enables scientists to share content beyond the boundaries of applications and websites, resulting
in a web of data that is meaningful and understandable to any computer. In this review, we provide insight into
how SW technologies can be used to build open, standardized and interoperable solutions for interdisciplinary integration on a global basis. We present a rich set of case studies in systems biology, integrative neuroscience,
bio-pharmaceutics and translational medicine to highlight the technical features and benefits of SW applications in IB.
solution, which is crucial to support the interdisciplinary integration.
The first truly global integration solution is the
World Wide Web. In 1990, Tim Berners-Lee
invented the Web to support cross-boundary
information sharing and collaborative research
at CERN [9]. Since its inception, the World Wide
Web has changed the ways scientists communicate,
collaborate and educate [10]. The Web enables the
development and maintenance of cyber infrastructure
for e-Science, which facilitates data sharing and interdisciplinary collaborations on a global basis [11].
However, the current Web still lacks a widely accepted, standard way to publish and share structured data, which makes global
data integration difficult to achieve [12].
In order to fill the data gap on the Web, Tim
Berners-Lee et al. envisioned the Semantic Web
(SW) as a web of data that is meaningful and understandable to any computer [13, 14]. As they
predicted, the Web of data will enable Web users to
share structured data as easily as they share documents,
photos and videos today. As shown in Figure 1, the
Web of data can be conceptualized as a global graph
of things, or the graph layer on top of the Web [15].
Intelligent agents can operate directly on the Web of
data in order to solve complex problems and accomplish intelligent tasks. This new layer leads to the
emergence of Web 3.0 applications, which use the
Web of data to augment the underlying Web
system’s functionalities such as information retrieval
and knowledge sharing [16].
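To make the 'global graph of things' idea concrete, the following minimal sketch models a web of data as a set of (subject, predicate, object) statements in plain Python. It is an illustration only, not an implementation of any SW standard or library: the `http://example.org/` identifiers, the `encodes` and `targets` predicates, and the helper names `match` and `drugs_for_gene` are all assumptions introduced for this example. It shows that when two independently published datasets use the same identifier for the same thing, integrating them reduces to taking the union of their statements.

```python
from typing import List, Optional, Set, Tuple

# A statement in the graph: (subject, predicate, object), all identified by URIs.
Triple = Tuple[str, str, str]

# Hypothetical namespace; every identifier below is made up for illustration.
EX = "http://example.org/"

# Two datasets published independently, e.g. by different research groups.
gene_data: Set[Triple] = {
    (EX + "gene/G1", EX + "encodes", EX + "protein/P1"),
}
drug_data: Set[Triple] = {
    (EX + "drug/D1", EX + "targets", EX + "protein/P1"),
}


def match(graph: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def drugs_for_gene(graph: Set[Triple], gene: str) -> List[str]:
    """Follow gene -> protein -> drug links across the merged graph."""
    drugs = []
    for _, _, protein in match(graph, s=gene, p=EX + "encodes"):
        for drug, _, _ in match(graph, p=EX + "targets", o=protein):
            drugs.append(drug)
    return sorted(drugs)


# Both sources share the URI for protein P1, so merging is a plain set union.
web_of_data = gene_data | drug_data

print(drugs_for_gene(web_of_data, EX + "gene/G1"))
# → ['http://example.org/drug/D1']
```

Neither dataset alone can answer the question of which drugs are related to gene G1; the answer only emerges from the merged graph. A production system would use a standard data model and query language rather than hand-written pattern matching, but the principle of linking through shared identifiers is the same.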
Technically speaking, the SW is closely associated
with the notion of ‘ontology’, which refers to a
computational model that can be used to explicitly
represent the meaning of terms and the relationships
between those terms [17–19]. The SW can support the
collaborative engineering of domain ontologies that are
shared by a community, and the use of ontologies to
describe Web resources including knowledge, data and
services. This approach not only enables digital
resources to be shared and interconnected beyond
the boundaries of applications and websites, but also
supports the impl (...truncated)