Biological network motif detection and evaluation
Kim et al. BMC Systems Biology 2011, 5(Suppl 3):S5
http://www.biomedcentral.com/1752-0509/5/S3/S5
RESEARCH
Open Access
Wooyoung Kim1*, Min Li1,2*, Jianxin Wang2, Yi Pan1*
From BIOCOMP 2010 - The 2010 International Conference on Bioinformatics and Computational Biology
Las Vegas, NV, USA. 12-15 July 2010
Abstract
Background: Molecular-level biological data can be assembled into system-level data in the form of biological
networks. Network motifs are defined as over-represented small connected subgraphs in networks, and they have
been used in many biological applications. Since network motif discovery is computationally challenging,
previous algorithms have focused on computational efficiency. However, we believe that the biological
quality of network motifs is also very important.
Results: In this paper we define biological network motifs as biologically significant subgraphs, and we refer to
traditional network motifs as structural network motifs. We develop five algorithms, namely EDGEGO-BNM,
EDGEBETWEENNESS-BNM, NMF-BNM, NMFGO-BNM and VOLTAGE-BNM, for efficient detection of biological network
motifs, and introduce several evaluation measures, including motifs included in complex, motifs included in functional
module, and GO term clustering score. Experimental results show that EDGEGO-BNM and
EDGEBETWEENNESS-BNM perform better than existing algorithms, and that all of our algorithms can also be applied to find
structural network motifs.
Conclusion: We provide new approaches to finding network motifs in biological networks. Our algorithms
efficiently detect biological network motifs and further improve existing algorithms so that they find high-quality structural
network motifs, which would be impossible using the existing algorithms alone. The performance of the algorithms is
compared using our new evaluation measures in biological contexts. We believe that our work provides some
guidelines for network motif research in biological networks.
Background
Systems biology focuses on the study of complex interactions in biological systems, rather than the study of individual molecules such as DNA, RNA, proteins and
metabolites [1]. One of the goals of systems biology is
to understand the structures of all molecules and their
interactions at the system level. Major challenges therefore
include understanding the dynamic structures of small molecules and determining their functions in a living cell.
Various types of biological interactions have been
expressed as networks, including transcriptional regulatory networks, signaling pathways, metabolic networks
and protein-protein interaction (PPI) networks. Biological
networks share some structural properties of other
complex networks and have specific features such as the scale-free
and small-world effects [2]. However, these properties have
been questioned by Lacroix et al. [3] for a number of
reasons, including the incompleteness of the networks and
inconsistent link generation for the graphs. Therefore,
the analysis has been extended to other network properties such as
network clusters and network motifs.
1 Department of Computer Science, Georgia State University, Atlanta, USA
As biological networks are massive and still
growing, dividing a network into a number of clusters helps reveal specific local properties. A network
motif, another concept describing local properties of
a network, is defined as a small connected subgraph
that appears frequently and uniquely in a network. Like
a protein sequence motif, a network motif is defined as
an over-represented pattern, but detecting it requires much more
computation because the process involves isomorphism testing
and repeated processes to determine uniqueness.
Network alignment [4] and network querying [5] are
analogous to network motif discovery, but while network motifs
are defined using only structural information, network
alignment and network querying require both
topological and biological information. Previous network
motif discovery algorithms include exact counting and
approximation algorithms. Exhaustive recursive search
(ERS) [6], enumerate subgraphs (ESU) [7] and compact
topological motifs [8] are exact counting algorithms. For
efficient detection, several approximation algorithms
have been proposed, including edge sampling (MFINDER) [6], a randomized version of ESU based on a search tree
(RAND-ESU) [9], and a tree-filtering search
(NEMOFINDER) [10]. Furthermore, parallel search algorithms have been developed to make exact
counting feasible [11,12].
© 2011 Kim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
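To make the idea of exact counting concrete, the following is a minimal sketch of ESU-style enumeration as it is commonly described, not the implementation from [7]. It assumes an undirected graph without self-loops, stored as a dictionary of neighbor sets with integer (comparable) node labels; the names `esu` and `adj` are hypothetical. Each connected k-node vertex set is produced exactly once, after which the subgraphs would still need to be grouped by isomorphism class.

```python
def esu(adj, k):
    """Yield every connected k-node vertex set of an undirected graph once.

    adj: dict mapping each node to the set of its neighbors (assumed format).
    """
    def extend(sub, ext, root):
        if len(sub) == k:
            yield frozenset(sub)
            return
        ext = set(ext)  # work on a copy so the caller's set is untouched
        while ext:
            w = ext.pop()
            # Nodes already in the subgraph or adjacent to it.
            closed = set(sub)
            for u in sub:
                closed |= adj[u]
            # Exclusive neighbors of w with a label larger than the root.
            new = {u for u in adj[w] if u > root and u not in closed}
            yield from extend(sub | {w}, ext | new, root)

    for v in adj:
        yield from extend({v}, {u for u in adj[v] if u > v}, v)


# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(sorted(sorted(s) for s in esu(toy, 3)))
# [[0, 1, 2], [0, 2, 3], [1, 2, 3]]
```

The restriction to labels larger than the root, together with the exclusive-neighborhood rule, is what prevents the same vertex set from being reported through several different growth orders.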
Network motifs are used in many applications in biological networks. Feed-forward loop (FFL) and bifan network motifs have been identified as typical patterns in
different types of biological networks [13,14]. Przulj et al.
[15] used network motifs, in the form of a relative graphlet frequency
distance, to distinguish different protein-protein interaction networks. Motif frequencies have also been exploited as
classifiers for network model selection [16]. Milo et al.
[17] showed that networks from different biological and
technological domains can be classified into different
superfamilies on the basis of their motif significance profiles.
Albert I. and Albert R. [18] successfully used network motifs
to predict protein-protein interactions. In the
study by Conant and Wagner [19], network motifs in
transcriptional regulatory networks were found not to be evolutionarily
conserved, while network motifs in PPI networks were found to be evolutionarily related. In addition, network motifs have been
extended to ‘motif modes’, each of which has a certain
topology and a specific functional property [20].
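As an illustration of what "over-represented" can mean for a pattern such as the feed-forward loop, the sketch below counts FFLs in a small directed network and compares the count with degree-preserving randomizations using a z-score. This is one common convention, assumed here for illustration; the text above does not prescribe a particular statistic. The function names, the toy edge list, the number of swaps and the number of random networks are all hypothetical choices.

```python
import random
from statistics import mean, pstdev


def count_ffl(edges):
    """Count feed-forward loops: a->b, b->c and a->c with a, b, c distinct."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
    total = 0
    for a, outs in succ.items():
        for b in outs:
            if b == a:
                continue
            for c in succ.get(b, ()):
                if c != a and c != b and c in outs:
                    total += 1
    return total


def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization: swap targets of random edge pairs."""
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        # Reject swaps that would create a self-loop or a duplicate edge.
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def ffl_zscore(edges, n_random=100, seed=0):
    """Return (observed FFL count, z-score against rewired networks)."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    counts = [count_ffl(rewire(edges, 10 * len(edges), rng))
              for _ in range(n_random)]
    sd = pstdev(counts)
    z = (observed - mean(counts)) / sd if sd > 0 else float("nan")
    return observed, z


# Hypothetical toy network (no duplicate edges, no self-loops).
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4), (4, 5), (5, 0)]
print(ffl_zscore(toy_edges))
```

On a network this small the z-score is not meaningful; the point is only to show the three steps of counting, randomizing while keeping in- and out-degrees fixed, and comparing.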
Across these network motif applications,
however, we notice several problems regarding the biological meaning of network motifs, in addition to the computational challenge of detection. First, the
biological quality of network motifs has not been validated
thoroughly. A network motif is selected only by its
structural uniqueness, and only a small number of
instances of the type are illustrated with biological examples. Second, only a small portion of network motif instances are
used in applications, and the others are ignored. Third,
non-motifs, that is, structurally insignificant subgraphs,
have not been analyzed in any study, because they are filtered
out before any application. Fourth, it is still
unclear what network motifs really represent in
biological networks.
As we believe that the biological quality of network
motifs is also significant, we (...truncated)