Fast Parallel All-Subgraph Enumeration Using Multicore Machines
Hindawi Publishing Corporation
Scientific Programming
Volume 2015, Article ID 901321, 11 pages
http://dx.doi.org/10.1155/2015/901321
Research Article
Saeed Shahrivari and Saeed Jalili
Computer Engineering Department, Tarbiat Modares University (TMU), Tehran 14115-111, Iran
Correspondence should be addressed to Saeed Jalili;
Received 28 January 2014; Revised 21 November 2014; Accepted 21 November 2014
Academic Editor: Przemyslaw Kazienko
Copyright © 2015 S. Shahrivari and S. Jalili. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
Enumerating all subgraphs of an input graph is an important task for analyzing complex networks. Valuable information can be
extracted about the characteristics of the input graph using all-subgraph enumeration. However, the number of subgraphs
grows exponentially as the input graph grows or as the size of the subgraphs to be enumerated increases. Hence, all-subgraph
enumeration is very time consuming when the subgraphs or the input graph are large. We propose a parallel solution named
Subenum, which performs much faster than the available solutions. Subenum enumerates subgraphs using edges instead of
vertices, and this approach leads to a parallel and load-balanced enumeration algorithm that executes efficiently on current
multicore and multiprocessor machines. Subenum also uses a fast heuristic that effectively accelerates nonisomorphic
subgraph enumeration. Subenum can efficiently use external memory and, unlike other subgraph enumeration methods, it is not
bound by the main memory limits of the machine used. Hence, Subenum can handle large input graphs and subgraph sizes that
other solutions cannot handle. Several experiments are done using real-world input graphs. Compared to the available solutions,
Subenum can enumerate subgraphs several orders of magnitude faster, and the experimental results show that the performance of
Subenum scales almost linearly with additional processor cores.
1. Introduction
Enumerating subgraphs of a given size has been shown to be a
very useful task in the area of complex network analysis. Subgraphs can be used to identify building blocks and functional
and nonfunctional characteristics in social, biological, chemical, and technological graphs [1]. An interesting application
is subgraph mining which can be used to extract functional
properties. A good example is finding network motifs, which
are defined as connected subgraphs that occur significantly
more frequently than expected [2]. One of the best known
approaches for finding network motifs is to enumerate all
subgraphs and then extract significant motifs after omitting
frequent subgraphs that occur in random networks [3].
There are also many other applications in areas like data
mining, statistics, systems biology, chemoinformatics, social
networks, telecommunications, and web mining.
Although subgraph enumeration is a useful task, it is a
computationally challenging problem [4]. Enumeration can be
classified into two distinct problems: enumerating all labeled
subgraphs and enumerating nonisomorphic subgraphs, where
subgraphs that have identical structure but different
vertex labels are treated as a single class. In the first problem, all of the subgraphs of
a given size should be enumerated. In the second problem,
which is much more important, all
of the nonisomorphic subgraphs of a given size must be
enumerated. Both problems are very time consuming because
the number of both labeled and nonisomorphic subgraphs
increases exponentially with a bigger subgraph size or
a larger input graph.
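The difference between the two problems can be illustrated with a minimal brute-force sketch. It assumes an undirected graph given as an edge list, treats a subgraph as a connected vertex-induced subgraph, and uses exhaustive relabeling as the canonical form; the function names are hypothetical, and the approach is only practical for tiny inputs. It is not the method used by Subenum.

```python
from itertools import combinations, permutations


def connected(vertices, adj):
    """Return True if the subgraph induced by `vertices` is connected."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u] & vertices:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices


def canonical_form(vertices, adj):
    """Smallest sorted edge list over all relabelings (brute force, O(k!))."""
    vs = list(vertices)
    idx = range(len(vs))
    best = None
    for perm in permutations(idx):
        key = tuple(sorted(
            tuple(sorted((perm[i], perm[j])))
            for i, j in combinations(idx, 2)
            if vs[j] in adj[vs[i]]
        ))
        if best is None or key < best:
            best = key
    return best


def enumerate_subgraphs(edges, k):
    """Return (labeled subgraphs, isomorphism classes) of size k."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Problem 1: every connected k-vertex subset is one labeled subgraph.
    labeled = [c for c in combinations(sorted(adj), k) if connected(c, adj)]
    # Problem 2: group labeled subgraphs that share a canonical form.
    classes = {}
    for c in labeled:
        classes.setdefault(canonical_form(c, adj), []).append(c)
    return labeled, classes


if __name__ == "__main__":
    # A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
    labeled, classes = enumerate_subgraphs([(0, 1), (1, 2), (0, 2), (2, 3)], 3)
    print(len(labeled), "labeled subgraphs,", len(classes), "nonisomorphic classes")
```

On this toy graph the labeled subgraphs are the triangle and two paths, and the two paths collapse into one class once vertex labels are ignored.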
As the size of the input graph increases, the number
of subgraphs of size 𝑘 increases exponentially (in the worst
case 𝐶(𝑛, 𝑘) for a complete graph) [5]. The number of
nonisomorphic subgraphs, which can be calculated using the
Pólya enumeration theorem [6], also increases exponentially
as 𝑘 increases. Therefore, increasing the subgraph size or
the input graph’s size makes subgraph enumeration take more
time. When nonisomorphic subgraphs are enumerated, the
problem becomes more complicated because an additional
mechanism must be used to identify isomorphic subgraphs.
There is no known polynomial-time algorithm for the subgraph
isomorphism problem yet, and this further complicates the
subgraph enumeration problem [7].
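Both growth rates can be checked numerically with a short sketch. The first function evaluates the worst-case bound 𝐶(𝑛, 𝑘) for a complete graph. The second counts unlabeled graphs on 𝑘 vertices with Burnside's lemma, the counting argument underlying the Pólya enumeration theorem. As assumptions for illustration only, the count covers undirected simple graphs and includes disconnected ones, so it is an upper bound on the connected classes relevant here; the function names are hypothetical.

```python
from itertools import combinations, permutations
from math import comb, factorial


def labeled_upper_bound(n, k):
    """Worst-case number of k-vertex subgraphs: C(n, k) for a complete graph."""
    return comb(n, k)


def count_nonisomorphic_graphs(k):
    """Count undirected simple graphs on k vertices up to isomorphism.

    Burnside's lemma: average, over all vertex permutations, of 2**c,
    where c is the number of cycles the permutation induces on vertex pairs.
    Brute force over k! permutations, so only usable for small k.
    """
    pairs = list(combinations(range(k), 2))
    total = 0
    for perm in permutations(range(k)):
        seen = set()
        cycles = 0
        for p in pairs:
            if p in seen:
                continue
            cycles += 1
            q = p
            # Follow the orbit of this pair under the permutation.
            while q not in seen:
                seen.add(q)
                q = tuple(sorted((perm[q[0]], perm[q[1]])))
        total += 2 ** cycles
    return total // factorial(k)


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, [labeled_upper_bound(n, k) for k in (3, 4, 5)])
    print([count_nonisomorphic_graphs(k) for k in range(1, 7)])
```

Even at small sizes the labeled count dominates: a complete graph on 1000 vertices already has more than 8 × 10^12 vertex subsets of size 5.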
Because of its complex nature, subgraph enumeration is a very
challenging and time-consuming problem. Available sequential algorithms tend to take a lot of time
to do the job [3]. Hence, a good solution is to use parallel
and distributed systems to accelerate subgraph enumeration
[8]. Several works targeting parallel subgraph
enumeration have been proposed recently [8]. However,
most of the related works are based on message passing
interface (MPI) and hence are designed to work on cluster
computing systems [8, 9]. In contrast, our goal is to provide
a fast and easy-to-use tool for subgraph enumeration on
commodity multicore and multiprocessor machines, which, to
the best of our knowledge, has not yet been done. For
this reason, we present a parallel solution, named Subenum,
which is designed for faster and more scalable subgraph
enumeration on multicore and multiprocessor machines.
Subenum provides fast and efficient methods for counting
and dumping either all subgraphs or only the nonisomorphic ones.
Subenum’s strengths compared to other similar works
fall into three categories. First, we present
a new edge-based parallel subgraph enumeration algorithm
named PSE, which is an improved version of the well-known
sequential ESU algorithm. PSE provides a parallel and load-balanced approach for subgraph enumeration. The second
strength is a custom polynomial-time heuristic for
detecting isomorphic subgraphs. The last strength is
a combination of external sorting and the nauty canonical
labeling algorithm, which enables Subenum to enumerate
nonisomorphic subgraphs even when the number of subgraphs is so large that they cannot be stored in main
memory.
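For background, the sequential ESU algorithm that PSE builds on can be sketched as follows. This is a minimal sketch of the vertex-based ESU scheme as it is commonly described in the literature, assuming an undirected graph stored as a dictionary of neighbor sets with comparable vertex labels; it is not the edge-based PSE algorithm, and the function names are hypothetical.

```python
def esu(adj, k):
    """Enumerate every connected k-vertex subgraph exactly once (ESU scheme).

    adj: dict mapping each vertex to the set of its neighbors.
    Returns a list of frozensets of vertices.
    """
    results = []

    def extend(sub, ext, root, nbhd):
        # sub:  vertices chosen so far
        # ext:  candidates that may still be added to sub
        # nbhd: sub together with all neighbors of sub
        if len(sub) == k:
            results.append(frozenset(sub))
            return
        ext = set(ext)  # local copy so the caller's set is not mutated
        while ext:
            w = ext.pop()
            # Add only the "exclusive" neighbors of w: larger than the root
            # and neither in sub nor adjacent to any vertex already in sub.
            new_ext = ext | {u for u in adj[w] if u > root and u not in nbhd}
            extend(sub | {w}, new_ext, root, nbhd | adj[w])

    for v in adj:
        extend({v}, {u for u in adj[v] if u > v}, v, {v} | adj[v])
    return results


if __name__ == "__main__":
    # A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    for s in esu(graph, 3):
        print(sorted(s))
```

Each top-level call rooted at a vertex is independent of the others, which is one natural unit of parallel work, although the subtrees can differ greatly in size.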
To evaluate the performance of Subenum, we have
performed several experiments on real-world graphs from
different areas like social networks, biological networks,
software engineering, and electrical circuits. During the
experiments, we compared Subenum’s performance to state-of-the-art algorithms and implementations. Experimental
results show that Subenum provides a parallel, load-balanced,
and effective solution for the all-subgraph enumeration problem.
Compared to the fastest available tools for nonisomorphic
subgraph enumeration, Subenum enumerates subgraphs several times faster and is able to reduce execution time from
days to hours. In addition, Sube (...truncated)