Comparing Graphs via Persistence Distortion∗
Tamal K. Dey, Dayu Shi, and Yusu Wang
Computer Science and Engineering Department, The Ohio State University, USA
tamaldey,shiday,
Abstract
Metric graphs are ubiquitous in science and engineering. For example, many data are drawn
from hidden spaces that are graph-like, such as the cosmic web. A metric graph offers one of the
simplest yet still meaningful ways to represent the non-linear structure hidden behind the data.
In this paper, we propose a new distance between two finite metric graphs, called the persistence-distortion distance, which draws upon a topological idea. This topological perspective, along with
the metric space viewpoint, provides a new angle on the graph matching problem. Our persistence-distortion distance has two properties not shared by previous methods: First, it is stable against
the perturbations of the input graph metrics. Second, it is a continuous distance measure, in
the sense that it is defined on an alignment of the underlying spaces of input graphs, instead of
merely their nodes. This makes our persistence-distortion distance robust against, for example,
different discretizations of the same underlying graph.
Despite considering the input graphs as continuous spaces, that is, taking all points into
account, we show that we can compute the persistence-distortion distance in polynomial time.
The time complexity for the discrete case, where only graph nodes are considered, is much lower.
1998 ACM Subject Classification F.2.2 Geometric problems and computations, G.2.2 Graph
algorithms
Keywords and phrases Graph matching, metric graphs, persistence distortion, topological method
Digital Object Identifier 10.4230/LIPIcs.SOCG.2015.491
1 Introduction
Many data in science and engineering are drawn from hidden spaces that are graph-like,
such as the cosmic web [28] and road networks [1, 5]. Furthermore, as modern data become
increasingly complex, understanding them with a simple yet still meaningful structure
becomes important. Metric graphs equipped with a metric derived from the data can provide
such a simple structure [18, 27]. They are graphs where each edge is associated with a
length inducing the metric of shortest path distance. The comparison of the representative
metric graphs can benefit classification of data, a fundamental task in processing them. This
motivates the study of metric graphs in the context of matching or comparison.
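As an illustration of the shortest path metric induced by edge lengths, the following minimal sketch computes distances between graph nodes and between arbitrary points lying on edges. The function names and the (edge_index, offset) encoding of a point are assumptions made for this example only, and Floyd-Warshall is used here merely as a convenient all-pairs method.

```python
from itertools import product


def node_distances(num_nodes, edges):
    """All-pairs shortest path lengths between graph nodes (Floyd-Warshall).

    edges: list of (u, v, length) triples for an undirected graph, length > 0.
    """
    inf = float("inf")
    dist = [[0.0 if i == j else inf for j in range(num_nodes)]
            for i in range(num_nodes)]
    for u, v, length in edges:
        if length <= 0:
            raise ValueError("edge lengths must be positive")
        dist[u][v] = min(dist[u][v], float(length))
        dist[v][u] = min(dist[v][u], float(length))
    # product varies its last index fastest, so k is the outermost loop.
    for k, i, j in product(range(num_nodes), repeat=3):
        if dist[i][k] + dist[k][j] < dist[i][j]:
            dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def point_distance(edges, dist, p, q):
    """Shortest path distance between two points of the underlying space.

    A point is a hypothetical (edge_index, offset) pair: it lies on
    edges[edge_index] at distance `offset` from that edge's first endpoint.
    """
    (ep, s), (eq, t) = p, q
    u, v, len_p = edges[ep]
    a, b, len_q = edges[eq]
    # Leave p's edge through either endpoint and enter q's edge
    # through either endpoint.
    candidates = [
        s + dist[u][a] + t,
        s + dist[u][b] + (len_q - t),
        (len_p - s) + dist[v][a] + t,
        (len_p - s) + dist[v][b] + (len_q - t),
    ]
    if ep == eq:
        # Points on the same edge can also be joined along that edge.
        candidates.append(abs(s - t))
    return min(candidates)


# Triangle whose edge (0, 2) is longer than the detour through node 1.
triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 3.0)]
D = node_distances(3, triangle)
print(D[0][2])
print(point_distance(triangle, D, (2, 1.5), (0, 1.0)))
```

The second function reflects the continuous viewpoint: a point in the interior of an edge has well-defined distances to every other point, not only to the graph nodes.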
To compare two objects, one needs a notion of distance in the space where the objects
are coming from. Various distance measures for graphs and their metric versions have
been proposed in the literature with associated matching algorithms. We approach this
problem with two new perspectives: (i) We aim to develop a distance measure which is
both meaningful and stable against metric perturbations, and at the same time amenable
to polynomial time computations. (ii) Unlike most previous distance measures, which are
discrete in the sense that only alignments of graph nodes are considered, we aim for a distance
measure that is continuous, that is, alignments of all points in the underlying spaces of the
metric graphs are considered.
∗ This work is partially supported by NSF under grants CCF-0747082, CCF-1064416, CCF-1319406,
CCF-1318595. See [11] for the full version of this paper.
© Tamal K. Dey, Dayu Shi, and Yusu Wang;
licensed under Creative Commons License CC-BY
31st International Symposium on Computational Geometry (SoCG’15).
Editors: Lars Arge and János Pach; pp. 491–506
Leibniz International Proceedings in Informatics
Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany
Related work. To date, the large number of proposed graph matching algorithms fall into
two broad categories: exact graph matching methods and inexact graph matching (distances
between graphs) methods.
Exact graph matching, also called the graph isomorphism problem, checks whether
there is a bijection between the node sets of two input graphs that also induces a bijection
between their edge sets. While polynomial time algorithms exist for many special cases, e.g.,
[2, 21, 25], for general graphs it is not known whether the graph isomorphism problem is
NP-complete [17]. Nevertheless, given the importance of this problem, various
exact graph matching algorithms have been developed in practice. Usually, these methods employ some
pruning techniques aiming to reduce the search space for identifying graph isomorphisms.
See [15] for comparisons of various graph isomorphism testing methods.
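To make the definition of graph isomorphism concrete, here is a minimal brute-force sketch that tries every bijection between the node sets and checks whether it maps one edge set onto the other. It assumes simple undirected graphs with nodes numbered from 0, uses no pruning, and is only meant for very small inputs; the function name is hypothetical and this is not one of the practical methods surveyed in [15].

```python
from itertools import permutations


def are_isomorphic(n1, edges1, n2, edges2):
    """Exhaustively test two simple undirected graphs for isomorphism.

    n1, n2: numbers of nodes; edges1, edges2: lists of (u, v) pairs.
    """
    # Store each undirected edge with its endpoints in sorted order.
    e1 = {tuple(sorted(e)) for e in edges1}
    e2 = {tuple(sorted(e)) for e in edges2}
    if n1 != n2 or len(e1) != len(e2):
        return False
    for perm in permutations(range(n1)):
        # perm[u] is the node of the second graph assigned to node u.
        mapped = {tuple(sorted((perm[u], perm[v]))) for u, v in e1}
        if mapped == e2:
            return True
    return False


path = [(0, 1), (1, 2), (2, 3)]
relabeled_path = [(2, 0), (0, 3), (3, 1)]
star = [(0, 1), (0, 2), (0, 3)]
print(are_isomorphic(4, path, 4, relabeled_path))
print(are_isomorphic(4, path, 4, star))
```

The search visits up to n! bijections, which is why practical methods rely on pruning the search space.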
In real world applications, input graphs often suffer from noise and deformation, and
it is highly desirable to obtain a distance between two input graphs beyond the binary
decision of whether they are the same (isomorphic) or not. This is referred to as inexact
graph matching in the field of pattern recognition, and various distance measures have been
proposed. One line of work is based on graph edit distance, which is NP-hard to compute [32].
Many heuristic methods, using for example A∗ algorithms, have been proposed to address
the issue of high computational complexity; see the survey [16] and references within. One of
the main challenges in comparing two graphs is to determine how “good” a given alignment
of graph nodes is in terms of the quality of the pairwise relations between those nodes. Hence
matching two graphs naturally leads to an integer quadratic programming problem (IQP),
which is an NP-hard problem. Several heuristic methods have been proposed to approach this
optimization problem, such as the annealing approach of [19], the iterative methods of [24, 30],
and the probabilistic approach of [31]. Finally, there have been several methods that formulate
the optimization problem based on spectral properties of graphs. For example, in [29], the
author uses the eigendecomposition of adjacency matrices of the input graphs to derive
an expression of an orthogonal matrix which optimizes the objective function. In [9, 23],
the principal eigenvector of a “compatibility” matrix of the input graphs is used to obtain
correspondences between input graph nodes. Recently in [22], Hu et al. proposed the general
and descriptive Laplacian family signatures to build the compatibility matrix and model the
graph matching problem as an integer quadratic program.
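The following sketch illustrates why scoring a node alignment by pairwise relations leads to a quadratic objective: the cost of an alignment is a sum over pairs of matched nodes, so it depends on two assignment decisions at a time. The particular cost (absolute difference of node distances), the restriction to graphs of equal size, and the function names are assumptions chosen for illustration; the cited methods use their own objectives and heuristics rather than exhaustive search.

```python
from itertools import permutations


def alignment_cost(d1, d2, mapping):
    """Total pairwise distortion of a node alignment.

    d1, d2: square distance matrices of the two graphs (same size here).
    mapping[i] is the node of the second graph matched to node i of the first.
    """
    n = len(d1)
    return sum(abs(d1[i][j] - d2[mapping[i]][mapping[j]])
               for i in range(n) for j in range(i + 1, n))


def best_alignment(d1, d2):
    """Exhaustive search over all bijections; only viable for tiny graphs."""
    n = len(d1)
    if len(d2) != n:
        raise ValueError("this sketch only handles graphs of equal size")
    return min(permutations(range(n)),
               key=lambda m: alignment_cost(d1, d2, m))


# Unit-length path 0-1-2, and the same path relabeled so that node 0 is the center.
d_path = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
d_relabeled = [[0, 1, 1], [1, 0, 2], [1, 2, 0]]
best = best_alignment(d_path, d_relabeled)
print(best, alignment_cost(d_path, d_relabeled, best))
```

Written with 0/1 indicator variables for each candidate node pair, each term of this cost multiplies two indicators, which is the quadratic structure referred to above.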
New work. Different from previous approaches, we view input graphs as continuous metric
spaces. Intuitively, we assume that our input is a finite graph G = (V, E) where each edge
is assigned a positive length value. We now consider G as a metric space (|G|, dG) on the
underlying space |G| of G, with metric dG being the shortest path metric in |G|. Given two
metric graphs G1 and G2, a natural way to measure their distance is to use the so-called
Gromov-Hausdorff distance [20, 26] to measure the metric distortion between these two
metric spaces. Unfortunately, it is NP-hard to even approximate th (...truncated)