Graph Pattern Polynomials

LIPICS - Leibniz International Proceedings in Informatics, Nov 2018

Given a host graph G and a pattern graph H, the induced subgraph isomorphism problem is to decide whether G contains an induced subgraph that is isomorphic to H. We study the time complexity of induced subgraph isomorphism problems when the pattern graph is fixed. Nešetřil and Poljak gave an O(n^{k omega}) time algorithm that decides the induced subgraph isomorphism problem for any 3k vertex pattern graph (the universal algorithm), where omega is the matrix multiplication exponent. Improvements are not known for any infinite pattern family. Algorithms faster than the universal algorithm are known only for a finite number of pattern graphs. In this paper, we show that there exist infinitely many pattern graphs for which the induced subgraph isomorphism problem has algorithms faster than the universal algorithm. Our algorithm works by reducing the pattern detection problem into a multilinear term detection problem on special classes of polynomials called graph pattern polynomials. We show that many of the existing algorithms, including the universal algorithm, can also be described in terms of such a reduction. We formalize this class of algorithms by defining graph pattern polynomial families and defining a notion of reduction between these polynomial families. The reduction also allows us to argue about the relative hardness of various graph pattern detection problems within this framework. We show that solving the induced subgraph isomorphism problem for any pattern graph that contains a k-clique is at least as hard as detecting k-cliques. An equivalent theorem is not known in the general case. In the full version of this paper, we obtain new algorithms for P_5 and C_5 that are optimal under reasonable hardness assumptions. We also use this method to derive new combinatorial algorithms (algorithms that do not use fast matrix multiplication) for paths and cycles. We also show why graph homomorphisms play a major role in algorithms for subgraph isomorphism problems. Using this, we show that the arithmetic circuit complexity of the graph homomorphism polynomial for K_k - e (the k-clique with an edge removed) is related to the complexity of many subgraph isomorphism problems. This generalizes and unifies many existing results.

http://drops.dagstuhl.de/opus/volltexte/2018/9917/pdf/LIPIcs-FSTTCS-2018-18.pdf

FSTTCS

Karteek Sreenivasaiah, Department of Computer Science and Engineering, Indian Institute of Technology Hyderabad, India
Balagopal Komarath, Saarland University, Saarland Informatics Campus, Saarbrücken, Germany
Markus Bläser, Department of Computer Science, Saarland University, Saarland Informatics Campus, Saarbrücken, Germany

2012 ACM Subject Classification: Theory of computation → Probabilistic computation; Theory of computation → Problems, reductions and completeness

Footnote 1: Part of this work was done while the author was at Saarland University, Saarbrücken, Germany.

Acknowledgements: The authors thank Cornelius Brand and Holger Dell for helpful discussions during the early parts of this work. The authors also thank the anonymous reviewers for comments that helped improve the presentation in the paper.

The induced subgraph isomorphism problem asks, given simple and undirected graphs $G$ and $H$, whether there is an induced subgraph of $G$ that is isomorphic to $H$. The graph $G$ is called the host graph and the graph $H$ is called the pattern graph. This problem is NP-complete (see [8], problem [GT21]). If the pattern graph $H$ is fixed, there is a simple $O(n^{|V(H)|})$ time algorithm to decide the induced subgraph isomorphism problem for $H$. We study the time complexity of the induced subgraph isomorphism problem for fixed pattern graphs on the Word-RAM model.

The earliest non-trivial algorithm for this problem was given by Itai and Rodeh [9], who showed that the number of triangles can be computed in $O(n^{\omega})$ time on $n$-vertex graphs, where $\omega$ is the exponent of matrix multiplication.
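As a small illustration of the matrix-multiplication idea behind the triangle-counting result, the sketch below counts triangles through the trace of the cubed adjacency matrix. It uses a naive cubic-time product in pure Python, so it demonstrates only the algebraic identity (number of triangles = trace(A^3)/6) and not the $O(n^{\omega})$ running time, which would need a fast matrix multiplication routine. The function names `mat_mul` and `count_triangles` are illustrative and do not come from the paper.

```python
def mat_mul(a, b):
    """Naive product of two square integer matrices (lists of lists)."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def count_triangles(n, edges):
    """Count triangles in a simple undirected graph on vertices 0..n-1.

    Each triangle contributes 6 closed walks of length 3 (3 start
    vertices times 2 directions), so the count is trace(A^3) / 6.
    """
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    cube = mat_mul(mat_mul(adj, adj), adj)
    return sum(cube[i][i] for i in range(n)) // 6


# K4 has C(4,3) = 4 triangles; a 4-cycle is bipartite and has none.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(count_triangles(4, k4), count_triangles(4, c4))  # prints: 4 0
```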
Later, Nešetřil and Poljak [11] generalized the algorithm of Itai and Rodeh to count $K_{3k}$ in $O(n^{k\omega})$ time, where $K_{3k}$ is the clique on $3k$ vertices. Eisenbrand and Grandoni [3] extended this algorithm further to count $K_{3k+j}$ for $j \in \{0, 1, 2\}$ using rectangular matrix multiplication in $O(n^{\omega(k+\lceil j/2 \rceil,\, k,\, k+\lfloor j/2 \rfloor)})$ time. Here $\omega(i, j, k)$ denotes the exponent of the running time of matrix multiplication when multiplying an $n^i \times n^j$ matrix with an $n^j \times n^k$ matrix. It is known that detecting or counting any $k$-vertex pattern is no harder than detecting or counting $K_k$. Therefore, these algorithms are called "universal" algorithms. Algorithms that improve on the universal algorithm for specific pattern graphs are known only for small fixed values of $k$. For example, the induced subgraph isomorphism problem for $P_4$ can be solved in $O(n + m)$ time [1] and all 4-vertex graphs other than $K_4$ can be detected in $O(n^{\omega})$ time [14]. In Section 5, we give the first algorithm that detects infinitely many pattern graphs faster than the universal algorithm.

Our algorithm works by reducing the induced subgraph isomorphism problem to detecting multilinear terms in a related polynomial. This idea has been used previously by many authors to solve combinatorial problems efficiently (see [13], [10], [5], and [7] for its application to subgraph isomorphism problems).

A major contribution of our work is a general framework that can describe many existing algorithms for subgraph isomorphism problems. We show that graph pattern (see footnote 2) detection problems can be reformulated as the problem of detecting multilinear terms in special classes of polynomials called graph pattern polynomials (defined in Section 4). We also define a notion of reduction between these polynomials that allows us to argue about the relative hardness of the graph pattern detection problems.

It is known that detecting an induced path on $2k$ vertices is at least as hard as detecting a $K_k$ [6].
Intuitively, any pattern graph $H$ that contains a $K_k$ (or equivalently, an independent set on $k$ vertices) should be at least as hard to detect as a $K_k$. But this is not known. In Section 6, we show that the graph pattern polynomial for $K_k$ can be reduced to the polynomial that corresponds to the induced subgraph isomorphism problem for $H$, for any $H$ that contains a $K_k$. This shows that if we can obtain better algorithms for $H$ using our framework, then we obtain better algorithms for $K_k$. We show that all existing algorithms for induced subgraph isomorphism problems can either be described using our framework, or we can obtain algorithms with matching running times using our framework. Therefore, these reductions can be viewed as showing the limitations of current methods for solving subgraph isomorphism problems.

Footnote 2: Examples of graph patterns include subgraph isomorphisms, induced subgraph isomorphisms, and graph homomorphisms.

In Section 7, we discuss the results in the full version of this paper. In Section 3, we show how to use graph pattern polynomials to obtain a linear-time algorithm for detecting paths on four vertices.

For a polynomial $f$, we use $\deg(f)$ to denote the degree of $f$. A monomial is called multilinear if every variable in it has degree at most one. We use $\mathrm{ML}(f)$ to denote the multilinear part of $f$, that is, the sum of all multilinear monomials in $f$.

An arithmetic circuit computing a polynomial $P \in K[x_1, \dots, x_n]$ is a circuit with $+$ and $\times$ gates where the input gates are labelled by variables or constants from the underlying field and one gate is designated as the output gate. The size of an arithmetic circuit is the number of wires in the circuit.

For indeterminates $x_1, \dots, x_n$ and a set $S = \{s_1, \dots, s_p\} \subseteq \{1, \dots, n\}$ of indices, we write $x_S$ to denote the product $x_{s_1} \cdots x_{s_p}$.

An induced subgraph isomorphism from $H$ to $G$ is an injective function $\varphi : V(H) \overset{\mathrm{ind}}{\mapsto} V(G)$ such that $\{u, v\} \in E(H) \iff \{\varphi(u), \varphi(v)\} \in E(G)$.
Any function from $V(H)$ to $V(G)$ can be extended to unordered pairs of vertices of $H$ as $\varphi(\{u, v\}) = \{\varphi(u), \varphi(v)\}$. A subgraph isomorphism from $H$ to $G$ is an injective function $\varphi : V(H) \overset{\mathrm{sub}}{\mapsto} V(G)$ such that $\{u, v\} \in E(H) \implies \{\varphi(u), \varphi(v)\} \in E(G)$. Two subgraph isomorphisms or induced subgraph isomorphisms are considered different only if the sets of edges in their images are different. A graph homomorphism from $H$ to $G$ is a function $\varphi : V(H) \overset{\mathrm{hom}}{\mapsto} V(G)$ such that $\{u, v\} \in E(H) \implies \{\varphi(u), \varphi(v)\} \in E(G)$. Unlike isomorphisms, we consider two distinct functions that yield the same set of edges in the image as distinct graph homomorphisms. We define $\varphi(S) = \{\varphi(s) : s \in S\}$.

We write $H \sqsubseteq H'$ ($H \sqsupseteq H'$) to specify that $H$ is a subgraph (supergraph) of $H'$. The number $\mathrm{tw}(H)$ stands for the treewidth of $H$. We denote the number of automorphisms of $H$ by $\#\mathrm{aut}(H)$. The graph $K_n$ is the complete graph on $n$ vertices labelled using $[n]$. We use the fact that $\#\mathrm{aut}(H) = 1$ for almost all graphs in many of our results.

In this paper, we will frequently consider graphs where vertices are labelled by tuples. A vertex $(i, p)$ is said to have label $i$ and colour $p$. An edge $\{(i_1, p_1), (i_2, p_2)\}$ has label $\{i_1, i_2\}$ and colour $\{p_1, p_2\}$. We will sometimes write this edge as $(\{i_1, i_2\}, \{p_1, p_2\})$. Note that both $\{(i_1, p_1), (i_2, p_2)\}$ and $\{(i_2, p_1), (i_1, p_2)\}$ are written as $(\{i_1, i_2\}, \{p_1, p_2\})$, but the context should make it clear which edge is meant.

We will often use the following short forms to denote specific pattern graphs:
$K_\ell$: a clique on $\ell$ vertices
$I_\ell$: an independent set on $\ell$ vertices
$K_\ell - e$: a $K_\ell$ with an edge removed
$K_\ell + e$: a $K_\ell$ and one more edge, on $\ell + 1$ vertices
$P_\ell$: a path on $\ell$ vertices
$C_\ell$: a cycle on $\ell$ vertices

3 A Motivating Example: Induced-P4 Isomorphism

In this section, we sketch a one-sided error, randomized $O(n^2)$ time algorithm for the induced subgraph isomorphism problem for $P_4$ to illustrate the techniques used to derive algorithms in this paper.
We start by giving an algorithm for the subgraph isomorphism problem for $P_4$. Consider the following polynomial:

$$N_{P_4,n} = \sum_{(p,q,r,s),\; p < s} y_p y_q y_r y_s \, x_{\{p,q\}} x_{\{q,r\}} x_{\{r,s\}}$$

where the summation is over all quadruples over $[n]$ where all four elements are distinct. Each of the $y$ variables corresponds to a vertex of a possible $P_4$ and the $x$ variables correspond to the edges. Hence each monomial in the above polynomial corresponds naturally to a $P_4$ on the vertices $p, q, r, s$ chosen in the summation. The condition $p < s$ ensures that each path has exactly one monomial corresponding to it.

Given an $n$-vertex host graph $G$ and an arithmetic circuit for $N_{P_4,n}$, we can construct an arithmetic circuit for the polynomial $N_{P_4,n}(G)$ on the $y$ variables obtained by substituting $x_e = 0$ when $e \notin E(G)$ and $x_e = 1$ when $e \in E(G)$. The polynomial $N_{P_4,n}(G)$ can be written as $\sum_X a_X y_X$ where the summation is over all four-vertex subsets $X$ of $V(G)$ and $a_X$ is the number of $P_4$s in the induced subgraph $G[X]$. Therefore, we can decide whether $G$ has a subgraph isomorphic to $P_4$ by testing whether $N_{P_4,n}(G)$ is identically 0. Since the degree of this polynomial is a constant, this can be done in time linear in the size of the arithmetic circuit computing $N_{P_4,n}$.

However, we do not know how to construct an $O(n^2)$ size arithmetic circuit for $N_{P_4,n}$. Instead, we construct an $O(n^2)$ size arithmetic circuit for the following polynomial, called the walk polynomial:

$$\mathrm{Hom}_{P_4,n} = \sum_{\varphi : P_4 \overset{\mathrm{hom}}{\mapsto} K_n} \; \prod_{v \in V(P_4)} z_{v,\varphi(v)} \, y_{\varphi(v)} \prod_{e \in E(P_4)} x_{\varphi(e)}$$

Similar to $N_{P_4,n}$, the $y$ and $x$ variables correspond to vertices and edges respectively. The $z$ variables play the role of fixing the mapping from $P_4$ to $K_n$ that is chosen in the summation. This polynomial is also called the homomorphism polynomial for $P_4$ because its terms are in one-to-one correspondence with graph homomorphisms from $P_4$ to $K_n$.

As before, we consider the polynomial $\mathrm{Hom}_{P_4,n}(G)$ obtained by substituting for the $x$ variables appropriately. The crucial observation is that $\mathrm{Hom}_{P_4,n}(G)$ contains a multilinear term if and only if $N_{P_4,n}(G)$ is not identically zero.
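The observation can be checked by brute force on small graphs. The sketch below is only an illustration under simplifying assumptions: it expands the walk polynomial explicitly into its terms (exponential in general) instead of building an $O(n^2)$ size circuit and running a randomized multilinear-term test, and it stores a term as the tuple of visited vertices, which stands for the product of the corresponding $z$ and $y$ variables. Summing only over neighbours realizes the substitution $x_e = 0$ for non-edges. All function names are hypothetical.

```python
from itertools import combinations, permutations


def walk_polynomial_terms(n, edges, length=4):
    """Terms of Hom_{P_length,n}(G) after substituting the edge variables.

    A term is stored as the tuple (v_1, ..., v_length) of visited
    vertices; it stands for the monomial prod_i z_{i,v_i} * y_{v_i}.
    Terms are built layer by layer: a walk ending in u is extended by
    every neighbour v of u (non-edges are substituted by 0).
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    p = {v: [(v,)] for v in range(n)}           # walks on one vertex
    for _ in range(length - 1):                 # extend every walk by one step
        p = {v: [w + (v,) for u in adj[v] for w in p[u]] for v in range(n)}
    return [w for v in range(n) for w in p[v]]  # sum over the last vertex


def has_multilinear_term(terms):
    """A term is multilinear iff no vertex variable y_v is repeated."""
    return any(len(set(t)) == len(t) for t in terms)


def has_p4_subgraph(n, edges):
    """Brute-force reference: is there a (not necessarily induced) P4?"""
    es = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) in es for pair in zip(q, q[1:]))
        for q in permutations(range(n), 4)
    )


# Compare both tests on every graph with 4 labelled vertices.
all_pairs = list(combinations(range(4), 2))
for mask in range(2 ** len(all_pairs)):
    g = [e for i, e in enumerate(all_pairs) if (mask >> i) & 1]
    assert has_multilinear_term(walk_polynomial_terms(4, g)) == has_p4_subgraph(4, g)
print("observation holds on all 64 graphs with 4 labelled vertices")
```

For the path 0-1-2-3 itself, the expansion has 16 terms (one per walk on four vertices), of which exactly two are multilinear: the path read in its two directions.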
The crucial observation holds because the multilinear terms of $\mathrm{Hom}_{P_4,n}$ correspond to injective homomorphisms from $P_4$, which in turn correspond to subgraph isomorphisms from $P_4$. More specifically, each $P_4$ corresponds to two injective homomorphisms from $P_4$ since $P_4$ has two automorphisms. Therefore, we can test whether $G$ has a subgraph isomorphic to $P_4$ by testing whether $\mathrm{Hom}_{P_4,n}(G)$ has a multilinear term. It is known that the polynomial $p_4 = \mathrm{Hom}_{P_4,n}$ has $O(n^2)$ size circuits using the following inductive construction:

$$p_{1,v} = z_{1,v} \, y_v, \qquad p_{i+1,v} = z_{i+1,v} \, y_v \sum_{u \in [n],\, u \neq v} p_{i,u} \, x_{\{u,v\}}, \qquad p_4 = \sum_{v \in [n]} p_{4,v}$$

The above construction can be extended to construct $p_k$ for any $k$ and not just $k = 4$. This method is used in [13] to obtain an $O(2^k (n + m))$ time algorithm for the subgraph isomorphism problem for $P_k$.

In fact, the above method works for any pattern graph $H$. Extend the definitions above to define $N_{H,n}$ and $\mathrm{Hom}_{H,n}$ in the natural fashion. Then, we can test whether an $n$-vertex graph $G$ has a subgraph isomorphic to $H$ by testing whether $N_{H,n}(G)$ is identically zero, which in turn can be done by testing whether $\mathrm{Hom}_{H,n}(G)$ has a multilinear term. Therefore, the subgraph isomorphism problem for any pattern $H$ is as easy as constructing the homomorphism polynomial for $H$. This method is used by Fomin et al. [7] to obtain efficient algorithms for subgraph isomorphism problems.

We now turn our attention to the induced subgraph isomorphism problem for $P_4$. We note that the induced subgraph isomorphism problem for $P_k$ is much harder than the subgraph isomorphism problem for $P_k$. The subgraph isomorphism problem for $P_k$ has a linear time algorithm as seen above, but the induced subgraph isomorphism problem for $P_k$ cannot have $n^{o(k)}$ time algorithms unless FPT = W[1]. We start by considering the polynomial:

$$I_{P_4,n} = \sum_{(p,q,r,s),\; p < s} y_p y_q y_r y_s \, x_{\{p,q\}} x_{\{q,r\}} x_{\{r,s\}} (1 - x_{\{p,r\}})(1 - x_{\{p,s\}})(1 - x_{\{q,s\}})$$

The polynomial $I_{P_4,n}(G)$ can be written as $\sum_X y_X$ where the summation is over all four-vertex subsets $X$ of $V(G)$ that induce a $P_4$.
Notice that all coefficients are 1 because there can be at most one induced $P_4$ on any four-vertex subset. By expanding terms of the form $1 - x_*$ in the above polynomial, we observe that we can rewrite $I_{P_4,n}$ as follows:

$$I_{P_4,n} = N_{P_4,n} - 4N_{C_4,n} - 2N_{K_3+e,n} + 6N_{K_4-e,n} - 12N_{K_4,n}$$

Since the coefficients in $I_{P_4,n}(G)$ are all 0 or 1, it is sufficient to check whether $I_{P_4,n}(G) \pmod 2$ is non-zero to test whether $I_{P_4,n}(G)$ is non-zero. From the above equation, we can see that $I_{P_4,n} \equiv N_{P_4,n} \pmod 2$, because every other coefficient is even. Therefore, instead of working with $I_{P_4,n} \pmod 2$, we can work with $N_{P_4,n} \pmod 2$.

We have already seen that we can use $\mathrm{Hom}_{P_4,n}(G)$ to test whether $N_{P_4,n}(G)$ is non-zero. However, this is not sufficient to solve induced subgraph isomorphism. We want to detect whether $N_{P_4,n}(G)$ is non-zero modulo 2. Therefore, the multilinear terms of $\mathrm{Hom}_{P_4,n}(G)$ have to be in one-to-one correspondence with the terms of $N_{P_4,n}(G)$. We have to divide the polynomial $\mathrm{Hom}_{P_4,n}(G)$ by 2 before testing for the existence of multilinear terms modulo 2. However, since we are working over a field of characteristic 2, this division is not possible. We work around this problem by starting with $\mathrm{Hom}_{P_4,n'}$ for $n'$ slightly larger than $n$, and we show that this enables the "division" by 2.

The reader may have observed that instead of the homomorphism polynomial, we could have taken any polynomial $f$ for which the multilinear terms of $f(G)$ are in one-to-one correspondence with the terms of $N_{P_4,n}(G)$. This observation leads to the definition of a notion of reduction between polynomials. Informally, $f \preceq g$ if detecting multilinear terms in $f(G)$ is as easy as detecting multilinear terms in $g(G)$. Additionally, for the evaluation $f(G)$ to be well-defined, the polynomial $f$ must have some special structure. We call such polynomials graph pattern polynomials.
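The expansion of $I_{P_4,n}$ can be checked exhaustively on a single four-vertex subset. The sketch below, an illustrative check and not part of the paper, evaluates the product form of $I_{P_4}$ and the linear combination of subgraph counts on all 64 graphs with four labelled vertices. It takes the last coefficient to be $-12$, which is what the sign rule $(-1)^{e(H') - e(H)}$ gives for $K_4$ (six edges against three). The names `PATTERNS`, `COEFF`, `copies` and `induced_p4_coefficient` are hypothetical.

```python
from itertools import combinations, permutations

PAIRS = list(combinations(range(4), 2))            # the 6 possible edges

# Patterns on vertex set {0, 1, 2, 3}, given by their edge lists.
PATTERNS = {
    "P4":   [(0, 1), (1, 2), (2, 3)],
    "C4":   [(0, 1), (1, 2), (2, 3), (0, 3)],
    "K3+e": [(0, 1), (1, 2), (0, 2), (2, 3)],
    "K4-e": [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)],
    "K4":   PAIRS,
}
COEFF = {"P4": 1, "C4": -4, "K3+e": -2, "K4-e": 6, "K4": -12}


def copies(pattern_edges, host_edges):
    """Number of subgraphs of the host isomorphic to the pattern.

    Two embeddings count as the same copy when their edge images agree.
    """
    images = set()
    for perm in permutations(range(4)):
        image = frozenset(frozenset((perm[u], perm[v])) for u, v in pattern_edges)
        if image <= host_edges:
            images.add(image)
    return len(images)


def induced_p4_coefficient(host_edges):
    """Coefficient of y_0 y_1 y_2 y_3 in I_{P4,4}(G), from the product form."""
    def x(a, b):
        return 1 if frozenset((a, b)) in host_edges else 0

    return sum(
        x(p, q) * x(q, r) * x(r, s) * (1 - x(p, r)) * (1 - x(p, s)) * (1 - x(q, s))
        for p, q, r, s in permutations(range(4)) if p < s
    )


for mask in range(2 ** 6):
    host = frozenset(frozenset(e) for i, e in enumerate(PAIRS) if (mask >> i) & 1)
    lhs = induced_p4_coefficient(host)
    rhs = sum(COEFF[name] * copies(edges, host) for name, edges in PATTERNS.items())
    assert lhs == rhs and lhs in (0, 1)
    assert lhs % 2 == copies(PATTERNS["P4"], host) % 2   # I = N_{P4} (mod 2)
print("expansion verified on all 64 graphs")
```

The counts that drive the coefficients can be read off from `copies`: $K_4$ contains 12 copies of $P_4$, $K_4 - e$ contains 6, $C_4$ contains 4 and $K_3 + e$ contains 2, and all of the non-leading coefficients are even.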
At first glance, it appears hard to generalize this algorithm for $P_4$ to sparse pattern graphs on an arbitrary number of vertices (for example, $P_k$) because we have to argue about the coefficients of many $N_*$ polynomials in the expansion. On the other hand, if we consider the pattern graph $K_k$, we have $I_{K_k} = \mathrm{Hom}_{K_k}$. In this paper, we show that for many graph patterns sparser than $K_k$, the induced subgraph isomorphism problem is as easy as constructing arithmetic circuits for homomorphism polynomials for those patterns (or patterns that are only slightly denser).

4 Graph pattern polynomial families

We will consider polynomial families $f = (f_n)$ of the following form: each $f_n$ will be a polynomial in variables $y_1, \dots, y_n$, the vertex variables, variables $x_1, \dots, x_{\binom{n}{2}}$, the edge variables, and a number of additional variables that is at most linear in $n$. The degree of each $f_n$ will usually be constant.

The (not necessarily induced) subgraph isomorphism polynomial family $N_H = (N_{H,n})_{n \ge 0}$ for a fixed pattern graph $H$ on $k$ vertices and $\ell$ edges is a family of multilinear polynomials of degree $k + \ell$. The $n$th polynomial in the family, defined over the vertex set $[n]$, is the polynomial on $n + \binom{n}{2}$ variables given by (1):

$$N_{H,n} = \sum_{\varphi : V(H) \overset{\mathrm{sub}}{\mapsto} V(K_n)} \; \prod_{v \in V(H)} y_{\varphi(v)} \prod_{e \in E(H)} x_{\varphi(e)} \qquad (1)$$

When context is clear, we will often omit the subscript $n$ and simply write $N_H$. Given a (host) graph $G$ on $n$ vertices, we can substitute values for the edge variables of $N_{H,n}$ depending on the edges of $G$ ($x_e = 1$ if $e \in E(G)$ and $x_e = 0$ otherwise) to obtain a polynomial $N_{H,n}(G)$ on the vertex variables. The monomials of this polynomial correspond to the $H$-subgraphs of $G$: a term $a \, y_{v_1} \cdots y_{v_k}$, where $a$ is a positive integer, indicates that there are $a$ subgraphs isomorphic to $H$ in $G$ on the vertices $v_1, \dots, v_k$. Therefore, to detect if there is an $H$-subgraph in $G$, we only have to test whether $N_{H,n}(G)$ has a multilinear term.
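To make the substitution concrete, the following sketch computes the terms of $N_{H,n}(G)$ for a small host graph by brute force. It is an illustration only, with exponential running time and hypothetical names; it returns a dictionary that maps each $k$-vertex subset carrying at least one copy of $H$ to the coefficient $a$ of the corresponding monomial.

```python
from itertools import combinations, permutations


def subgraph_polynomial(pattern_edges, k, n, host_edges):
    """Terms of N_{H,n}(G) as {vertex subset: number of H-subgraphs on it}.

    The pattern H has vertices 0..k-1; the host G has vertices 0..n-1.
    Copies of H are identified by the set of host edges they use.
    """
    host = {frozenset(e) for e in host_edges}
    terms = {}
    for subset in combinations(range(n), k):
        images = set()
        for perm in permutations(subset):
            image = frozenset(frozenset((perm[u], perm[v])) for u, v in pattern_edges)
            if image <= host:          # every pattern edge lands on a host edge
                images.add(image)
        if images:
            terms[subset] = len(images)
    return terms


# Host: a triangle {0, 1, 2} with a pendant vertex 3 attached to 2.
paw = [(0, 1), (1, 2), (0, 2), (2, 3)]
p3 = [(0, 1), (1, 2)]                  # path on three vertices
print(subgraph_polynomial(p3, 3, 4, paw))
# prints: {(0, 1, 2): 3, (0, 2, 3): 1, (1, 2, 3): 1}
```

In this example $N_{P_3,4}(G) = 3 y_0 y_1 y_2 + y_0 y_2 y_3 + y_1 y_2 y_3$: the triangle contains three paths on three vertices, and the subset $\{0, 1, 3\}$ contributes no term because it spans only one edge.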
The induced subgraph isomorphism polynomial family $I_H = (I_{H,n})_{n \ge 0}$ for a pattern graph $H$ over the vertex set $[n]$ is defined in (2):

$$I_{H,n} = \sum_{\varphi} \; \prod_{v \in V(H)} y_{\varphi(v)} \prod_{e \in E(H)} x_{\varphi(e)} \prod_{e \notin E(H)} (1 - x_{\varphi(e)}) \qquad (2)$$

where the sum ranges over the subgraph isomorphisms $\varphi$ from $H$ to $K_n$ (counted once per image edge set) and the last product ranges over the non-edges $e$ of $H$.

If we substitute the edge variables of $I_{H,n}$ using a host graph $G$ on $n$ vertices, then the monomials of the resulting polynomial $I_{H,n}(G)$ on the vertex variables are in one-to-one correspondence with the induced $H$-subgraphs of $G$. In particular, all monomials have coefficient 0 or 1 because there can be at most one induced copy of $H$ on a set of $k$ vertices. This implies that to test if there is an induced $H$-subgraph in $G$, we only have to test whether $I_{H,n}(G)$ has a multilinear term, and we can even do this modulo $p$ for any prime $p$. Also, note that $I_{\overline{H}}$ is simply $I_H$ where all the edge variables $x_e$ are replaced by $1 - x_e$.

The homomorphism polynomial family $\mathrm{Hom}_H = (\mathrm{Hom}_{H,n})_{n \ge 0}$ for pattern graph $H$ over the vertex set $[n]$ is defined in (3):

$$\mathrm{Hom}_{H,n} = \sum_{\varphi : V(H) \overset{\mathrm{hom}}{\mapsto} V(K_n)} \; \prod_{v \in V(H)} z_{v,\varphi(v)} \, y_{\varphi(v)} \prod_{e \in E(H)} x_{\varphi(e)} \qquad (3)$$

The variables $z_{a,v}$ are called the homomorphism variables. They keep track of how the vertices of $H$ are mapped by the different homomorphisms in the summation. We note that the size of the arithmetic circuit computing $\mathrm{Hom}_{H,n}$ is independent of the labelling chosen to define the homomorphism polynomial. The arithmetic circuit complexity of such homomorphism polynomials, with respect to properties of the pattern graph, has been studied in [4].

The induced subgraph isomorphism polynomial for any graph $H$ and the subgraph isomorphism polynomials for supergraphs of $H$ are related as follows:

$$I_{H,n} = \sum_{H' \sqsupseteq H} (-1)^{e(H') - e(H)} \, \#\mathrm{sub}(H, H') \, N_{H',n} \qquad (4)$$

Here $e(H)$ is the number of edges in $H$ and $\#\mathrm{sub}(H, H')$ is the number of times $H$ appears as a subgraph in $H'$. The sum is taken over all supergraphs $H'$ of $H$ having the same vertex set as $H$. Equation (4) is used by Curticapean, Dell, and Marx [2] in the context of counting subgraph isomorphisms.

For any fixed pattern graph $H$, the degrees of the polynomial families $N_H$, $I_H$, and $\mathrm{Hom}_H$ are bounded by a constant depending only on the size of $H$.
Such polynomial families are called constant-degree polynomial families.

Definition 4.1. A constant-degree polynomial family $f = (f_n)$ is called a graph pattern polynomial family if the $n$th polynomial in the family has $n$ vertex variables, $\binom{n}{2}$ edge variables, and at most $cn$ other variables, where $c$ is a constant, and every non-multilinear term of $f_n$ has at least one non-edge variable of degree greater than 1.

It is easy to verify that $I_H$, $N_H$, and $\mathrm{Hom}_H$ are all graph pattern polynomial families.

For a graph pattern polynomial $f$, we denote by $f(G)$ the polynomial obtained by substituting $x_e = 0$ if $e \notin E(G)$ and $x_e = 1$ if $e \in E(G)$ for all edge variables $x_e$. Note that for any graph pattern polynomial $f$, we have $\mathrm{ML}(f(G)) = \mathrm{ML}(f)(G)$. This is because any non-multilinear term in $f$ has to remain non-multilinear or become 0 after this substitution.

Definition 4.2.
1. A constant-degree polynomial family $f = (f_n)$ has circuits of size $s(n)$ if there is a sequence of arithmetic circuits $(C_n)$ such that $C_n$ computes $f_n$ and has size at most $s(n)$.
2. $f$ has uniform $s(n)$-size circuits if, on input $n$, we can construct $C_n$ in time $O(s(n))$ on a Word-RAM (see footnote 3).

Footnote 3: Since we are dealing with fine-grained complexity, we have to be precise with the encoding of the circuit. We assume an encoding such that evaluating the circuit is linear time and substituting for variables with polynomials represented by circuits is constant-time.

We now define a notion of reducibility among graph pattern polynomials. Informally, if $f \preceq g$, then detecting whether $f_n(G)$ has a multilinear term is as easy as constructing an arithmetic circuit for $g_n$ for all $n$. First, we define a notion of substitution families that preserves the semantic structure of graph pattern polynomials.

Definition 4.3. A substitution family $\sigma = (\sigma_n)$ is a family of mappings

$$\sigma_n : \{y_1, \dots, y_n, x_1, \dots, x_{\binom{n}{2}}, u_1, \dots, u_{m(n)}\} \to K[y_1, \dots, y_{n'}, x_1, \dots, x_{\binom{n'}{2}}, v_1, \dots, v_{r(n)}]$$

mapping variables to polynomials such that:
1. $\sigma$ maps vertex variables to constant-degree monomials containing one or more vertex variables or other variables, and no edge variables.
2. $\sigma$ maps edge variables to polynomials with constant-size circuits containing at most one edge variable and no vertex variables.
3. $\sigma$ maps other variables to constant-degree monomials containing no vertex or edge variables and at least one other variable.

$\sigma_n$ naturally extends to $K[y_1, \dots, y_n, x_1, \dots, x_{\binom{n}{2}}, u_1, \dots, u_{m(n)}]$.

For the reduction to be useful in deriving algorithms, the substitution has to be easily computable. This leads us to the following definition.

Definition 4.4. A substitution family $\sigma = (\sigma_n)$ is constant-time computable if, given $n$ and a variable $z$ in the domain of $\sigma_n$, we can compute $\sigma_n(z)$ in constant time on a Word-RAM. (Note that an encoding of any $z$ fits into one cell of memory.)

Finally, we define our notion of reduction.

Definition 4.5. Let $f = (f_n)$ and $g = (g_n)$ be graph pattern polynomial families. Then $f$ is reducible to $g$, denoted $f \preceq g$, via a constant-time computable substitution family $\sigma = (\sigma_n)$ if for all $n$ there is an $m = O(n)$ and $q = O(1)$ such that
1. $\sigma_m(\mathrm{ML}(g_m))$ is a graph pattern polynomial and
2. $\mathrm{ML}(\sigma_m(g_m)) = v_{[q]} \, \mathrm{ML}(f_n)$. (Recall that $v_{[q]} = v_1 \cdots v_q$.)

For any prime $p$, we say that $f \preceq g \pmod p$ if there exists an $f' \equiv f \pmod p$ such that $f' \preceq g$.

Property 1 of Definition 4.5 and Properties 1 and 3 of Definition 4.3 imply that $\sigma_m(g_m)$ is a graph pattern polynomial, because Properties 1 and 3 of Definition 4.3 ensure that non-multilinear terms remain so after the substitution. It is easy to see that $\preceq$ is reflexive via the identity substitution. We can also assume w.l.o.g. that the variables $v_1, \dots, v_q$ are fresh variables introduced by the substitution family $\sigma$.

What is the difference between $\sigma_m(\mathrm{ML}(g_m))$ and $\mathrm{ML}(\sigma_m(g_m))$ in Definition 4.5? Every monomial in $\mathrm{ML}(\sigma_m(g_m))$ also appears in $\sigma_m(\mathrm{ML}(g_m))$; however, the latter may contain further monomials that are not multilinear.
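A toy example, which is not taken from the paper, may help to see the difference between the two expressions. Take a single multilinear monomial $g$ with one vertex variable and two other variables, and a substitution (assumed here purely for illustration) that keeps the vertex variable and sends both other variables to the same variable $w$:

```latex
g = y_1 u_1 u_2, \qquad \sigma(y_1) = y_1, \quad \sigma(u_1) = \sigma(u_2) = w

\sigma(\mathrm{ML}(g)) = \sigma(y_1 u_1 u_2) = y_1 w^2, \qquad
\mathrm{ML}(\sigma(g)) = \mathrm{ML}(y_1 w^2) = 0
```

Here $\sigma(\mathrm{ML}(g))$ consists of the non-multilinear monomial $y_1 w^2$, while $\mathrm{ML}(\sigma(g))$ has no monomials at all, so the first expression can be strictly larger than the second.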
It can be shown that $\preceq$ is transitive by composing substitutions.

We conclude this section by mentioning how to obtain efficient algorithms using $\preceq$. Efficient algorithms are known (see [10]) for detecting multilinear terms of small degree with non-zero coefficient modulo primes.

Theorem 4.6. Let $k$ be any constant and let $p$ be any prime. Given an arithmetic circuit of size $s$, there is a randomized, one-sided error $O(s)$-time algorithm to detect whether the polynomial computed by the circuit has a multilinear term of degree at most $k$ with non-zero coefficient modulo $p$.

An important algorithmic consequence of reducibility is stated in Proposition 4.7. This proposition is used to derive algorithms for induced subgraph isomorphism problems in this paper.

Proposition 4.7. Let $p$ be any prime. Let $f$ and $g$ be graph pattern polynomial families. Let $s(n)$ be a polynomially-bounded function. If $f \preceq g \pmod p$ and $g$ has uniform $s(n)$-size arithmetic circuits, then we can test whether $f_n(G)$ has a multilinear term with non-zero coefficient modulo $p$ in $O(s(n))$ (randomized one-sided error) time for any $n$-vertex graph $G$.

5 Pattern graphs easier than cliques

In this section, we describe a family $H_{3k}$ of pattern graphs such that the induced subgraph isomorphism problem for $H_{3k}$ has an $O(n^{\omega(k,k-1,k)})$ time algorithm when $k = 2^\ell$, $\ell \ge 1$. Note that for the currently known best algorithms for fast matrix multiplication, we have $\omega(k, k-1, k) < k\omega$. Therefore, these pattern graphs are strictly easier to detect than cliques.

The pattern graph $H_{3k}$ is defined on $3k$ vertices and we consider the canonical labelling of $H_{3k}$ where there is a $(3k-1)$-clique on vertices $\{1, \dots, 3k-1\}$ and the vertex $3k$ is adjacent to the vertices $\{1, \dots, 2k-1\}$.

Lemma 5.1. $I_{H_{3k}} \equiv N_{H_{3k}} \pmod 2$ when $k = 2^\ell$, $\ell \ge 1$.

Proof. We show that the number of times $H_{3k}$ is contained in any of its proper supergraphs is even if $k$ is a power of 2.
The graph $K_{3k}$ contains $3k \binom{3k-1}{2k-1}$ copies of $H_{3k}$. This number is even when $k$ is even. The graph $K_{3k} - e$ contains $2 \binom{3k-2}{2k-1}$ copies of $H_{3k}$. This number is always even. The remaining proper supergraphs of $H_{3k}$ are the graphs $K_{3k-1} + (2k+i)e$, i.e., a $(3k-1)$-clique with $2k + i$ edges to a single vertex, for $0 \le i < k - 2$. There are $m_i = \binom{2k+i}{2k-1}$ copies of the graph $H_{3k}$ in these supergraphs. We observe that the numbers $m_i$ are even when $k = 2^\ell$, $\ell \ge 1$, by Lucas' theorem. Lucas' theorem states that $\binom{p}{q}$ is even if and only if in the binary representation of $p$ and $q$, there exists some bit position $i$ such that $q_i = 1$ and $p_i = 0$. To see why $m_i$ is even, observe that in the binary representation of $2k - 1$, all bits 0 through $\ell$ are 1, and in the binary representation of $2k + i$, $0 \le i < k - 2$, at least one of those bits is 0. ∎

Lemma 5.2. $N_{H_{3k}} \preceq \mathrm{Hom}_{H_{3k}}$.

$\sigma(x_{\{(u,a),(v,b)\}}) = 0$, if $a, b \in \{1, \dots, 2k-1\}$ and $a < b$ and $u > v$
$\sigma(x_{\{(u,a),(v,b)\}}) = 0$, if $a, b \in \{2k, \dots, 3k-1\}$ and $a < b$ and $u > v$

Rule 3 ensures that in any surviving monomial, all vertices have distinct labels. Rule 4 ensures that the vertices coloured $1, \dots, 2k-1$ are in increasing order and Rule 5 ensures that the vertices coloured $2k, \dots, 3k-1$ are in increasing order.

Consider an $H_{3k}$ labelled using $[n]$ where the vertices in the $(3k-1)$-clique are labelled $v_1, \dots, v_{3k-1}$ and the remaining vertex is labelled $v_{3k}$, which is connected to $v_1 < \dots < v_{2k-1}$. Also, $v_{2k} < \dots < v_{3k-1}$. We claim that the monomial corresponding to this labelled $H_{3k}$ (say $m$) is uniquely generated by the monomial $m' = \prod_{1 \le i \le 3k} z_{i,(v_i,i)} \, w$ in $\mathrm{Hom}_{H_{3k}}$. Note that the vertices and edges in the image of the homomorphism are determined by the map $i \mapsto (v_i, i)$. The monomial $w$ is simply the product of these vertex and edge variables. It is easy to see that this monomial yields the required monomial under the above substitution.
The uniqueness is proved as follows: observe that in any monomial $m''$ in $\mathrm{Hom}_{H_{3k}}$ that generates $m$, the vertex coloured $3k$ must be $v_{3k}$. This implies that the vertices coloured $1, \dots, 2k-1$ must be the set $\{v_1, \dots, v_{2k-1}\}$. Rule 4 ensures that the vertex coloured $i$ must be $v_i$ for $1 \le i \le 2k-1$. Similarly, the vertices coloured $2k, \dots, 3k-1$ must be the set $\{v_{2k}, \dots, v_{3k-1}\}$ and Rule 5 ensures that the vertex coloured $i$ must be $v_i$ for $2k \le i \le 3k-1$ as well. But then the monomials $m'$ and $m''$ are the same. ∎

Proof. Consider $H_{3k}$ labelled as before. We define the sets $S_{1,k,2k,3k-1} = \{1, \dots, k, 2k, \dots, 3k-1\}$, $S_{k+1,3k-1} = \{k+1, \dots, 3k-1\}$, $S_{k+1,2k-1} = \{k+1, \dots, 2k-1\}$, and $S_{1,2k-1} = \{1, \dots, 2k-1\}$. We also define the tuples $V_{1,k} = (v_1, \dots, v_k)$, $V_{2k,3k-1} = (v_{2k}, \dots, v_{3k-1})$, and $V_{k+1,2k-1} = (v_{k+1}, \dots, v_{2k-1})$ for any set $v_i$ of $3k - 1$ distinct vertex labels. The algorithm also uses the matrices defined below. The dimensions of each matrix are specified as the superscript. All other entries of the matrix are 0. Notice that all entries are constant-sized monomials.

$$A^{n^k \times n^k}_{V_{1,k},\, V_{2k,3k-1}} = \prod_{i \in S_{1,k,2k,3k-1}} [\ldots]$$
$$B^{n^k \times n^{k-1}}_{V_{2k,3k-1},\, V_{k+1,2k-1}} = \prod_{i \in S_{k+1,2k-1}} [\ldots]$$
$$C^{n^{k-1} \times n^k}_{V_{k+1,2k-1},\, V_{1,k}} = x_{\{(v_i,i)\}_{i \in S_{1,2k-1}}}$$
$$D^{n^k \times n}_{V_{1,k},\, v_{3k}} = z_{3k,v_{3k}} \, y_{v_{3k}} \, [\ldots]$$
$$E^{n \times n^{k-1}}_{v_{3k},\, V_{k+1,2k-1}} = \prod_{i \in S_{k+1,2k-1}} [\ldots]$$

Compute the matrix products $ABC$ and $DE$. Replace the $n^{2k-1}$ variables $x_{\{(v_i,i)\}_{i \in S_3}}$ with $(DE)_{V_{1,k},V_{k+1,2k-1}}$. The required polynomial is then just

$$\mathrm{Hom}_{H_{3k}} = [\ldots]$$

Consider a homomorphism of $H_{3k}$ defined as $\varphi : i \mapsto u_i$. The monomial corresponding to this homomorphism is uniquely generated as follows. Let $U_*$ be defined similarly to the tuples $V_*$. Set $v_i = u_i$ for $i \in [k]$ in the summation and consider the monomial generated by the product $A_{U_{1,k},U_{2k,3k-1}} B_{U_{2k,3k-1},U_{k+1,2k-1}} C_{U_{k+1,2k-1},U_{1,k}}$ after replacing the variable $x_{\{(u_i,i)\}_{i \in S_3}}$ by $(DE)_{U_{1,k},U_{k+1,2k-1}}$, taking the monomial $D_{U_{1,k},u_{3k}} E_{u_{3k},U_{k+1,2k-1}}$ from that entry. It is easy to verify that this generates the required monomial.
For uniqueness, observe that this is the only way to generate the required product of the homomorphism variables. Computing $ABC$ can be done using $O(n^{\omega(k,k-1,k)})$ size circuits. Computing $DE$ can be done using $O(n^{\omega(k,1,k-1)})$ size circuits. The top level sum contributes $O(n^k)$ gates. This proves the lemma. ∎

We conclude this section by stating our main theorem.

Theorem 5.4. The induced subgraph isomorphism problem for $H_{3k}$ has an $O(n^{\omega(k,k-1,k)})$ time algorithm when $k = 2^\ell$, $\ell \ge 1$.

6 Lower Bounds for Pattern Graphs with Cliques

Since we can obtain algorithms for induced subgraph isomorphism problems that match the known best algorithms using reductions between graph pattern polynomials, we can interpret the reduction $f \preceq g$ as evidence that detecting the graph pattern corresponding to $g$ is harder than detecting $f$. It is known that the induced subgraph isomorphism problem for $P_{2k}$ is harder than that for $K_k$. In general, one would think that detecting any graph $H$ that contains $K_k$ as a subgraph would be at least as hard as detecting $K_k$. However, this is known only when $H$ has a $K_k$ that is vertex disjoint from all other $K_k$ in $H$. The following theorem shows that we can drop this restriction when working with pattern polynomials.

Theorem 6.1. If $H$ contains a $k$-clique or a $k$-independent set, then $I_{K_k} \preceq I_H$.

Proof. We will prove the statement when $H$ contains a $k$-clique. The other part follows because if $H$ contains a $k$-independent set, then the graph $\overline{H}$ contains a $k$-clique and $I_{K_k} \preceq I_{\overline{H}} \preceq I_H$. Fix a labelling of $H$ where the vertices of a $k$-clique are labelled using $[k]$ and the remaining vertices are labelled $k+1, \dots, k+\ell$. Consider the polynomial $I_H$ over the vertex set $([n] \times [k]) \cup \{(n+i, k+i) : 1 \le i \le \ell\}$ and apply the following substitution.

$$\sigma(x_{\{(i_1,p_1),(i_2,p_2)\}}) = \begin{cases} x_{\{i_1,i_2\}} & \text{if } \{p_1, p_2\} \in E(K_k) \text{ and } p_1 < p_2 \text{ and } i_1 < i_2 \\ 1 & [\ldots] \\ 0 & [\ldots] \end{cases}$$

Consider a $k$-clique on the vertices $i_1, \dots, i_k \in [n]$ on an $n$-vertex graph where $i_1 < \dots < i_k$.
The monomial in I_{K_k} corresponding to this clique is generated uniquely from the monomial y_{(i_1,1)} ... y_{(i_k,k)} ∏_i y_{(n+i,k+i)} x_{\{(i_1,1),(i_2,2)\}} ... x_{\{(i_{k-1},k-1),(i_k,k)\}} w in I_H, where w is the product of all edge variables corresponding to edges in H but not in K_k. Note that Rules 1 and 2 ensure that in any surviving monomial, the labels and colours of all vertices are distinct and the colours of the edges must be the same as E(H). The product w is determined by i_1, ..., i_k. This proves that ML(σ(I_H)) = u_{[k+ℓ]} ML(I_{K_k}). It is easy to verify that the substitution satisfies the other properties. ◀

Using reductions between graph pattern polynomial families, it is possible to give evidence for many "natural" relative hardness results. We see these hardness results as showing the limitations of current methods for solving induced subgraph isomorphism problems.

Are there patterns other than the pattern in Theorem 5.4 for which we can use homomorphism polynomials of graphs sparser than K_k for solving the induced subgraph isomorphism problem? In the full version of this paper, we show that we can obtain better algorithms for paths and cycles using our method. More specifically, we show that the induced subgraph isomorphism problems for P_5 and C_5 can be solved in O(n^ω) time, which is optimal assuming the optimality of triangle detection. We also show how to speed up algorithms for P_k and C_k when k ≤ 9.

An interesting class of algorithms for induced subgraph isomorphism problems is the class of so-called combinatorial algorithms, that is, algorithms that do not use fast matrix multiplication. The best combinatorial algorithm known for k-cliques is the trivial O(n^k) time algorithm. In contrast to the situation for general algorithms, many patterns are known to have improved combinatorial algorithms. For example, Virginia Williams [12] showed that there is an O(n^{k-1}) time combinatorial algorithm for the induced subgraph isomorphism problem for K_k − e.
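To make the notion of a combinatorial algorithm concrete, consider the smallest case k = 3, where K_3 − e is the path P_3. The Python sketch below is not the algorithm of [12]; it relies on the standard observation that a graph has no induced P_3 exactly when every connected component is a clique, which gives an O(n^2) test on an adjacency matrix without any matrix multiplication. The function names are hypothetical, and the host graph is assumed to be simple (symmetric 0/1 matrix, no self-loops).

```python
from itertools import permutations


def has_induced_p3_brute_force(adj):
    """Trivial O(n^3) check over ordered triples (u, v, w) with v in the middle."""
    n = len(adj)
    return any(
        adj[u][v] and adj[v][w] and not adj[u][w]
        for u, v, w in permutations(range(n), 3)
    )


def has_induced_p3_quadratic(adj):
    """O(n^2) check: an induced P_3 exists iff some component is not a clique."""
    n = len(adj)
    seen = [False] * n
    for start in range(n):
        if seen[start]:
            continue
        # Collect the connected component of `start` with an iterative DFS.
        component = []
        stack = [start]
        seen[start] = True
        while stack:
            u = stack.pop()
            component.append(u)
            for v in range(n):
                if adj[u][v] and not seen[v]:
                    seen[v] = True
                    stack.append(v)
        # In a clique, each vertex is adjacent to every other component member.
        for u in component:
            if sum(adj[u]) != len(component) - 1:
                return True
    return False
```

Both functions decide the same property; the second one visits each row of the adjacency matrix a constant number of times, which matches the O(n^{k-1}) bound for k = 3.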
In fact, we show that, from existing results, one can obtain combinatorial algorithms running in time O(n^{k-1}) for all patterns except K_k and I_k. Furthermore, for P_k and C_k we show that we can obtain new combinatorial algorithms running in time O(n^{k-2}).

In the full version of the paper, we show that the complexity of many pattern detection and counting problems can be linked to the circuit complexity of homomorphism polynomials for K_k − e. We show that if there are O(n^{f(k)}) size circuits for Hom_{K_k − e}, then:

1. The induced subgraph isomorphism problem for any k-vertex pattern other than K_k, I_k can be solved in O(n^{f(k)}) time. This shows that the induced subgraph isomorphism problem for any such k-vertex pattern has an O(n^{k-1}) time combinatorial algorithm. This also shows that when k ≤ 9, all patterns other than K_k, I_k have faster algorithms.
2. The number of subgraphs isomorphic to any k-vertex pattern can be counted in O(n^{f(k)}) time.
3. If we can count the number of induced subgraphs isomorphic to some k-vertex pattern in O(t(n)) time, then we can count all k-vertex patterns in O(n^{f(k)} + t(n)) time. This implies that for k ≤ 9, improved algorithms for counting any k-vertex pattern will improve algorithms for counting k-cliques.

We also explain why homomorphism polynomials feature prominently in many results related to subgraph isomorphism. We show that for any pattern H, if there exists a family of polynomials f such that N_H ⪯ f, then the size complexity of Hom_H is at most the size complexity of f. Therefore, in a concrete sense, homomorphism polynomials are the best graph pattern polynomial families for subgraph isomorphism problems.

We also use reductions between graph pattern polynomial families similar to Theorem 6.1 to show many lower bounds that seem natural but are not known for general algorithms.

1. For almost all pattern graphs H, the induced subgraph isomorphism problem for H is harder than the subgraph isomorphism problem for H (a randomized reduction is to just randomly delete edges from the graph).
2. For almost all pattern graphs H, the subgraph isomorphism problem for H is easier than subgraph isomorphism problems for any supergraph of H.

Note, however, that we do not know whether these lower bounds imply general algorithmic hardness. But we believe that these results show the limitations of existing methods for solving subgraph isomorphism problems.

References

[1] D. Corneil, Y. Perl, and L. Stewart. A Linear Recognition Algorithm for Cographs. SIAM Journal on Computing, 14(4):926-934, 1985. doi:10.1137/0214065.
[2] Radu Curticapean, Holger Dell, and Dániel Marx. Homomorphisms are a good basis for counting small subgraphs. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 210-223. ACM, 2017. doi:10.1145/3055399.3055502.
[3] Friedrich Eisenbrand and Fabrizio Grandoni. On the complexity of fixed parameter clique and dominating set. Theoretical Computer Science, 326(1):57-67, 2004. doi:10.1016/j.tcs.2004.05.009.
[4] J. Graph Algorithms Appl., 20(1):3-22, 2016.
[5] Peter Floderus, Miroslaw Kowaluk, Andrzej Lingas, and Eva-Marta Lundell. Detecting and Counting Small Pattern Graphs. SIAM J. Discrete Math., 29(3):1322-1339, 2015. doi:10.1137/140978211.
[6] Sci., 605:119-128, 2015. doi:10.1016/j.tcs.2015.09.001.
[7] Fedor V. Fomin, Daniel Lokshtanov, Venkatesh Raman, Saket Saurabh, and B. V. Raghavendra Rao. Faster algorithms for finding and counting subgraphs. J. Comput. Syst. Sci., 78(3):698-706, 2012. doi:10.1016/j.jcss.2011.10.001.
[8] Michael R. Garey and David S. Johnson. Computers and Intractability; A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., New York, NY, USA, 1990.
[9] Alon Itai and Michael Rodeh. Finding a Minimum Circuit in a Graph. SIAM Journal on Computing, 7(4):413-423, 1978. doi:10.1137/0207033.
[10] Ioannis Koutis and Ryan Williams. Limits and applications of group algebras for parameterized problems. ACM Trans. Algorithms, 12(3):31:1-31:18, 2016. doi:10.1145/2885499.
[11] Commentationes Mathematicae Universitatis Carolinae, 026(2):415-419, 1985. URL: http://eudml.org/doc/17394.
[12] Virginia Vassilevska. Efficient Algorithms for Path Problems in Weighted Graphs. PhD thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, August 2008.
[13] Ryan Williams. Finding paths of length k in O*(2^k) time. Inf. Process. Lett., 109(6):315-318, 2009. doi:10.1016/j.ipl.2008.11.004.
[14] Virginia Vassilevska Williams, Joshua R. Wang, Richard Ryan Williams, and Huacheng Yu. Finding Four-Node Subgraphs in Triangle Time. In Piotr Indyk, editor, Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2015, San Diego, CA, USA, January 4-6, 2015, pages 1671-1680. SIAM, 2015. doi:10.1137/1.



Markus Bläser, Balagopal Komarath, Karteek Sreenivasaiah. Graph Pattern Polynomials, LIPICS - Leibniz International Proceedings in Informatics, 2018, 18:1-18:13, DOI: 10.4230/LIPIcs.FSTTCS.2018.18