Log Diameter Rounds Algorithms for 2-Vertex and 2-Edge Connectivity

LIPICS - Leibniz International Proceedings in Informatics, Jul 2019

Many modern parallel systems, such as MapReduce, Hadoop and Spark, can be modeled well by the MPC model. The MPC model captures coarse-grained computation on large data: data is distributed to processors, each of which has a sublinear (in the input data) amount of memory, and the computation alternates between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. This model is stronger than the classical PRAM model, and it is an intriguing question to design algorithms whose running time is smaller than in the PRAM model. In this paper, we study two fundamental problems, 2-edge connectivity and 2-vertex connectivity (biconnectivity). PRAM algorithms which run in O(log n) time have been known for many years. We give algorithms using roughly log diameter rounds in the MPC model. Our main results are, for an n-vertex, m-edge graph of diameter D and bi-diameter D', 1) an O(log D log log_{m/n} n) parallel time 2-edge connectivity algorithm, 2) an O(log D log^2 log_{m/n} n + log D' log log_{m/n} n) parallel time biconnectivity algorithm, where the bi-diameter D' is the largest cycle length over all the vertex pairs in the same biconnected component. Our results are fully scalable, meaning that the memory per processor can be O(n^{delta}) for an arbitrary constant delta > 0, and the total memory used is linear in the problem size. Our 2-edge connectivity algorithm achieves the same parallel time as the connectivity algorithm of [Andoni et al., 2018]. We also show an Omega(log D') conditional lower bound for the biconnectivity problem.


http://drops.dagstuhl.de/opus/volltexte/2019/10590/pdf/LIPIcs-ICALP-2019-14.pdf


Alexandr Andoni, Clifford Stein, Peilin Zhong
Columbia University, New York City, NY, USA

ICALP. Category Track A: Algorithms, Complexity and Games

Keywords and phrases: parallel algorithms; biconnectivity; 2-edge connectivity; the MPC model

1 Introduction

The success of modern parallel and distributed systems such as MapReduce [16, 17], Spark [41], Hadoop [39], Dryad [23], together with the need to solve problems on massive data, is driving the development of new algorithms which are more efficient and scalable in these large-scale systems. An important theoretical problem is to develop models which are good abstractions of these computational frameworks. The Massively Parallel Computation (MPC) model [25, 21, 11, 3, 9, 15, 4] captures the capabilities of these computational systems while keeping the description of the model itself simple. In the MPC model, there are machines (processors), each with $\Theta(N^{\delta})$ local memory, where $N$ denotes the size of the input and $\delta \in (0, 1)$. The computation proceeds in rounds, where each machine can perform unlimited local computation in a round and exchange $O(N^{\delta})$ data at the end of the round.
The parallel time of an algorithm is measured by the total number of computation-communication rounds. The MPC model is a variant of the Bulk Synchronous Parallel (BSP) model [38]. It is also a more powerful model than the PRAM, since any PRAM algorithm can be simulated in the MPC model [25, 21] while some problems can be solved in faster parallel time in the MPC model. For example, computing the XOR of $N$ bits takes $O(1/\delta)$ parallel time in the MPC model but needs near-logarithmic parallel time on the most powerful CRCW PRAM [10].

A natural question to ask is: which problems can be solved in faster parallel time in the MPC model than on a PRAM? This question has been studied by a line of recent papers [25, 19, 29, 3, 1, 6, 22, 15, 7, 14, 13, 32, 20]. Most of these results study graph problems, which are the usual benchmarks of parallel/distributed models. Many graph problems such as graph connectivity [35, 33, 30], graph biconnectivity [37, 36], maximal matching [26], minimum spanning tree [27] and maximal independent set [31, 2] can be solved in the standard logarithmic time in the PRAM model, but these problems have been shown to have a better parallel time in the MPC model.

In addition, we hope to develop fully scalable algorithms for graph problems, i.e., the algorithm should work for any constant $\delta > 0$. The previous literature shows that a graph problem in the MPC model with large local memory size may be much easier than the same problem in the MPC model with a smaller local memory size. In particular, when the local memory size per machine is close to the number of vertices $n$, many graph problems have efficient algorithms. For example, if the local memory size per machine is $n / \log^{O(1)} n$, the connectivity problem [7] and the approximate matching problem [5] can be solved in $O(\log\log n)$ parallel time. If the local memory size per machine is $\Theta(n)$, then the MPC model meets the congested clique model [12].
In this setting, the connectivity problem and the minimum spanning tree problem can be solved in $O(1)$ parallel time [24]. If the local memory size per machine is $n^{1+\Omega(1)}$, many graph problems such as maximal matching, approximate weighted matchings, approximate vertex and edge covers, minimum cuts, and the biconnectivity problem can be solved in $O(1)$ parallel time [29, 8].

The landscape of graph algorithms in the MPC model with small local memory is more nuanced and challenging for algorithm designers. If the local memory size per machine is $n^{1-\Omega(1)}$, then the best connectivity algorithm takes parallel time $O(\log D \log\log n)$, where $D$ is the diameter of the graph [4], and the best approximate maximum matching algorithm takes parallel time $\tilde{O}(\sqrt{\log n})$ [32].

Therefore, the main open question is: which graph problems admit fully scalable MPC algorithms that are faster than the standard logarithmic PRAM algorithms?

Two fundamental problems in graph theory are 2-edge connectivity and 2-vertex connectivity (biconnectivity). In this work, we study these two problems in the MPC model. Consider an $n$-vertex, $m$-edge undirected graph $G$. A bridge of $G$ is an edge whose removal increases the number of connected components of $G$. In the 2-edge connectivity problem, the goal is to find all the bridges of $G$. For any two different edges $e, e'$ of $G$, $e$ and $e'$ are in the same biconnected component (block) of $G$ if and only if there is a simple cycle which contains both $e$ and $e'$. If we define a relation $R$ such that $e R e'$ if and only if $e = e'$ or $e, e'$ are contained in a simple cycle, then $R$ is an equivalence relation [18]. Thus, a biconnected component is the subgraph induced by an equivalence class of $R$. In the biconnectivity problem, the goal is to output all the biconnected components of $G$.
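To make the definition of a bridge concrete, the following is a minimal sequential sketch that tests each edge by removing it and recounting connected components. It is a brute-force illustration of the definition only, not the MPC algorithm of this paper; the function names `count_components` and `brute_force_bridges` and the vertex labeling 0..n-1 are assumptions made for the example.

```python
def count_components(n, edges):
    """Count connected components of a graph on vertices 0..n-1 (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components


def brute_force_bridges(n, edges):
    """Return the edges whose removal increases the number of components."""
    base = count_components(n, edges)
    bridges = []
    for i, e in enumerate(edges):
        rest = edges[:i] + edges[i + 1:]  # the graph without edge e
        if count_components(n, rest) > base:
            bridges.append(e)
    return bridges


# A triangle 0-1-2 with a pendant edge (2, 3): only the pendant edge is a bridge.
example_edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
example_bridges = brute_force_bridges(4, example_edges)
```

In this example the three triangle edges lie on a common simple cycle, so they form one biconnected component, while the pendant edge forms a block of its own.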
We propose faster, fully scalable algorithms for both the 2-edge connectivity problem and the biconnectivity problem by parameterizing the running time as a function of the diameter and the bi-diameter of the graph. The diameter $D$ of $G$ is the largest diameter of its connected components. The definition of bi-diameter is a natural generalization of the definition of diameter. If vertices $u, v$ are in the same biconnected component, then the cycle length of $(u, v)$ is defined as the minimum length of a simple cycle which contains both $u$ and $v$. The bi-diameter $D'$ of $G$ is the largest cycle length over all the vertex pairs $(u, v)$ where both $u$ and $v$ are in the same biconnected component. Our main results are 1) a fully scalable $O(\log D \log\log_{m/n} n)$ parallel time 2-edge connectivity algorithm, 2) a fully scalable $O(\log D \log^2\log_{m/n} n + \log D' \log\log_{m/n} n)$ parallel time biconnectivity algorithm. Our 2-edge connectivity algorithm achieves the same parallel time as the connectivity algorithm of [4]. We also show an $\Omega(\log D')$ conditional lower bound for the biconnectivity problem.

1.1 The Model

Our model of computation is the Massively Parallel Computation (MPC) model [25, 21, 11]. Consider two non-negative parameters $\gamma \geq 0$, $\delta > 0$. In the $(\gamma, \delta)$-MPC model [4], there are $p$ machines (processors), each with local memory size $s$, where $p \cdot s = \Theta(N^{1+\gamma})$, $s = \Theta(N^{\delta})$ and $N$ denotes the size of the input data. Thus, the space per machine is sublinear in $N$, and the total space is only an $O(N^{\gamma})$ factor more than the input size. In particular, if $\gamma = 0$, the total space available in the system is linear in the input size $N$. The space size is measured in words, each containing $\Theta(\log(s \cdot p))$ bits. Before the computation starts, the input data is distributed on $\Theta(N/s)$ input machines. The computation proceeds in rounds. In each round, each machine can perform local computation on its local data, and send messages to other machines at the end of the round.
In a round, the total size of messages sent/received by a machine should be bounded by its local memory size $s = \Theta(N^{\delta})$. For example, a machine can send $s$ messages of size 1 to $s$ machines, or send one message of size $s$ to 1 machine, in a single round. However, it cannot broadcast a size $s$ message to every machine. In the next round, each machine only holds the received messages in its local memory. At the end of the computation, the output data is distributed on the output machines. An algorithm in this model is called a $(\gamma, \delta)$-MPC algorithm. The parallel time of an algorithm is the total number of rounds needed to finish its computation. In this paper, we consider $\delta$ an arbitrary constant in $(0, 1)$.

1.2 Our Results

Our main results are efficient MPC algorithms for the 2-edge connectivity and biconnectivity problems. In our algorithms, one important subroutine is computing the Depth-First-Search (DFS) sequence [4], which is a variant of the Euler tour representation proposed by [37, 36] in 1984. We show how to efficiently compute the DFS sequence in the MPC model with linear total space. Conditioned on the hardness of the connectivity problem in the MPC model, we prove a hardness result on the biconnectivity problem.

For 2-edge connectivity and biconnectivity, the input is an undirected graph $G = (V, E)$ with $n = |V|$ vertices and $m = |E|$ edges. $N = n + m$ denotes the size of the representation of $G$, $D$ denotes the diameter of $G$, and $D'$ denotes the bi-diameter of $G$. We state our results in the following.

Biconnectivity. In the biconnectivity problem, we want to find all the biconnected components (blocks) of the input graph $G$. Since the biconnected components of $G$ define a partition of $E$, we just need to color each edge, i.e., at the end of the computation, $\forall e \in E$, there is a unique tuple $(x, c)$ with $x = e$ stored on an output machine, where $c$ is called the color of $e$, such that the edges $e_1, e_2$ are in the same biconnected component if and only if they have the same color.
Theorem 1 (Biconnectivity in MPC). For any $\gamma \in [0, 2]$ and any constant $\delta \in (0, 1)$, there is a randomized $(\gamma, \delta)$-MPC algorithm which outputs all the biconnected components of the graph $G$ in $O\left(\log D \cdot \log^2\log_{N^{1+\gamma}/n} n + \log D' \cdot \log\log_{N^{1+\gamma}/n} n\right)$ parallel time. The success probability is at least 0.95. If the algorithm fails, then it returns FAIL.

The worst case is when the input graph is sparse and the total space available is linear in the input size, i.e., $N = n + m = O(n)$ and $\gamma = 0$. In this case, the parallel running time of our algorithm is $O(\log D \cdot \log^2\log n + \log D' \cdot \log\log n)$. If the graph is slightly denser ($m = n^{1+c}$ for some constant $c > 0$), or the total space is slightly larger ($\gamma > 0$ is a constant), then we obtain $O(\log D + \log D')$ time.

A cut vertex (articulation point) in the graph $G$ is a vertex whose removal increases the number of connected components of $G$. Since a vertex $v$ is a cut vertex if and only if there are two edges $e_1, e_2$ which share the endpoint $v$ and $e_1, e_2$ are not in the same biconnected component, our algorithm can also find all the cut vertices of $G$.

2-Edge connectivity. In the 2-edge connectivity problem, we want to output all the bridges of the input graph $G$. Since an edge is a bridge if and only if each of its endpoints is either a cut vertex or a vertex with degree 1, the 2-edge connectivity problem should be easier than the biconnectivity problem. We show how to solve 2-edge connectivity in the same parallel time as the algorithm proposed by [4] for solving connectivity.

Theorem 2 (2-Edge connectivity in MPC). For any $\gamma \in [0, 2]$ and any constant $\delta \in (0, 1)$, there is a randomized $(\gamma, \delta)$-MPC algorithm which outputs all the bridges of the graph $G$ in $O\left(\log D \cdot \log\log_{N^{1+\gamma}/n} n\right)$ parallel time. The success probability is at least 0.97. If the algorithm fails, then it returns FAIL.

DFS sequence. A rooted tree with a vertex set $V$ can be represented by $n = |V|$ pairs $(v_1, \mathrm{par}(v_1)), (v_2, \mathrm{par}(v_2)), \ldots, (v_n, \mathrm{par}(v_n))$, where $\mathrm{par} : V \to V$ is a set of parent pointers, i.e., for a non-root vertex $v$, $\mathrm{par}(v)$ denotes the parent of $v$, and for the root vertex $v$, $\mathrm{par}(v) = v$. We show an algorithm which can compute the DFS sequence (Definition 6) of the rooted tree in the MPC model with linear total space.

Theorem 3 (DFS sequence of a tree in MPC). Given a rooted tree represented by a set of parent pointers $\mathrm{par} : V \to V$, there is a randomized $(0, \delta)$-MPC algorithm which outputs the DFS sequence in $O(\log D)$ parallel time, where $\delta \in (0, 1)$ is an arbitrary constant and $D$ is the depth of the tree. The success probability is at least 0.99. If the algorithm fails, then it returns FAIL.

Conditional hardness for biconnectivity. A conjectured hardness for the connectivity problem is the one cycle vs. two cycles conjecture: for any $\gamma \geq 0$ and any constant $\delta \in (0, 1)$, any $(\gamma, \delta)$-MPC algorithm requires $\Omega(\log n)$ parallel time to determine whether the input $n$-vertex graph is a single cycle or contains two disjoint cycles of length $n/2$. This conjectured hardness result is widely used in the MPC literature [25, 11, 28, 34, 40]. Under this conjecture, we show that $\Omega(\log D')$ parallel time is necessary for the biconnectivity problem, and this is true even when $D = O(1)$, i.e., the diameter of the graph is a constant.

Theorem 4 (Hardness of biconnectivity in MPC). For any $\gamma \geq 0$ and any constant $\delta \in (0, 1)$, unless there is a $(\gamma, \delta)$-MPC algorithm which can distinguish the following two instances: 1) a single cycle with $n$ vertices, 2) two disjoint cycles each containing $n/2$ vertices, in $o(\log n)$ parallel time, any $(\gamma, \delta)$-MPC algorithm requires $\Omega(\log D')$ parallel time for testing whether a graph $G$ with a constant diameter is biconnected.

1.3 Our Techniques

Biconnectivity. At a high level, our biconnectivity algorithm is based on a framework proposed by [36].
The main idea is to construct a new graph $G'$ and reduce the problem of finding biconnected components of $G$ to the problem of finding connected components of $G'$. At first glance, this should be efficiently solvable by the connectivity algorithm of [4]. However, there are two main issues: 1) since the parallel time of the MPC connectivity algorithm of [4] depends on the diameter of the input graph, we need to make the diameter of $G'$ small, 2) we need to construct $G'$ efficiently. Let us first consider the first issue; we will discuss the second issue later.

We give an analysis of the diameter of $G' = (V', E')$ constructed by [36]. Without loss of generality, we can suppose the input $G = (V, E)$ is connected. Each vertex in $G'$ corresponds to an edge of $G$. Let $T$ be an arbitrary spanning tree of $G$ with depth $d$. Each non-tree edge $e$ defines a simple cycle $C_e$ which contains the edge $e$ and the unique path between the endpoints of $e$ in the tree $T$. Thus, the length of $C_e$ is at most $2d + 1$. If such a cycle contains two tree edges $(u, v), (v, w)$, then the vertices $(u, v)$ and $(v, w)$ are connected in $G'$. For each non-tree edge $e$, we connect the vertex $e$ to the vertex $e'$ in $G'$, where $e'$ is an arbitrary tree edge in the cycle $C_e$. By the construction of $G'$, any $e, e'$ from the same connected component of $G'$ are in the same biconnected component of $G$. Now consider two arbitrary edges $e, e'$ in the same biconnected component of $G$. There must be a simple cycle $C$ in $G$ which contains both edges $e, e'$. Since the simple cycles defined by the non-tree edges form a cycle basis of $G$ [18], the edge set of $C$ can be represented as the XOR sum of the edge sets of $k$ basis cycles $C_1, C_2, \ldots, C_k$, where $C_i$ is a simple cycle defined by a non-tree edge $e_i$ on the cycle $C$. Here $k$ is upper bounded by the bi-diameter of $G$. Furthermore, we can assume $C_i$ intersects $C_{i+1}$. Hence there is a path between $e$ and $e'$ in $G'$, and the length of the path is at most $\sum_{i=1}^{k} |C_i| \leq O(k \cdot d)$.
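The path-length bound can be written out as a short chain of inequalities. This only restates the argument of the preceding paragraph, using the bound of $2d + 1$ on the length of each basis cycle and assuming $k, d \geq 2$ for the final logarithmic form.

```latex
\[
  \mathrm{dist}_{G'}(e, e')
  \;\le\; \sum_{i=1}^{k} |C_i|
  \;\le\; k\,(2d + 1)
  \;=\; O(k \cdot d),
  \qquad\text{hence}\qquad
  \log\bigl(\mathrm{diam}(G')\bigr) \;=\; O(\log k + \log d).
\]
```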
So, the diameter of $G'$ is upper bounded by $O(k \cdot d)$. Thus, according to [4], we can find the connected components of $G'$ in roughly $\log k + \log d$ parallel time, where $d$ and $k$ are upper bounded by the diameter and the bi-diameter of $G$ respectively.

Now let us consider how to construct $G'$ efficiently. The bottleneck is to determine whether the tree edges $(u, v), (v, w)$ should be connected in $G'$ or not. Suppose $w$ is the parent of $v$ and $v$ is the parent of $u$. The vertex $(u, v)$ should connect to the vertex $(v, w)$ in $G'$ if and only if there is a non-tree edge that connects a vertex $x$ in the subtree of $u$ and a vertex $y$ outside the subtree of $v$. For each vertex $x$, let $\mathrm{lev}(x)$ be the minimum depth of the least common ancestor (LCA) of $(x, y)$ over all the non-tree edges $(x, y)$. Then $(u, v)$ should be connected to $(v, w)$ in $G'$ if and only if there is a vertex $x$ in the subtree of $u$ in $G$ such that $\mathrm{lev}(x)$ is smaller than the depth of $v$. Since the vertices in a subtree appear consecutively in the DFS sequence, this question can be answered by range queries over the DFS sequence. Next, we discuss how to compute the DFS sequence of a tree.

DFS sequence. The DFS sequence of a tree is a variant of the Euler tour representation of the tree. For an $n$-vertex tree $T$, [36] gives an $O(\log n)$ parallel time PRAM algorithm for the Euler tour representation of $T$. However, since their construction method destroys the tree structure, it is hard to get a faster MPC algorithm based on this framework. Instead, we follow the leaf sampling framework proposed by [4]. Although the DFS sequence algorithm proposed by [4] takes $O(\log d)$ time, where $d$ is the depth of $T$, it needs $\Theta(n \log d)$ total space. The bottleneck is the subroutine which needs to solve the least common ancestors problem and generate multiple path sequences. The previous algorithm uses the doubling algorithm for this subroutine, i.e., for each vertex $v$, it stores the $2^i$-th ancestor of $v$ for every $i \in [\lceil \log d \rceil]$. This is the reason why [4] cannot achieve linear total space. We show how to compress the tree $T$ into a new tree $T'$ which contains at most $n / \lceil \log d \rceil$ vertices. We argue that applying the doubling algorithm on $T'$ is sufficient for us to find the DFS sequence of $T$.

2-Edge connectivity. Without loss of generality, we can assume the input graph $G$ is connected. Consider a rooted spanning tree $T$ and an edge $e = (u, v)$ in $G$. Suppose the depth of $u$ is at least the depth of $v$ in $T$, i.e., $v$ cannot be a child of $u$. The edge $e$ is not a bridge if and only if either $e$ is a non-tree edge or there is a non-tree edge $(x, y)$ connecting the subtree of $u$ and a vertex outside the subtree of $u$. Similarly, the second case can be decided by range queries over the DFS sequence of $T$.

Conditional hardness for biconnectivity. We want to reduce the connectivity problem to the biconnectivity problem. For an undirected graph $G$, if we add an additional vertex $v^*$ and connect $v^*$ to every vertex of $G$, then the diameter of the resulting graph $G'$ is at most 2 and each biconnected component of $G'$ corresponds to a connected component of $G$. Furthermore, the bi-diameter of $G'$ is upper bounded by the diameter of $G$ plus 2. Therefore, if the parallel time of an algorithm $A'$ for finding the biconnected components of $G'$ depends on the bi-diameter of $G'$, there exists an algorithm $A$ which can find all the connected components of $G$ in parallel time which has the same dependence on the diameter of $G$.

1.4 A Roadmap

Section 2 introduces the notation and some useful definitions. Section 3 describes the offline algorithms for 2-edge connectivity and biconnectivity. It also includes some crucial properties of the algorithms. In Section 4, we show a linear space offline algorithm to find the DFS sequence of a tree. All of these offline algorithms can be implemented in the MPC model efficiently.
Section 5 contains the conditional hardness result for the biconnectivity problem in the MPC model. For the MPC implementations and all the missing technical proofs, we refer readers to the full version of the paper.

2 Preliminaries

2.1 Notation

We follow the notation of [4]. $[n]$ denotes the set of integers $\{1, 2, \ldots, n\}$.

Diameter and bi-diameter. Consider an undirected graph $G$ with a vertex set $V$ and an edge set $E$. For any two vertices $u, v$, we use $\mathrm{dist}_G(u, v)$ to denote the distance between $u$ and $v$ in graph $G$. If $u, v$ are not in the same (connected) component of $G$, then $\mathrm{dist}_G(u, v) = \infty$. The diameter $\mathrm{diam}(G)$ of $G$ is the largest diameter of its connected components, i.e., $\mathrm{diam}(G) = \max_{u, v \in V : \mathrm{dist}_G(u, v) \neq \infty} \mathrm{dist}_G(u, v)$. $(v_1, v_2, \ldots, v_k) \in V^k$ is a cycle of length $k - 1$ if $v_1 = v_k$ and $\forall i \in [k - 1]$, $(v_i, v_{i+1}) \in E$. We say a cycle $(v_1, v_2, \ldots, v_k)$ is simple if $k \geq 4$ and each vertex only appears once in the cycle except $v_1$ ($v_k$). Consider two different vertices $u, v \in V$. We use $\mathrm{cyclen}_G(u, v)$ to denote the minimum length of a simple cycle which contains both vertices $u$ and $v$. If there is no simple cycle which contains both $u$ and $v$, $\mathrm{cyclen}_G(u, v) = \infty$. $\mathrm{cyclen}_G(u, u)$ is defined as 0. The bi-diameter of $G$, $\text{bi-diam}(G)$, is defined as $\max_{u, v \in V : \mathrm{cyclen}_G(u, v) \neq \infty} \mathrm{cyclen}_G(u, v)$.

Representation of a rooted forest. Let $V$ denote a set of vertices. We represent a rooted forest in the same manner as [4]. Consider a mapping $\mathrm{par} : V \to V$. For $i \in \mathbb{N}_{>0}$ and $v \in V$, we define $\mathrm{par}^{(i)}(v)$ as $\mathrm{par}(\mathrm{par}^{(i-1)}(v))$, and $\mathrm{par}^{(0)}(v)$ is defined as $v$ itself. If $\forall v \in V$, $\exists i > 0$ such that $\mathrm{par}^{(i)}(v) = \mathrm{par}^{(i+1)}(v)$, then we call $\mathrm{par}$ a set of parent pointers on $V$. For $v \in V$, if $\mathrm{par}(v) = v$, then we say $v$ is a root of $\mathrm{par}$. Notice that $\mathrm{par}$ can represent a rooted forest, thus $\mathrm{par}$ can have more than one root. The depth of $v \in V$, $\mathrm{dep}_{\mathrm{par}}(v)$, is the smallest $i \in \mathbb{N}$ such that $\mathrm{par}^{(i)}(v)$ is the same as $\mathrm{par}^{(i+1)}(v)$. The root of $v \in V$, $\mathrm{par}^{(\infty)}(v)$, is defined as $\mathrm{par}^{(\mathrm{dep}_{\mathrm{par}}(v))}(v)$.
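The parent-pointer representation can be illustrated with a small sequential sketch. It stores the mapping as a Python dict and follows pointers one step at a time; the function names `depth`, `root` and `is_parent_pointers` are hypothetical helpers for this example and not part of the paper's MPC implementation.

```python
def depth(par, v):
    """Smallest i such that par^(i)(v) == par^(i+1)(v)."""
    i = 0
    while par[v] != v:
        v = par[v]
        i += 1
    return i


def root(par, v):
    """The fixed point reached by following parent pointers from v."""
    while par[v] != v:
        v = par[v]
    return v


def is_parent_pointers(par):
    """Check that every vertex reaches a fixed point, i.e. there is no cycle of length > 1."""
    n = len(par)
    for start in par:
        v, steps = start, 0
        while par[v] != v:
            v = par[v]
            steps += 1
            if steps > n:  # a simple path to a root has fewer than n steps
                return False
    return True


# A forest with two roots (1 and 4): 3 -> 2 -> 1, and the isolated root 4.
example_par = {1: 1, 2: 1, 3: 2, 4: 4}
```

Here vertex 3 has depth 2 and root 1, while vertex 4 is its own root with depth 0.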
The depth of $\mathrm{par}$, $\mathrm{dep}(\mathrm{par})$, is defined as $\max_{v \in V} \mathrm{dep}_{\mathrm{par}}(v)$.

Ancestor and path. For two vertices $u, v \in V$, if $\exists i \in \mathbb{N}$ such that $u = \mathrm{par}^{(i)}(v)$, then $u$ is an ancestor of $v$ (in $\mathrm{par}$). If $u$ is an ancestor of $v$, then the path $P(v, u)$ (in $\mathrm{par}$) from $v$ to $u$ is the sequence $(v, \mathrm{par}(v), \mathrm{par}^{(2)}(v), \ldots, u)$, and the path $P(u, v)$ is the reverse of $P(v, u)$, i.e., $P(u, v) = (u, \ldots, \mathrm{par}^{(2)}(v), \mathrm{par}(v), v)$. If an ancestor $u$ of $v$ is also an ancestor of $w$, then $u$ is a common ancestor of $(v, w)$. Furthermore, if a common ancestor $u$ of $(v, w)$ satisfies $\mathrm{dep}_{\mathrm{par}}(u) \geq \mathrm{dep}_{\mathrm{par}}(x)$ for any common ancestor $x$ of $(v, w)$, then $u$ is the lowest common ancestor (LCA) of $(v, w)$.

Children and leaves. For any non-root vertex $u$ of $\mathrm{par}$, $u$ is a child of $\mathrm{par}(u)$. For any vertex $v \in V$, $\mathrm{child}_{\mathrm{par}}(v)$ denotes the set of all the children of $v$, i.e., $\mathrm{child}_{\mathrm{par}}(v) = \{u \in V \mid u \neq v, \mathrm{par}(u) = v\}$. If $u$ is the $k$th smallest vertex in the set $\mathrm{child}_{\mathrm{par}}(v)$, then we define $\mathrm{rank}_{\mathrm{par}}(u) = k$, or in other words, $u$ is the $k$th child of $v$. If $v$ is a root vertex of $\mathrm{par}$, then $\mathrm{rank}_{\mathrm{par}}(v)$ is defined as 1. $\mathrm{child}_{\mathrm{par}}(v, k)$ denotes the $k$th child of $v$. For simplicity, if $\mathrm{par}$ is clear from the context, we just use $\mathrm{child}(v)$, $\mathrm{rank}(v)$ and $\mathrm{child}(v, k)$ to denote $\mathrm{child}_{\mathrm{par}}(v)$, $\mathrm{rank}_{\mathrm{par}}(v)$ and $\mathrm{child}_{\mathrm{par}}(v, k)$. If $\mathrm{child}(v) = \emptyset$, then $v$ is a leaf of $\mathrm{par}$. We denote by $\mathrm{leaves}(\mathrm{par})$ the set of all the leaves of $\mathrm{par}$, i.e., $\mathrm{leaves}(\mathrm{par}) = \{v \mid \mathrm{child}(v) = \emptyset\}$.

2.2 Depth-First-Search Sequence

The Euler tour representation of a tree was proposed by [37, 36]. It is a crucial building block in many graph algorithms, including biconnectivity algorithms. The Depth-First-Search (DFS) sequence [4] of a rooted tree is a variant of the Euler tour representation. Let us first introduce some relevant concepts of the DFS sequence.

Definition 5 (Subtree [4]). Consider a set of parent pointers $\mathrm{par} : V \to V$ on a vertex set $V$. Let $v$ be a vertex in $V$, and let $V' = \{u \in V \mid v \text{ is an ancestor of } u\}$. Let $\mathrm{par}' : V' \to V'$ be a set of parent pointers on $V'$. If $\forall u \in V' \setminus \{v\}$, $\mathrm{par}'(u) = \mathrm{par}(u)$ and $\mathrm{par}'(v) = v$, then $\mathrm{par}'$ is a subtree of $v$ in $\mathrm{par}$. For $u \in V'$, we say $u$ is in the subtree of $v$.

The definition of the DFS sequence is the following:

Definition 6 (DFS sequence [4]). Consider a set of parent pointers $\mathrm{par} : V \to V$ on a vertex set $V$. Let $v$ be a vertex in $V$. If $v$ is a leaf in $\mathrm{par}$, then the DFS sequence of the subtree of $v$ is $(v)$. Otherwise, the DFS sequence of the subtree of $v$ is defined recursively as $(v, a_{1,1}, a_{1,2}, \ldots, a_{1,n_1}, v, a_{2,1}, a_{2,2}, \ldots, a_{2,n_2}, v, \ldots, a_{k,1}, a_{k,2}, \ldots, a_{k,n_k}, v)$, where $k = |\mathrm{child}(v)|$ and $\forall i \in [k]$, $(a_{i,1}, a_{i,2}, \ldots, a_{i,n_i})$ is the DFS sequence of the subtree of $\mathrm{child}(v, i)$, i.e., the $i$th child of $v$.

If $\mathrm{par} : V \to V$ has a unique root $v$, then we define the DFS sequence of $\mathrm{par}$ as the DFS sequence of the subtree of $v$. By the definition of the DFS sequence, for any two consecutive elements $a_i$ and $a_{i+1}$ in the sequence, $a_i$ is either the parent of $a_{i+1}$ or a child of $a_{i+1}$. Furthermore, for any vertex $v$, if both elements $a_i$ and $a_j$ ($i < j$) in the DFS sequence $A$ are $v$, any element $a_k$ between $a_i$ and $a_j$ (i.e., $i \leq k \leq j$) is a vertex in the subtree of $v$.

3 2-Edge Connectivity and Biconnectivity

Consider a connected undirected graph $G$ with a vertex set $V$ and an edge set $E$. In the 2-edge connectivity problem, the goal is to find all the bridges of $G$, where an edge $e \in E$ is called a bridge if its removal disconnects $G$. In the biconnectivity problem, the goal is to partition the edges into several groups $E_1, E_2, \ldots, E_k$, i.e., $E = \bigcup_{i=1}^{k} E_i$ and $\forall i \neq j$, $E_i \cap E_j = \emptyset$, such that $\forall e \neq e' \in E$, $e$ and $e'$ are in the same group if and only if there is a simple cycle in $G$ which contains both $e$ and $e'$. A subgraph induced by an edge group $E_i$ is called a biconnected component (block). In other words, the goal of the biconnectivity problem is to find all the blocks of $G$.
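Definition 6 (DFS sequence) can be made concrete with a short sequential sketch. It follows the recursive definition directly, using an explicit stack instead of recursion, and orders children by vertex label as in the definition of rank. This is an illustration of the definition only, not the leaf sampling algorithm; the function name `dfs_sequence` and the dict representation of parent pointers are assumptions made for the example.

```python
def dfs_sequence(par):
    """DFS sequence of a rooted tree given as a dict of parent pointers with a unique root."""
    children = {v: [] for v in par}
    root = None
    for v, p in par.items():
        if p == v:
            root = v
        else:
            children[p].append(v)
    for v in children:
        children[v].sort()  # the k-th smallest child is the k-th child

    seq = [root]
    # Stack entries are (vertex, index of the next child to visit).
    stack = [(root, 0)]
    while stack:
        v, i = stack.pop()
        if i < len(children[v]):
            stack.append((v, i + 1))
            c = children[v][i]
            seq.append(c)       # descend into the i-th child
            stack.append((c, 0))
        elif stack:
            # The subtree of v is finished, so its parent appears again.
            seq.append(stack[-1][0])
    return seq


# Root 1 with children 2 and 3; vertex 2 has the single child 4.
example_tree = {1: 1, 2: 1, 3: 1, 4: 2}
example_seq = dfs_sequence(example_tree)
```

For this tree the sequence is (1, 2, 4, 2, 1, 3, 1): every internal vertex with k children appears k + 1 times, so a tree on n vertices yields a sequence of length 2n - 1, and the occurrences of a subtree's vertices are consecutive.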
In this section, we describe the algorithms for both the 2-edge connectivity problem and the biconnectivity problem in the offline setting.

3.1 2-Edge Connectivity

The 2-edge connectivity problem is much simpler than the biconnectivity problem. We first compute a spanning tree of the graph. Only a tree edge can be a bridge. Then for any non-root vertex $v$, if there is no non-tree edge which crosses between the subtree of $v$ and the outside of the subtree of $v$, then the tree edge which connects $v$ to its parent is a bridge.

Algorithm 1 2-Edge Connectivity Algorithm.
Input: A connected undirected graph $G = (V, E)$.
Output: A subset of edges $B \subseteq E$.
Finding bridges (Bridges($G = (V, E)$)):
1. Compute a rooted spanning tree of $G$. The spanning tree is represented by a set of parent pointers $\mathrm{par} : V \to V$.
2. Compute $\mathrm{lev} : V \to \mathbb{Z}_{\geq 0}$: for each $v \in V$, $\mathrm{lev}(v) \leftarrow \min\left(\mathrm{dep}_{\mathrm{par}}(v), \min_{w \in V \setminus \{\mathrm{par}(v)\} : (v, w) \in E} \mathrm{dep}_{\mathrm{par}}(\text{the LCA of } (v, w))\right)$.
3. Compute the DFS sequence $A$ of $\mathrm{par}$.
4. Initialize $B \leftarrow \emptyset$. For each non-root vertex $v$, let $a_i, a_j$ be the first and the last appearance of $v$ in $A$ respectively. If $\min_{k : i \leq k \leq j} \mathrm{lev}(a_k) \geq \mathrm{dep}_{\mathrm{par}}(v)$, $B \leftarrow B \cup \{(v, \mathrm{par}(v))\}$. Output $B$.

Lemma 7 (2-Edge connectivity). Consider an undirected graph $G = (V, E)$. Let $B$ be the output of Bridges($G$). Then $B$ is the set of all the bridges of $G$.

3.2 Biconnectivity

In this section, we show a biconnectivity algorithm. It is a modification of the algorithm proposed by [36]. The high level idea is to construct a new graph $G'$ based on the input graph $G$, and reduce the biconnectivity problem of $G$ to the connectivity problem of $G'$. Since the running time of the connectivity algorithm [4] depends on the diameter of the graph, we also give an analysis of the diameter of the graph $G'$.

Algorithm 2 Biconnectivity Algorithm.
Input: A connected undirected graph $G = (V, E)$.
Output: A coloring $\mathrm{col} : E \to V$ of the edges.
Finding blocks (Biconn($G = (V, E)$)):
1. Compute a rooted spanning tree of $G$.
The spanning tree is represented by a set of parent pointers $\mathrm{par} : V \to V$.
2. Compute $\mathrm{lev} : V \to \mathbb{Z}_{\geq 0}$: for each $v \in V$, $\mathrm{lev}(v) \leftarrow \min\left(\mathrm{dep}_{\mathrm{par}}(v), \min_{w \in V \setminus \{\mathrm{par}(v)\} : (v, w) \in E} \mathrm{dep}_{\mathrm{par}}(\text{the LCA of } (v, w))\right)$.
3. Compute the DFS sequence $A$ of $\mathrm{par}$.
4. Let $r$ be the root of $\mathrm{par}$. Initialize $V' \leftarrow V \setminus \{r\}$, $E' \leftarrow \emptyset$.
5. For each $v \in V'$, let $a_i, a_j$ be the first and the last appearance of $v$ in $A$ respectively. If $\min_{k \in \{i, i+1, \ldots, j\}} \mathrm{lev}(a_k) < \mathrm{dep}_{\mathrm{par}}(\mathrm{par}(v))$, $E' \leftarrow E' \cup \{(v, \mathrm{par}(v))\}$.
6. For each $(u, v) \in E$, if neither $u$ nor $v$ is the LCA of $(u, v)$ in $\mathrm{par}$, $E' \leftarrow E' \cup \{(u, v)\}$.
7. Compute the connected components of $G' = (V', E')$. Let $\mathrm{col}' : V' \to V'$ be the coloring of the vertices in $V'$ such that $\forall u', v' \in V'$, $u', v'$ are in the same connected component in $G'$ $\Leftrightarrow$ $\mathrm{col}'(u') = \mathrm{col}'(v')$.
8. Initialize $\mathrm{col} : E \to V$. For each $e = (u, v) \in E$, if $\mathrm{dep}_{\mathrm{par}}(u) \geq \mathrm{dep}_{\mathrm{par}}(v)$, set $\mathrm{col}(e) \leftarrow \mathrm{col}'(u)$; otherwise, set $\mathrm{col}(e) \leftarrow \mathrm{col}'(v)$. Output $\mathrm{col} : E \to V$.

Lemma 8 (Biconnectivity). Consider an undirected graph $G = (V, E)$. Let $\mathrm{col} : E \to V$ be the output of Biconn($G$). Then $\forall e, e' \in E$, $e \neq e'$, $\mathrm{col}$ satisfies $\mathrm{col}(e) = \mathrm{col}(e')$ $\Leftrightarrow$ there is a simple cycle in $G$ which contains both $e$ and $e'$. Furthermore, the diameter of the graph $G'$ constructed by Biconn($G$) is at most $O(\mathrm{dep}(\mathrm{par}) \cdot \text{bi-diam}(G))$, the number of vertices of $G'$ is at most $|V|$, and the number of edges of $G'$ is at most $|E|$.

Algorithm 3 Leaf Sampling Algorithm for DFS Sequence.
Pre-determined: A threshold value $s$. //$s$ will be the local memory size in the MPC model.
Input: A rooted tree represented by a set of parent pointers $\mathrm{par} : V \to V$ on a set $V$ of $n$ vertices (i.e., $\mathrm{par}$ has a unique root $r$).
Output: The DFS sequence of the rooted tree represented by $\mathrm{par}$.
Leaf sampling algorithm (LeafSampling($s$, $\mathrm{par} : V \to V$)):
1. If $n \leq s$, return the DFS sequence of $\mathrm{par}$ directly.
2. Set $t \leftarrow \Theta(s^{1/3} \log n)$, $L \leftarrow \mathrm{leaves}(\mathrm{par})$.
3. Each $v \in L$ is independently chosen with probability $p = \min(1, t/|L|)$, and let $S = \{l_1, l_2, \ldots, l_k\}$ be the set of samples. If $|S|^2 > s$, output FAIL.
4. For every pair of sampled leaves $x, y \in S$ with $x \neq y$, find the least common ancestor $p_{x,y}$ of $(x, y)$, and set $p_{xy,x}, p_{xy,y}$ to be the two children of $p_{x,y}$ such that $p_{xy,x}$ is an ancestor of $x$ and $p_{xy,y}$ is an ancestor of $y$.
5. Sort $l_1, l_2, \ldots, l_k \in S$ such that $\forall i < j \in [k]$, $\mathrm{rank}(p_{l_i l_j, l_i}) < \mathrm{rank}(p_{l_i l_j, l_j})$.
6. Find the paths $A'_1 = P(r, l_1)$, $A'_2 = P(\mathrm{par}(l_1), p_{l_1, l_2})$, $A'_3 = P(p_{l_1 l_2, l_2}, l_2)$, $\ldots$, $A'_{2k-2} = P(\mathrm{par}(l_{k-1}), p_{l_{k-1}, l_k})$, $A'_{2k-1} = P(p_{l_{k-1} l_k, l_k}, l_k)$, $A'_{2k} = P(\mathrm{par}(l_k), r)$, i.e., the paths: $r \to l_1 \to$ the LCA of $(l_1, l_2)$ $\to l_2 \to \cdots \to l_{k-1} \to$ the LCA of $(l_{k-1}, l_k)$ $\to l_k \to r$.
7. Set $A' \leftarrow A'_1 A'_2 \cdots A'_{2k}$, i.e., $A'$ is the concatenation of $A'_1, A'_2, \ldots, A'_{2k}$.
8. For each element $a'_i$ in the $i$th ($i > 1$) position of the sequence $A'$, if the vertex $a'_i$ is a leaf, keep $a'_i$ as a single copy. Otherwise,
- if $a'_{i-1} = \mathrm{par}(a'_i)$, i.e., $i$ is the first position at which the vertex $a'_i$ appears in $A'$, split $a'_i$ into $\mathrm{rank}(a'_{i+1})$ copies; //$a'_{i+1}$ is a child of $a'_i$.
- if $a'_{i-1}, a'_{i+1} \in \mathrm{child}(a'_i)$, split $a'_i$ into $\mathrm{rank}(a'_{i+1}) - \mathrm{rank}(a'_{i-1})$ copies;
- if $a'_{i+1} = \mathrm{par}(a'_i)$, i.e., $i$ is the last position at which the vertex $a'_i$ appears in $A'$, split $a'_i$ into $|\mathrm{child}(a'_i)| - \mathrm{rank}(a'_{i-1})$ copies. //$a'_{i-1}$ is a child of $a'_i$.
Let $A''$ be the resulting sequence.
9. For each $v \in V$, if $\mathrm{par}(v)$ appears in $A''$ but $v$ does not appear in $A''$, recursively find the DFS sequence of the subtree of $v$, and insert this sequence at the position after the $\mathrm{rank}(v)$th appearance of $\mathrm{par}(v)$ in $A''$. Output the final result sequence $A$.

4 An Offline DFS Sequence Algorithm in Linear Space

In Section 4.1, we review an algorithmic framework proposed by [4] for the DFS sequence. In Sections 4.2, 4.3 and 4.4, we discuss the subroutines needed for our DFS sequence algorithm in the offline setting.

4.1 DFS Sequence via Leaf Sampling

In the following, we review the leaf sampling algorithmic framework proposed by [4] for finding the DFS sequence of a rooted tree.

Theorem 9 (Leaf sampling algorithm [4]).
Consider a set of parent pointers $\mathrm{par} : V \to V$ on a set $V$ of $n$ vertices. Suppose $\mathrm{par}$ has a unique root. For any $\gamma \ge 0$ and any constant $\delta \in (0, 1)$, if both step 4 and step 6 in LeafSampling($n^{\delta}$, $\mathrm{par}$) can be implemented in the $(\gamma, \delta)$-MPC model with $O(\log(\mathrm{dep}(\mathrm{par})))$ parallel time, then the leaf sampling algorithm with parameter $s = n^{\delta}$ on input $\mathrm{par} : V \to V$ can be implemented in the $(\gamma, \delta)$-MPC model. Furthermore, with probability at least 0.99, LeafSampling($n^{\delta}$, $\mathrm{par}$) outputs the DFS sequence of $\mathrm{par}$ in $O(\log(\mathrm{dep}(\mathrm{par})))$ parallel time. If the algorithm fails, then it returns FAIL.

By Theorem 9, to design an efficient DFS sequence algorithm in the $(0, \delta)$-MPC model, we only need to give linear total space MPC algorithms for the LCA problem and the path generation problem. The authors of [4] proposed to use doubling algorithms to compute the LCA and generate the paths. Since these algorithms store every $2^i$-th ancestor of each vertex, the total space needed is $\Theta(n \cdot \log(\text{the depth of the tree}))$. We show that it suffices to apply the doubling algorithm to a compressed tree instead of the original tree.

Algorithm 4 Construction of a Compressed Rooted Tree.
Input: A rooted tree represented by a set of parent pointers $\mathrm{par} : V \to V$ on a set $V$ of $n$ vertices ($\mathrm{par}$ has a unique root $r$).
Output: A vertex set $V' \subseteq V$, a set of parent pointers $\mathrm{par}' : V' \to V'$ on $V'$.
Tree compression (Compress($\mathrm{par} : V \to V$)):
1. Compute the depth of $\mathrm{par}$ and the depth of each vertex, and set $d \leftarrow \mathrm{dep}(\mathrm{par})$, $t \leftarrow \lceil \log d \rceil$.
2. $V' \leftarrow \{v \in V \mid \mathrm{dep}_{\mathrm{par}}(v) \bmod t = 0,\ \mathrm{dep}_{\mathrm{par}}(v) + t \le d\}$.
3. Initialize $\mathrm{par}' : V' \to V'$. For each $v \in V'$, $\mathrm{par}'(v) \leftarrow \mathrm{par}^{(t)}(v)$.
4. Output $V'$, $\mathrm{par}'$.

4.2 Compressed Rooted Tree

Given a set of parent pointers $\mathrm{par} : V \to V$, we show how to compress the rooted tree represented by $\mathrm{par}$.

Lemma 10 (Properties of a compressed rooted tree). Let $\mathrm{par} : V \to V$ be a set of parent pointers on a vertex set $V$ with $|V| > 1$, and suppose $\mathrm{par}$ has a unique root.
Let $t = \lceil \log(\mathrm{dep}(\mathrm{par})) \rceil$ and let $(V', \mathrm{par}') = $ Compress($\mathrm{par}$). Then the following properties hold:
1. $|V'| \le |V| / \log(\mathrm{dep}(\mathrm{par}))$.
2. $\forall v \in V'$, $i \in \mathbb{N}$, $\mathrm{par}'^{(i)}(v) = \mathrm{par}^{(i \cdot t)}(v) \in V'$.
3. $\forall v \in V$, $\exists i \in \{0, 1, \dots, 2t\}$ such that $\mathrm{par}^{(i)}(v) \in V'$.

4.3 Least Common Ancestor

Given a rooted tree represented by a set of parent pointers $\mathrm{par} : V \to V$ on a vertex set $V$, and a set of $q$ queries $Q = \{(u_1, v_1), (u_2, v_2), \dots, (u_q, v_q)\}$ where $\forall i \in [q]$, $u_i \ne v_i$, $u_i, v_i \in \mathrm{leaves}(\mathrm{par})$, we show a space-efficient algorithm which outputs the LCA of each queried pair of vertices. Notice that the assumption that queries only contain leaves is without loss of generality: we can attach an additional child vertex $v$ to each non-leaf vertex $u$. Thus, $v$ is a leaf vertex. When a query contains $u$, we can use $v$ to replace $u$ in the query, and the result will not change.

Algorithm 5 Lowest Common Ancestor.
Input: A rooted tree represented by a set of parent pointers $\mathrm{par} : V \to V$ on a set $V$ of $n$ vertices ($\mathrm{par}$ has a unique root $r$), and a set of $q$ queries $Q = \{(u_1, v_1), (u_2, v_2), \dots, (u_q, v_q)\}$ where $\forall i \in [q]$, $u_i \ne v_i$, $u_i, v_i \in \mathrm{leaves}(\mathrm{par})$.
Output: $\mathrm{lca} : Q \to V \times V \times V$.
Finding LCA (LCA($\mathrm{par} : V \to V$, $Q$)):
1. $(V', \mathrm{par}') \leftarrow$ Compress($\mathrm{par}$). // (see Lemma 10).
2. Set $d \leftarrow \mathrm{dep}(\mathrm{par})$, $t \leftarrow \lceil \log d \rceil$ and compute mappings $g_0, g_1, \dots, g_t : V' \to V'$ such that $\forall v \in V'$, $j \in \{0, 1, \dots, t\}$, $g_j(v) = \mathrm{par}'^{(2^j)}(v)$.
3. For each query $(u_i, v_i) \in Q$: // Suppose $\mathrm{dep}_{\mathrm{par}}(u_i) \ge \mathrm{dep}_{\mathrm{par}}(v_i)$.
   a. If $\mathrm{dep}_{\mathrm{par}}(u_i) > \mathrm{dep}_{\mathrm{par}}(v_i) + 2t$, find an ancestor $\hat{u}_i$ of $u_i$ in $\mathrm{par}$ such that $\mathrm{dep}_{\mathrm{par}}(\hat{u}_i) \le \mathrm{dep}_{\mathrm{par}}(v_i) + 2t$ and $\mathrm{dep}_{\mathrm{par}}(\hat{u}_i) \ge \mathrm{dep}_{\mathrm{par}}(v_i)$. Otherwise, $\hat{u}_i \leftarrow u_i$.
   b. If $\exists j \in [4t]$ such that $\mathrm{par}^{(j)}(\hat{u}_i)$ is the LCA of $(\hat{u}_i, v_i)$ in $\mathrm{par}$, set $\mathrm{lca}(u_i, v_i) = (\mathrm{par}^{(j)}(\hat{u}_i), x, y)$ where $x, y$ are children of $\mathrm{par}^{(j)}(\hat{u}_i)$ and $x, y$ are ancestors of $\hat{u}_i, v_i$ respectively. The query of $(u_i, v_i)$ is finished.
   c. Find an ancestor $u'_i$ of $\hat{u}_i$ in $\mathrm{par}$ such that $u'_i$ is the closest vertex to $\hat{u}_i$ in $V'$, i.e., $\mathrm{dep}_{\mathrm{par}}(\hat{u}_i) - \mathrm{dep}_{\mathrm{par}}(u'_i)$ is minimized. Similarly, find an ancestor $v'_i$ of $v_i$ in $\mathrm{par}$ such that $v'_i$ is the closest vertex to $v_i$ in $V'$, i.e., $\mathrm{dep}_{\mathrm{par}}(v_i) - \mathrm{dep}_{\mathrm{par}}(v'_i)$ is minimized.
   d. Find $u''_i \ne v''_i \in V'$ such that they are ancestors of $u'_i$ and $v'_i$ respectively, and $\mathrm{par}'(u''_i) = \mathrm{par}'(v''_i)$ is the LCA of $(u'_i, v'_i)$ in $\mathrm{par}'$.
   e. Find the smallest $j \in [2t]$ such that $\mathrm{par}^{(j)}(u''_i) = \mathrm{par}^{(j)}(v''_i)$. Set $\mathrm{lca}(u_i, v_i) = (\mathrm{par}^{(j)}(u''_i), \mathrm{par}^{(j-1)}(u''_i), \mathrm{par}^{(j-1)}(v''_i))$.

Before we analyze the algorithm LCA($\mathrm{par}$, $Q$), let us discuss some details of the algorithm.
1. We pre-compute $\mathrm{dep}_{\mathrm{par}}(v)$ and $\mathrm{dep}_{\mathrm{par}'}(u)$ for every $v \in V$ and $u \in V'$.
2. To implement step 3a, we first check whether $\mathrm{dep}_{\mathrm{par}}(u_i) > \mathrm{dep}_{\mathrm{par}}(v_i) + 2t$. If this is not true, we set $\hat{u}_i$ to be $u_i$ directly. Otherwise, according to Lemma 10, there is a $j \in \{0, 1, \dots, 2t\}$ such that $\mathrm{par}^{(j)}(u_i) \in V'$. Since $\mathrm{dep}_{\mathrm{par}}(u_i) > \mathrm{dep}_{\mathrm{par}}(v_i) + 2t$, we have $\mathrm{dep}_{\mathrm{par}}(\mathrm{par}^{(j)}(u_i)) > \mathrm{dep}_{\mathrm{par}}(v_i)$. We initialize $\hat{u}_i$ to be $\mathrm{par}^{(j)}(u_i) \in V'$. For $k = t \to 0$, if $\mathrm{dep}_{\mathrm{par}}(g_k(\hat{u}_i)) > \mathrm{dep}_{\mathrm{par}}(v_i)$ (i.e., $\mathrm{dep}_{\mathrm{par}}(\mathrm{par}'^{(2^k)}(\hat{u}_i)) > \mathrm{dep}_{\mathrm{par}}(v_i)$), we set $\hat{u}_i \leftarrow g_k(\hat{u}_i) = \mathrm{par}'^{(2^k)}(\hat{u}_i)$. Due to Lemma 10 again, the final $\hat{u}_i$ must satisfy $\mathrm{dep}_{\mathrm{par}}(\hat{u}_i) \ge \mathrm{dep}_{\mathrm{par}}(v_i)$ and $\mathrm{dep}_{\mathrm{par}}(\hat{u}_i) \le \mathrm{dep}_{\mathrm{par}}(v_i) + 2t$. This step takes time $O(t)$.

Lemma 11 (LCA algorithm). Let $\mathrm{par} : V \to V$ be a set of parent pointers on a vertex set $V$, and suppose $\mathrm{par}$ has a unique root. Let $Q = \{(u_1, v_1), (u_2, v_2), \dots, (u_q, v_q)\}$ be a set of $q$ pairs of vertices where $\forall i \in [q]$, $u_i \ne v_i$, $u_i, v_i \in \mathrm{leaves}(\mathrm{par})$. Let $\mathrm{lca} : Q \to V \times V \times V$ be the output of LCA($\mathrm{par}$, $Q$). For $(u_i, v_i) \in Q$, $(p_i, p_{i,u_i}, p_{i,v_i}) = \mathrm{lca}(u_i, v_i)$ satisfies that $p_i$ is the LCA of $(u_i, v_i)$, $p_{i,u_i}, p_{i,v_i}$ are ancestors of $u_i, v_i$ respectively, and $p_{i,u_i}, p_{i,v_i}$ are children of $p_i$. Furthermore, the space used by the algorithm is at most $O(|Q| + |V|)$.

4.4 Multi-Paths Generation

Consider a rooted tree represented by a set of parent pointers $\mathrm{par} : V \to V$ on a vertex set $V$ and a set of $q$ vertex-ancestor pairs $Q = \{(u_1, v_1), (u_2, v_2), \dots, (u_q, v_q)\}$ where $\forall i \in [q]$, $v_i$ is an ancestor of $u_i$.
We show a space-efficient algorithm MultiPaths($\mathrm{par}$, $Q$) which generates all the paths $P(u_1, v_1), P(u_2, v_2), \dots, P(u_q, v_q)$.

Algorithm 6 Multi-Paths Generation.
Input: A rooted tree represented by a set of parent pointers $\mathrm{par} : V \to V$ on a set $V$ of $n$ vertices ($\mathrm{par}$ has a unique root $r$), and a set of $q$ vertex-ancestor pairs $Q = \{(u_1, v_1), (u_2, v_2), \dots, (u_q, v_q)\}$ where $\forall i \in [q]$, $v_i$ is an ancestor of $u_i$.
Output: $P_1, P_2, \dots, P_q$.
Generating multiple path sequences (MultiPaths($\mathrm{par} : V \to V$, $Q$)):
1. $(V', \mathrm{par}') \leftarrow$ Compress($\mathrm{par}$). // (see Lemma 10).
2. Set $d \leftarrow \mathrm{dep}(\mathrm{par})$, $t \leftarrow \lceil \log d \rceil$ and compute mappings $g_0, g_1, \dots, g_t : V' \to V'$ such that $\forall v \in V'$, $j \in \{0, 1, \dots, t\}$, $g_j(v) = \mathrm{par}'^{(2^j)}(v)$.
3. For each vertex-ancestor pair $(u_i, v_i) \in Q$:
   a. If $\mathrm{dep}_{\mathrm{par}}(u_i) - \mathrm{dep}_{\mathrm{par}}(v_i) \le 2t$, generate the path sequence $P_i = (u_i, \mathrm{par}^{(1)}(u_i), \mathrm{par}^{(2)}(u_i), \dots, v_i)$ directly.
   b. Otherwise, find the minimum $j \in [2t]$ such that $\mathrm{par}^{(j)}(u_i) \in V'$. Set $u'_i \leftarrow \mathrm{par}^{(j)}(u_i)$. Find an ancestor $v'_i$ of $u'_i$ in $\mathrm{par}'$ such that $\mathrm{dep}_{\mathrm{par}}(v'_i) \ge \mathrm{dep}_{\mathrm{par}}(v_i)$ and $\mathrm{dep}_{\mathrm{par}}(v'_i) - 2t \le \mathrm{dep}_{\mathrm{par}}(v_i)$.
   c. Generate the path $P'(u'_i, v'_i)$ in $\mathrm{par}'$.
   d. Initialize a sequence $A$ as the concatenation of $(u_i)$, $P'(u'_i, v'_i)$ and $(v_i)$.
   e. Repeat: for each element $a_i$ in $A$, if $a_i$ is not the last element and $a_{i+1} \ne \mathrm{par}(a_i)$, insert $\mathrm{par}(a_i)$ between $a_i$ and $a_{i+1}$; until $A$ does not change. Output the final sequence $A$ as the path sequence $P_i$.

Before we analyze the correctness of the algorithm, let us discuss some details.
1. In step 3a, if the length of the path is at most $2t$, then we can generate the path in $O(t)$ rounds. In the $j$-th round, we find the vertex $\mathrm{par}^{(j)}(u_i) = \mathrm{par}(\mathrm{par}^{(j-1)}(u_i))$.
2. In step 3b, we want to find $v'_i$. We initialize $v'_i$ as $u'_i$. For $k = t \to 0$, if $\mathrm{dep}_{\mathrm{par}}(g_k(v'_i)) > \mathrm{dep}_{\mathrm{par}}(v_i)$ (i.e., $\mathrm{dep}_{\mathrm{par}}(\mathrm{par}'^{(2^k)}(v'_i)) > \mathrm{dep}_{\mathrm{par}}(v_i)$), we set $v'_i \leftarrow g_k(v'_i) = \mathrm{par}'^{(2^k)}(v'_i)$.

Lemma 12 (Generation of multiple paths). Let $\mathrm{par} : V \to V$ be a set of parent pointers on a vertex set $V$, and suppose $\mathrm{par}$ has a unique root.
Let $Q = \{(u_1, v_1), (u_2, v_2), \dots, (u_q, v_q)\} \subseteq V \times V$ be a set of pairs of vertices where $\forall j \in [q]$, $v_j$ is an ancestor of $u_j$ in $\mathrm{par}$. Let $P_1, P_2, \dots, P_q$ be the output of MultiPaths($\mathrm{par}$, $Q$). Then $\forall j \in [q]$, $P_j = P(u_j, v_j)$, i.e., $P_j$ is a sequence which denotes the path from $u_j$ to $v_j$ in $\mathrm{par}$. Furthermore, the space used by the algorithm is at most $O(|V| + \sum_{j \in [q]} |P_j|)$.

5 Hardness of Biconnectivity in MPC

There is a conjectured hardness result which is widely used in the MPC literature [25, 11, 28, 34, 40].

Conjecture 1 (1-cycle vs. 2-cycles). For any $\gamma \ge 0$ and any constant $\delta \in (0, 1)$, distinguishing the following two instances in the $(\gamma, \delta)$-MPC model requires $\Omega(\log n)$ parallel time:
1. a single cycle containing $n$ vertices,
2. two disjoint cycles, each containing $n/2$ vertices.

Under the above conjecture, we show that $\Omega(\log \text{bi-diam}(G))$ parallel time is necessary to compute the biconnected components of $G$. This claim holds even for a constant-diameter graph $G$, i.e., $\mathrm{diam}(G) = O(1)$.

Theorem 13 (Hardness of biconnectivity in MPC). For any $\gamma \ge 0$ and any constant $\delta \in (0, 1)$, unless the one cycle vs. two cycles conjecture (Conjecture 1) is false, any $(\gamma, \delta)$-MPC algorithm requires $\Omega(\log \text{bi-diam}(G))$ parallel time for testing whether a graph $G$ with constant diameter is biconnected.

Proof. For $\gamma \ge 0$ and an arbitrary constant $\delta \in (0, 1)$, suppose there is a $(\gamma, \delta)$-MPC algorithm $\mathcal{A}$ which can determine whether an arbitrary constant-diameter graph $G$ is biconnected in $o(\log \text{bi-diam}(G))$ parallel time. Then we give a $(\gamma, \delta)$-MPC algorithm for solving the one cycle vs. two cycles problem as follows:
1. For a one cycle vs. two cycles instance given as an $n$-vertex graph $G' = (V', E')$, construct a new graph $G = (V, E)$: $V = V' \cup \{v^*\}$, $E = E' \cup \{(v, v^*) \mid v \in V'\}$.
2. Run $\mathcal{A}$ on $G$. If $G$ is not biconnected, $G'$ has two cycles. Otherwise $G'$ is a single cycle.
It is easy to see that the diameter of $G$ is 2. If $G'$ is a single cycle, then $G$ is biconnected and $\text{bi-diam}(G) = \Theta(n)$.
If $G'$ contains two cycles, then $G$ contains two biconnected components and $\text{bi-diam}(G) = \Theta(n)$. The first step of the above algorithm takes $O(1)$ parallel time and only requires linear total space. The graph $G$ has $n + 1$ vertices and $2n$ edges. Thus, the above algorithm is also a $(\gamma, \delta)$-MPC algorithm. The parallel time of the above algorithm is the same as the time needed for running $\mathcal{A}$ on $G$, which is $o(\log \text{bi-diam}(G)) = o(\log n)$. Thus the existence of the algorithm $\mathcal{A}$ implies that the one cycle vs. two cycles conjecture (Conjecture 1) is false.

References

1. Kook Jin Ahn and Sudipto Guha. Access to data and number of iterations: Dual primal algorithms for maximum matching under resource constraints. ACM Transactions on Parallel Computing (TOPC), 4(4):17, 2018.
2. Noga Alon, László Babai, and Alon Itai. A fast and simple randomized parallel algorithm for the maximal independent set problem. Journal of Algorithms, 7(4):567-583, 1986.
3. Alexandr Andoni, Aleksandar Nikolov, Krzysztof Onak, and Grigory Yaroslavtsev. Parallel algorithms for geometric graph problems. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, pages 574-583. ACM, 2014.
4. Alexandr Andoni, Zhao Song, Clifford Stein, Zhengyu Wang, and Peilin Zhong. Parallel Graph Connectivity in Log Diameter Rounds, 2018. In FOCS 2018. arXiv:1805.03055.
5. Coresets meet EDCS: algorithms for matching and vertex cover on massive graphs. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1616-1635. SIAM, 2019.
6. Sepehr Assadi and Sanjeev Khanna. Randomized composable coresets for matching and vertex cover. In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, pages 3-12. ACM, 2017.
7. Sepehr Assadi, Xiaorui Sun, and Omri Weinstein. Massively Parallel Algorithms for Finding Well-Connected Components in Sparse Graphs. arXiv preprint, 2018. arXiv:1805.02974.
8. Giorgio Ausiello, Donatella Firmani, Luigi Laura, and Emanuele Paracone. Large-scale graph biconnectivity in MapReduce. Department of Computer and System Sciences Antonio Ruberti Technical Reports, 4(4), 2012.
9. Boaz Barak, Jonathan A Kelner, and David Steurer. Dictionary learning and tensor decomposition via the sum-of-squares method. In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing (STOC), pages 143-151. ACM, 2015. arXiv:1407.1543.
10. Journal of the ACM (JACM), 36(3):643-670, 1989.
11. Paul Beame, Paraschos Koutris, and Dan Suciu. Communication steps for parallel query processing. In Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGAI symposium on Principles of database systems, pages 273-284. ACM, 2013.
12. Soheil Behnezhad, Mahsa Derakhshan, and MohammadTaghi Hajiaghayi. Brief announcement: Semi-mapreduce meets congested clique. arXiv preprint, 2018. arXiv:1802.10297.
13. Massively parallel symmetry breaking on sparse graphs: MIS and maximal matching. arXiv preprint, 2018. arXiv:1807.06701.
14. Sebastian Brandt, Manuela Fischer, and Jara Uitto. Matching and MIS for Uniformly Sparse Graphs in the Low-Memory MPC Model. arXiv preprint, 2018. arXiv:1807.05374.
15. Artur Czumaj, Jakub Łącki, Aleksander Mądry, Slobodan Mitrović, Krzysztof Onak, and Piotr Sankowski. Round compression for parallel matching algorithms. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pages 471-484. ACM, 2018.
16. Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. To appear in OSDI, page 1, 2004.
17. Communications of the ACM, 51(1):107-113, 2008.
18. Reinhard Diestel. Graph theory. Springer Publishing Company, Incorporated, 2018.
19. Alina Ene, Sungjin Im, and Benjamin Moseley. Fast clustering using MapReduce. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 681-689. ACM, 2011.
20. Manuela Fischer, Mohsen Ghaffari, and Jara Uitto. Simple Graph Coloring Algorithms for Congested Clique and Massively Parallel Computation. arXiv preprint, 2018. arXiv:1808.08419.
21. Michael T Goodrich, Nodari Sitchinava, and Qin Zhang. Sorting, Searching, and Simulation in the MapReduce Framework. In ISAAC, volume 7074, pages 374-383. Springer, 2011.
22. Sungjin Im, Benjamin Moseley, and Xiaorui Sun. Efficient massively parallel methods for dynamic programming. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 798-811. ACM, 2017.
23. Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. Dryad: distributed data-parallel programs from sequential building blocks. In ACM SIGOPS operating systems review, volume 41(3), pages 59-72. ACM, 2007.
24. Tomasz Jurdziński and Krzysztof Nowicki. MST in O(1) rounds of congested clique. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2620-2632. SIAM, 2018.
25. Howard Karloff, Siddharth Suri, and Sergei Vassilvitskii. A model of computation for MapReduce. In Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms, pages 938-948. Society for Industrial and Applied Mathematics, 2010.
26. Richard M Karp, Eli Upfal, and Avi Wigderson. Constructing a perfect matching is in random NC. Combinatorica, 6(1):35-48, 1986.
27. Valerie King, Chung Keung Poon, Vijaya Ramachandran, and Santanu Sinha. An optimal EREW PRAM algorithm for minimum spanning tree verification. Information Processing Letters, 62(3):153-159, 1997.
28. Connected components in mapreduce and beyond. In Proceedings of the ACM Symposium on Cloud Computing, pages 1-13. ACM, 2014.
29. Silvio Lattanzi, Benjamin Moseley, Siddharth Suri, and Sergei Vassilvitskii. Filtering: a method for solving graph problems in mapreduce. In Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures, pages 85-94. ACM, 2011.
30. Sixue Liu and Robert E Tarjan. Simple Concurrent Labeling Algorithms for Connected Components. arXiv preprint, 2018. arXiv:1812.06177.
31. Michael Luby. A simple parallel algorithm for the maximal independent set problem. SIAM Journal on Computing, 15(4):1036-1053, 1986.
32. arXiv preprint, 2018. arXiv:1807.08745.
33. Technical report, HARVARD UNIV CAMBRIDGE MA AIKEN COMPUTATION LAB, 1985.
34. Tim Roughgarden, Sergei Vassilvitskii, and Joshua R Wang. Shuffles and circuits: (on lower bounds for modern parallel computation). In Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, pages 1-12. ACM, 2016.
35. Yossi Shiloach and Uzi Vishkin. An O(log n) parallel connectivity algorithm. Technical report, Computer Science Department, Technion, 1980.
36. Robert E Tarjan and Uzi Vishkin. An efficient parallel biconnectivity algorithm. SIAM Journal on Computing, 14(4):862-874, 1985.
37. Robert Endre Tarjan and Uzi Vishkin. Finding biconnected components and computing tree functions in logarithmic parallel time. In 25th Annual Symposium on Foundations of Computer Science, 1984, pages 12-20. IEEE, 1984.
38. Leslie G Valiant. A bridging model for parallel computation. Communications of the ACM, 33(8):103-111, 1990.
39. Virginia Vassilevska Williams. Multiplying matrices faster than Coppersmith-Winograd. In Proceedings of the forty-fourth annual ACM symposium on Theory of computing (STOC), pages 887-898. ACM, 2012.
40. Grigory Yaroslavtsev and Adithya Vadapalli. Massively Parallel Algorithms and Hardness for Single-Linkage Clustering under Lp Distances. In International Conference on Machine Learning, pages 5596-5605, 2018.
41. Spark: Cluster computing with working sets. HotCloud, 10(10-10):95, 2010.
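
Illustrative sketch for steps 2 to 8 of Biconn(G). The following is a minimal sequential Python sketch of the edge coloring described in those steps; it is not the MPC implementation. It assumes a simple graph, a spanning tree given as a dictionary `par` with `par[root] == root`, naive climbing for the LCA, a bottom-up pass in place of the DFS-sequence range minimum, and union-find in place of the parallel connectivity subroutine. The name `biconn_colors` is hypothetical.

```python
def biconn_colors(par, edges):
    """Color edges so that two edges share a color iff they lie on a common
    simple cycle. Sequential sketch; assumes par[root] == root."""
    # Depth of every vertex in the spanning tree.
    dep = {}
    for v in par:
        path = []
        while v not in dep and par[v] != v:
            path.append(v)
            v = par[v]
        base = dep.setdefault(v, 0)  # v is the root or already has a depth
        for k, u in enumerate(reversed(path), start=1):
            dep[u] = base + k

    def lca(u, v):
        # Naive LCA: repeatedly lift the deeper endpoint.
        while u != v:
            if dep[u] < dep[v]:
                u, v = v, u
            u = par[u]
        return u

    # Step 2: lev(v) = min(dep(v), depth of LCA(v, w)) over neighbors w != par(v).
    lev = dict(dep)
    for u, v in edges:
        a = dep[lca(u, v)]
        if par[u] != v:
            lev[u] = min(lev[u], a)
        if par[v] != u:
            lev[v] = min(lev[v], a)

    # Step 5 uses the minimum of lev over the subtree of v; compute it bottom-up.
    low = dict(lev)
    for v in sorted(par, key=dep.get, reverse=True):
        if par[v] != v:
            low[par[v]] = min(low[par[v]], low[v])

    # Union-find over V' = V minus the root (steps 4 to 7).
    comp = {v: v for v in par if par[v] != v}

    def find(x):
        while comp[x] != x:
            comp[x] = comp[comp[x]]
            x = comp[x]
        return x

    for v in list(comp):
        if low[v] < dep[par[v]]:          # step 5
            comp[find(v)] = find(par[v])
    for u, v in edges:
        if lca(u, v) not in (u, v):       # step 6
            comp[find(u)] = find(v)

    # Step 8: an edge takes the color of its deeper endpoint.
    return {(u, v): find(u if dep[u] >= dep[v] else v) for u, v in edges}


# Two triangles sharing vertex 2, with a path-shaped spanning tree rooted at 0.
bowtie = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]
tree = {0: 0, 1: 0, 2: 1, 3: 2, 4: 3}
colors = biconn_colors(tree, bowtie)
```

On this example the three edges of each triangle receive one color and the two triangles receive different colors, matching the characterization in Lemma 8.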

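
Illustrative sketch for Algorithm 4 (Compress). The following sequential Python sketch builds the compressed tree for a small example. It assumes that the logarithm is base 2, that the root satisfies `par[root] == root`, and that $t$ is clamped to at least 1 for very shallow trees; these conventions, and the helper names `depths` and `compress`, are assumptions made for illustration.

```python
import math


def depths(par):
    """Depth of every vertex; the root is the vertex with par[root] == root."""
    dep = {}
    for v in par:
        path = []
        while v not in dep and par[v] != v:
            path.append(v)
            v = par[v]
        base = dep.setdefault(v, 0)
        for k, u in enumerate(reversed(path), start=1):
            dep[u] = base + k
    return dep


def compress(par):
    """Return (V', par', t) following the steps of Compress(par)."""
    dep = depths(par)
    d = max(dep.values())
    # t = ceil(log d); clamped to 1 so that 'mod t' is defined (assumption).
    t = max(1, math.ceil(math.log2(d))) if d >= 2 else 1
    kept = {v for v in par if dep[v] % t == 0 and dep[v] + t <= d}

    def up(v, steps):
        for _ in range(steps):
            v = par[v]
        return v

    par_c = {v: up(v, t) for v in kept}  # par'(v) = par^(t)(v)
    return kept, par_c, t


# A path 0 <- 1 <- ... <- 16 rooted at 0, so d = 16 and t = 4.
path_tree = {i: max(i - 1, 0) for i in range(17)}
kept, par_c, t = compress(path_tree)
```

Here only the vertices at depths 0, 4, 8 and 12 are kept, so a doubling table over the compressed tree stores $O(\log d)$ pointers for roughly a $1/\log d$ fraction of the vertices.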

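
Illustrative sketch for the reduction in the proof of Theorem 13. The following Python sketch builds $G$ from a one cycle vs. two cycles instance by attaching one new vertex to every original vertex, and checks biconnectivity by brute force (the graph is connected and stays connected after deleting any single vertex). The brute-force test stands in for the hypothetical algorithm $\mathcal{A}$ and is only meant for small inputs; all function names are assumptions made for illustration.

```python
def add_apex(n, edges):
    """Given a graph on vertices 0..n-1, add a new vertex n adjacent to all."""
    apex = n  # plays the role of the extra vertex in the reduction
    return n + 1, list(edges) + [(v, apex) for v in range(n)]


def connected(vertices, edges):
    """Check whether the subgraph induced on `vertices` is connected."""
    vertices = set(vertices)
    if not vertices:
        return True
    adj = {v: [] for v in vertices}
    for u, w in edges:
        if u in vertices and w in vertices:
            adj[u].append(w)
            adj[w].append(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen == vertices


def is_biconnected(n, edges):
    """Brute force: connected, and no single vertex removal disconnects it."""
    if not connected(range(n), edges):
        return False
    return all(connected([v for v in range(n) if v != x], edges)
               for x in range(n))


def cycle(vs):
    return [(vs[i], vs[(i + 1) % len(vs)]) for i in range(len(vs))]


one_cycle = cycle(list(range(8)))
two_cycles = cycle(list(range(4))) + cycle(list(range(4, 8)))

n1, e1 = add_apex(8, one_cycle)
n2, e2 = add_apex(8, two_cycles)
```

For the single cycle, the constructed graph is a wheel and is biconnected; for two cycles, the added vertex is a cut vertex, so the graph is not biconnected. In both cases the constructed graph has $n + 1$ vertices and $2n$ edges.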
This is a preview of a remote PDF: http://drops.dagstuhl.de/opus/volltexte/2019/10590/pdf/LIPIcs-ICALP-2019-14.pdf

Alexandr Andoni, Clifford Stein, Peilin Zhong. Log Diameter Rounds Algorithms for 2-Vertex and 2-Edge Connectivity, LIPICS - Leibniz International Proceedings in Informatics, 2019, 14:1-14:16, DOI: 10.4230/LIPIcs.ICALP.2019.14