Congested Clique Algorithms for Graph Spanners

LIPICS - Leibniz International Proceedings in Informatics, Sep 2018

Graph spanners are sparse subgraphs that faithfully preserve the distances in the original graph up to small stretch. Spanners have been studied extensively as they have a wide range of applications, ranging from distance oracles, labeling schemes and routing to solving linear systems and spectral sparsification. A k-spanner maintains pairwise distances up to a multiplicative factor of k. It is folklore that for every n-vertex graph G, one can construct a (2k-1)-spanner with O(n^{1+1/k}) edges. In a distributed setting, such spanners can be constructed in the standard CONGEST model using O(k^2) rounds, when randomization is allowed. In this work, we consider spanner constructions in the congested clique model, and show:
- a randomized construction of a (2k-1)-spanner with O~(n^{1+1/k}) edges in O(log k) rounds. The previous best algorithm runs in O(k) rounds;
- a deterministic construction of a (2k-1)-spanner with O~(n^{1+1/k}) edges in O(log k + (log log n)^3) rounds. The previous best algorithm runs in O(k log n) rounds. This improvement is achieved by a new derandomization theorem for hitting sets which might be of independent interest;
- a deterministic construction of an O(k)-spanner with O(k * n^{1+1/k}) edges in O(log k) rounds.


DISC

Merav Parter, Weizmann IS, Rehovot, Israel
Eylon Yogev, Weizmann IS, Rehovot, Israel

Keywords and phrases: Distributed Graph Algorithms; Spanner; Congested Clique

Acknowledgements: We are grateful to Mohsen Ghaffari for earlier discussions on congested-clique spanners via streaming ideas. We thank Roei Tell for pointing out [15].

1 Introduction & Related Work

[...] message to each of its neighbors. In the LOCAL model, the message size is unbounded, while in the CONGEST model it is limited to $O(\log n)$ bits. One of the most notable distributed randomized constructions of $(2k-1)$ spanners is by Baswana & Sen [2], which can be implemented in $O(k^2)$ rounds in the CONGEST model.
Currently, there is an interesting gap between deterministic and randomized constructions in the CONGEST model, or alternatively between the deterministic construction of spanners in the LOCAL vs. the CONGEST model. Whereas the deterministic round complexity of $(2k-1)$ spanners in the LOCAL model is $O(k)$ due to [10], the best deterministic algorithm in the CONGEST model takes $O(2^{\sqrt{\log n \cdot \log\log n}})$ rounds [13].

We consider the congested clique model, introduced by Lotker et al. [20]. In this model, in every round, each vertex can send $O(\log n)$ bits to each of the vertices in the graph. The congested clique model has been receiving a lot of attention recently due to its relevance to overlay networks and large scale distributed computation [17, 14, 4].

Deterministic local computation in the congested clique model. Censor et al. [7] initiated the study of deterministic local algorithms in the congested clique model by means of derandomization of randomized LOCAL algorithms. The approach of [7] can be summarized as follows. The randomized complexity of the classical local problems is $\mathrm{polylog}(n)$ rounds (in both LOCAL and CONGEST models). For these randomized algorithms, it is usually sufficient that the random choices made by vertices are sampled from distributions with bounded independence. Hence, any round of a randomized algorithm can be simulated by giving all nodes a shared random seed of $\mathrm{polylog}(n)$ bits. To completely derandomize such a round, nodes should compute (deterministically) a seed which is at least as "good" (footnote 1) as a random seed would be. This is achieved by estimating their "local progress" when simulating the random choices using that seed. Combining the techniques of conditional expectation, pessimistic estimators and bounded independence leads to a simple "voting"-like algorithm in which the bits of the seed are computed bit-by-bit.
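The bit-by-bit voting idea can be illustrated with a small centralized sketch. This is not the algorithm of [7]: the function names, the toy progress measure, and the brute-force computation of conditional expectations (in place of pessimistic estimators over a bounded-independence distribution) are all assumptions made for illustration. Each node reports its exact conditional expectation for both values of the next bit, and a leader keeps the value with the larger total.

```python
from fractions import Fraction
from itertools import product


def conditional_expectation(progress, node, prefix, seed_bits):
    """Exact expected progress of `node` over all completions of `prefix`."""
    remaining = seed_bits - len(prefix)
    completions = list(product((0, 1), repeat=remaining))
    total = sum(progress(node, prefix + tail) for tail in completions)
    return Fraction(total, len(completions))


def fix_seed_by_voting(progress, num_nodes, seed_bits):
    """Fix the seed one bit at a time; the 'leader' sums the nodes' votes."""
    prefix = ()
    for _ in range(seed_bits):
        # Each node reports its conditional expectation for both bit values.
        votes = {
            bit: sum(
                conditional_expectation(progress, node, prefix + (bit,), seed_bits)
                for node in range(num_nodes)
            )
            for bit in (0, 1)
        }
        # The leader keeps the bit with the larger total and broadcasts it.
        prefix += (0 if votes[0] >= votes[1] else 1,)
    return prefix


def toy_progress(node, seed):
    """Hypothetical local progress measure: an arbitrary integer in 0..4."""
    value = int("".join(map(str, seed)), 2)
    return (7 * node + 13 * value + node * value) % 5


NUM_NODES, SEED_BITS = 6, 4
chosen = fix_seed_by_voting(toy_progress, NUM_NODES, SEED_BITS)
chosen_total = sum(toy_progress(v, chosen) for v in range(NUM_NODES))
average_total = Fraction(
    sum(toy_progress(v, s)
        for v in range(NUM_NODES)
        for s in product((0, 1), repeat=SEED_BITS)),
    2 ** SEED_BITS,
)
```

Because the expectation under a prefix is the average of the expectations under its two one-bit extensions, keeping the better bit never decreases the total, so `chosen_total` is at least `average_total`. Exact fractions are used so that this comparison is not affected by rounding.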
The power of the congested clique is hence in providing some global leader that collects all votes in 1 round and broadcasts the winning bit value. This approach led to deterministic MIS in $O(\log \Delta \log n)$ rounds and deterministic $(2k-1)$ spanners with $\widetilde{O}(n^{1+1/k})$ edges in $O(k \log n)$ rounds, which also works for weighted graphs. Barenboim and Khazanov [1] presented deterministic local algorithms as a function of the graph's arboricity.

Footnote 1: The random seed is usually shown to provide a large progress in expectation. The deterministically computed seed should provide a progress at least as large as the expected progress of a random seed.

Deterministic spanners via derandomization of hitting sets. As observed by [26, 5, 13], the derandomization of the Baswana-Sen algorithm boils down to a derandomization of p-dominating sets or hitting sets. It is a well known fact that given a collection $\mathcal{S}$ of $m$ sets, each containing at least $\Delta$ elements coming from a universe of size $n$, one can construct a hitting set $Z$ of size $O((n \log m)/\Delta)$. A randomized construction of such a set is immediate by picking each element into $Z$ with probability $p$ and applying Chernoff. A centralized deterministic construction is also well known by the greedy approach (e.g., Lemma 2.7 of [5]). In our setting we are interested in deterministic constructions of hitting sets in the congested clique model. In this setting, each vertex $v$ knows a subset $S_v$ of size at least $\Delta$ that consists of vertices in the $O(k)$-neighborhood of $v$, and it is required to compute a small set $Z$ that hits (i.e., intersects) all subsets. Censor et al. [7] showed that the above mentioned randomized construction of hitting sets still holds with $g = O(\log n)$-wise independence, and presented an $O(g)$-round algorithm that computes a hitting set deterministically by finding a good seed of $O(g \log n)$ bits. Applying this hitting-set algorithm for computing the $k$ levels of Baswana-Sen's clustering yields a deterministic algorithm for $(2k-1)$ spanners with $O(k \log n)$ rounds.

Our Results and Approach in a Nutshell

We provide improved randomized and deterministic constructions of graph spanners in the congested clique model. Our randomized solution is based on an $O(\log k)$-round algorithm that computes the $O(\sqrt{n})$ nearest vertices in radius $k/2$ for every vertex $v$ (footnote 2). This induces a partitioning of the graph into sparse and dense regions. The sparse region is solved "locally" and the dense region simulates only two phases of Baswana-Sen, leading to a total round complexity of $O(\log k)$. We show the following for $n$-vertex unweighted graphs.

Theorem 1. There exists a randomized algorithm in the congested clique model that constructs a $(2k-1)$-spanner with $\widetilde{O}(k \cdot n^{1+1/k})$ edges within $O(\log k)$ rounds w.h.p.

Our deterministic algorithms are based on constructions of hitting sets with short seeds. Using the pseudorandom generator of Gopalan et al. [15], we construct a hitting set with seed length $O(\log n \cdot (\log\log n)^3)$, which yields the following for $n$-vertex unweighted graphs.

Theorem 2. There exists a deterministic algorithm in the congested clique model that constructs a $(2k-1)$-spanner with $\widetilde{O}(k \cdot n^{1+1/k})$ edges within $O(\log k + (\log\log n)^3)$ rounds.

In addition, we also show that if one settles for stretch of $O(k)$, then a hitting-set seed of $O(\log n)$ bits is sufficient for this purpose, yielding the following construction:

Theorem 3. There exists a deterministic algorithm in the congested clique model that constructs an $O(k)$-spanner with $O(k \cdot n^{1+1/k})$ edges within $O(\log k)$ rounds.

A summary of our results is given in Table 1. All results in the table are with respect to spanners with $\widetilde{O}(n^{1+1/k})$ edges for an unweighted $n$-vertex graph $G$. All these bounds are for the congested clique model (footnote 3). In what follows we provide some technical background and then present the high level ideas of these constructions.
Footnote 2: To be more precise, the algorithm computes the $O(n^{1/2-1/k})$ nearest vertices at distance at most $k/2-1$.

Footnote 3: Baswana-Sen [2] does not mention the congested clique model, but the best randomized solution in the congested clique is given by simulating [2].

A brief exposition of Baswana-Sen [2]. The algorithm is based on constructing $k$ levels of clustering $\mathcal{C}_0, \ldots, \mathcal{C}_{k-1}$, where a clustering $\mathcal{C}_i = \{C_{i,1}, \ldots\}$ consists of vertex-disjoint subsets which we call clusters. Every cluster $C \in \mathcal{C}_i$ has a special node that we call the cluster center. For each $C \in \mathcal{C}_i$, the spanner contains a depth-$i$ tree rooted at its center and spanning all cluster vertices. Starting with the trivial clustering $\mathcal{C}_0 = \{\{v\}, v \in V\}$, in each phase $i$ the algorithm is given a clustering $\mathcal{C}_i$ and computes a clustering $\mathcal{C}_{i+1}$ by sampling the cluster center of each cluster in $\mathcal{C}_i$ with probability $n^{-1/k}$. Vertices that are adjacent to the sampled clusters join them and the remaining vertices become unclustered. For the latter, the algorithm adds some of their edges to the spanner. This construction yields a $(2k-1)$ spanner with $O(kn^{1+1/k})$ edges in expectation. It is easy to see that this algorithm can be simulated in the congested clique model using $O(k)$ rounds.

As observed in [26, 16], the only randomized step in Baswana-Sen is picking the cluster centers of the $(i+1)$th clustering. That is, given the $n^{1-i/k}$ cluster centers of $\mathcal{C}_i$, it is required to compute a subsample of $n^{1-(i+1)/k}$ clusters without having to add too many edges to the spanner (due to unclustered vertices). This is exactly the hitting-set problem where the neighboring clusters of each vertex are the sets that should be covered, and the universe is the set of centers in $\mathcal{C}_i$ (ideas along these lines also appear in [26, 13]).

Our Approach. In the following, we provide the high level description of our construction while omitting many careful details and technicalities.
We note that some of these technicalities stem from the fact that we insist on achieving the (nearly) optimal spanners, as commonly done in this area. Settling for an $O(k)$-spanner with $\widetilde{O}(kn^{1+1/k})$ edges could considerably simplify the algorithm and its analysis. The high-level idea is simple: divide the graph $G$ into sparse edges and dense edges, and construct a spanner for each of these subgraphs using two different techniques. This is based on the following intuition inspired by the Baswana-Sen algorithm. In Baswana-Sen, the vertices that are clustered in level $i$ of the clustering are vertices whose $i$-neighborhood is sufficiently dense, i.e., contains at least $n^{i/k}$ vertices. We then divide the vertices into dense vertices $V_{dense}$ and sparse vertices $V_{sparse}$, where $V_{dense}$ consists of vertices that have $\Omega(\sqrt{n})$ vertices in their $k/2$-ball, and $V_{sparse}$ consists of the remaining vertices. This induces a partitioning of the edges of $G$ into $E_{sparse} = (V_{sparse} \times V) \cap E(G)$ and $E_{dense}$, which contains the remaining edges of $G$, i.e., edges both of whose endpoints are dense.

Collecting Topology of Closed Neighborhood. One of the key building blocks of our construction is an $O(\log k)$-round algorithm that computes for each vertex $u$ the subgraph $G_{k/2}(u)$ induced on its closest $O(\sqrt{n})$ vertices within distance at most $k/2$ in $G$. Hence the algorithm computes the entire $k/2$-neighborhoods for the sparse vertices. For the sake of the following discussion, assume that the maximum degree in $G$ is $O(\sqrt{n})$. Our algorithm handles the general case as well. Intuitively, collecting the $k/2$-neighborhood can be done in $O(\log k)$ rounds if the graph is sufficiently sparse by employing the graph exponentiation idea of [19]. In this approach, in each phase the radius of the collected neighborhood is doubled. Employing this technique in our setting gives rise to several issues.
First, the input graph $G$ is not entirely sparse but rather consists of interleaving sparse and dense regions, i.e., the $k/2$-neighborhood of a sparse vertex might contain dense vertices. For that purpose, in phase $i$ of our algorithm, each vertex (either sparse or dense) should obtain a subset of its closest $O(\sqrt{n})$ vertices in its $2^i$-neighborhood. Limiting the amount of collected information is important for being able to route this information via Lenzen's algorithm [18] in $O(1)$ rounds in each phase. Another technicality concerns the fact that the relation "$u$ is in the $\sqrt{n}$ nearest vertices to $v$" is not necessarily symmetric. This entails a problem where a given vertex $u$ is "close" (footnote 4) to many vertices $w$, while none of these vertices is close to $u$. In case these vertices $w$ need to receive the information from $u$ regarding its closest neighbors (i.e., where some of their close vertices are close to $u$), $u$ ends up sending too many messages in a single phase. To overcome this, we carefully set the growth of the radius of the collected neighborhood in the graph exponentiation algorithm. We let only vertices that are close to each other exchange their topology information and show that this is sufficient for computing the $G_{k/2}(u)$ subgraphs. This procedure is the basis for our constructions as explained next.

Handling the Sparse Region. The idea is to let every sparse vertex $u$ locally simulate a LOCAL spanner algorithm on its subgraph $G_{k/2}(u)$. For that purpose, we show that the deterministic spanner algorithm of [10], which takes $k$ rounds in general, in fact requires only $k/2$ rounds when run by a sparse vertex $u$. At the end of these $k/2$ rounds, for each spanner edge $(u, v)$, at least one of the endpoints knows that this edge is in the spanner. This implies that the subgraph $G_{k/2}(u)$ contains all the information needed for $u$ to locally simulate the spanner algorithm. This seemingly harmless approach has a subtle defect.
Letting only the sparse vertices locally simulate a spanner algorithm might lead to a case where a certain edge $(u, v)$ is not added by a sparse vertex due to a decision made by a dense vertex $w$ in the local simulation of $u$ in $G_{k/2}(u)$. Since $w$ is a dense vertex, it did not run the algorithm locally and hence is not aware of adding these edges (footnote 5). To overcome this, the sparse vertices notify the dense vertices about the edges added in their local simulations. We show how to do it in $O(1)$ rounds.

Handling the Dense Region. In the following, we settle for stretch of $(2k+1)$ for ease of description. By applying the topology collecting procedure, every dense vertex $v$ computes the set $N_{k/2}(v)$ consisting of its closest $\Theta(\sqrt{n})$ vertices within distance $k/2$. The main benefit in computing these $N_{k/2}(v)$ sets is that it allows the dense vertices to "skip" over the first $k/2-1$ phases of Baswana-Sen, ready to apply the $(k/2)$th phase. As described earlier, picking the centers of the clusters can be done by computing a hitting set for the set $\mathcal{S} = \{N_{k/2}(v) \mid v \in V_{dense}\}$. It is easy to construct a random subset $Z \subseteq V$ of cardinality $O(n^{1/2})$ that hits all these sets and to cluster all the dense vertices around this set $Z$. This creates clusters of strong diameter $k$ (in the spanner) that cover all the dense vertices. The final step connects each pair of adjacent clusters by adding to the spanner a single edge between each such pair; this adds $|Z|^2 = O(n)$ edges to the spanner.

Hitting Sets with Short Seed. The description above used a randomized solution to the following hitting set problem: given $n$ subsets of vertices $S_1, \ldots, S_n$, each with $|S_i| \geq \Delta$, find a small set $Z$ that intersects all the sets $S_i$. A simple randomized solution is to choose each node $v$ to be in $Z$ with probability $p = O(\log n/\Delta)$. The standard approach for derandomization is by using distributions with limited independence.
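For reference, the fully independent version of this sampling can be checked by a standard calculation. The sketch below fixes an illustrative constant $c > 1$ (an assumption, not a value taken from the paper), sets $p = c \ln n/\Delta$, and assumes $\Delta \geq c \ln n$ so that $p \leq 1$:

```latex
\Pr[S_i \cap Z = \emptyset] = (1-p)^{|S_i|} \le e^{-p\Delta} = n^{-c},
\qquad
\Pr[\exists i : S_i \cap Z = \emptyset] \le n \cdot n^{-c} = n^{1-c},
\qquad
\mathbb{E}[|Z|] = pn = \frac{c \, n \ln n}{\Delta}.
```

The first bound uses $|S_i| \geq \Delta$ and $1 - p \leq e^{-p}$, the second is a union bound over the $n$ sets, and the third is linearity of expectation over the $n$ vertices. Under these assumptions $Z$ hits every set with probability at least $1 - n^{1-c}$ and has expected size $O(n \log n/\Delta)$.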
Indeed, for the randomized solution to hold, it is sufficient to sample the elements from a $\log n$-wise independent distribution. However, sampling an element with probability $p = O(\log n/\Delta)$ requires roughly $\log n$ random bits, leading to a total seed length of $O(\log^2 n)$, which is too large for our purposes.

Footnote 4: By close we mean being among the $\sqrt{n}$ nearest vertices.

Footnote 5: If we "add" one more round and simulate $k/2+1$ rounds, then there is no such problem as both endpoints of a spanner edge know that the edge is in the spanner. However, we could only collect the information up to radius $k/2$.

Our key observation is that for any set $S_i$ the event that $S_i \cap Z \neq \emptyset$ can be expressed by a read-once DNF formula. Thus, in order to get a short seed it suffices to have a pseudorandom generator (PRG) that can "fool" read-once DNFs. A PRG is a function that gets a short random seed and expands it to a long one which is indistinguishable from a random seed of the same length for such a formula. Luckily, such PRGs with seed length of $O(\log n \cdot (\log\log n)^3)$ exist due to Gopalan et al. [15], leading to a deterministic hitting-set algorithm with $O((\log\log n)^3)$ rounds.

Graph Notations. For a vertex $v \in V(G)$, a subgraph $G'$ and an integer $\ell \in \{1, \ldots, n\}$, let $\Gamma_\ell(v, G') = \{u \mid \mathrm{dist}(u, v, G') \leq \ell\}$. When $\ell = 1$, we omit it and simply write $\Gamma(v, G')$; also, when the subgraph $G'$ is clear from the context, we omit it and write $\Gamma_\ell(v)$. For a subset $V' \subseteq V$, let $G[V']$ be the induced subgraph of $G$ on $V'$. Given disjoint subsets of vertices $C, C'$, let $E(C, C', G) = \{(u, v) \in E(G) \mid u \in C \text{ and } v \in C'\}$. We say that $C$ and $C'$ are adjacent if $E(C, C', G) \neq \emptyset$. Also, for $v \in V$, let $E(v, C, G) = \{(u, v) \in E(G) \mid u \in C\}$. A vertex $v$ is incident to a subset $C$ if $E(v, C, G) \neq \emptyset$.

Road-Map. Section 2 presents algorithm NearestNeighbors to collect the topology of nearby vertices. At the end of this section, using this collected topology, the graph is partitioned into sparse and dense subgraphs.
Section 3 describes the spanner construction for the sparse regime. Section 4 considers the dense regime and is organized as follows. First, Section 4.1 describes a deterministic spanner construction given a hitting-set algorithm as a black box. Then, Section 5 fills in this missing piece and shows deterministic constructions of small hitting sets via derandomization. Finally, Section 5.3 provides an alternative deterministic construction, with improved runtime but larger stretch.

2 Collecting Topology of Nearby Neighborhood

For simplicity of presentation, assume that $k$ is even; for odd $k$, we replace the term $(k/2-1)$ with $\lfloor k/2 \rfloor$. In addition, we assume $k \geq 6$. Note that randomized constructions with $O(k)$ rounds are known and hence one benefits from an $O(\log k)$ algorithm only for a non-constant $k$. In the full version, we show the improved deterministic constructions for $k \in \{2, 3, 4, 5\}$.

2.1 Computing Nearest Vertices in the $(k/2-1)$ Neighborhoods

In this subsection, we present an algorithm that computes the $n^{1/2-1/k}$ nearest vertices within distance $k/2-1$ for every vertex $v$. This provides the basis for the subsequent procedures presented later on. Unfortunately, computing the nearest vertices of each vertex might require many rounds when the maximum degree is $\Delta = \Omega(\sqrt{n})$. In particular, using Lenzen's routing (footnote 6) [18], in the congested clique model, the vertices can learn their 2-neighborhoods in $O(1)$ rounds when the maximum degree is bounded by $O(\sqrt{n})$. Consider a vertex $v$ that is incident to a heavy vertex $u$ (of degree at least $\Omega(\sqrt{n})$). Clearly $v$ has $\Omega(n^{1/2-1/k})$ vertices at distance 2, but it is not clear how $v$ can learn their identities. Although $v$ is capable of receiving $O(n^{1/2-1/k})$ messages, the heavy neighbor $u$ might need to send $n^{1/2-1/k}$ messages to each of its neighbors, thus $\Omega(n^{3/2-1/k})$ messages in total. To avoid this, we compute the $n^{1/2-1/k}$ nearest vertices in a lighter subgraph $G_{light}$ of $G$ with maximum degree $\sqrt{n}$.
The neighbors of heavy vertices might not learn their 2-neighborhood and would be handled slightly differently in Section 4.

Footnote 6: Lenzen's routing can be viewed as an $O(1)$-round algorithm applied when each vertex $v$ is a target and a sender of $O(n)$ messages.

Definition 4. A vertex $v$ is heavy if $\deg(v, G) \geq \sqrt{n}$; the set of heavy vertices is denoted by $V_{heavy}$. Let $G_{light} = G[V \setminus V_{heavy}]$.

Definition 5. For each vertex $u \in V(G_{light})$ define $N_{k/2-1}(u)$ to be the set of $y(u) = \min\{n^{1/2-1/k}, |\Gamma_{k/2-1}(u, G_{light})|\}$ closest vertices at distance at most $(k/2-1)$ from $u$ (breaking ties based on IDs) in $G_{light}$. Define $T_{k/2-1}(u)$ to be the truncated BFS tree rooted at $u$ consisting of the $u$-$v$ shortest path in $G_{light}$, for every $v \in N_{k/2-1}(u)$.

Lemma 6. There exists a deterministic algorithm NearestNeighbors that, within $O(\log k)$ rounds, computes the truncated BFS tree $T_{k/2-1}(u)$ for each vertex $u \in V(G_{light})$. That is, after running Alg. NearestNeighbors, each $u \in V(G_{light})$ knows the entire tree $T_{k/2-1}(u)$.

Algorithm NearestNeighbors. For every integer $j \geq 0$, we say that a vertex $u$ is $j$-sparse if $|\Gamma_j(u, G_{light})| \leq n^{1/2-1/k}$; otherwise we say it is $j$-dense. The algorithm starts by having each non-heavy vertex compute $\Gamma_2(u, G_{light})$ in $O(1)$ rounds using Lenzen's algorithm. This is the only place where it is important that we work on $G_{light}$ rather than on $G$. Next, in each phase $i \geq 1$, vertex $u$ collects information on vertices in its $\gamma(i+1)$-ball in $G_{light}$, where $\gamma(1) = 2$ and $\gamma(i+1) = \min\{2\gamma(i)-1, k/2\}$ for every $i \in \{1, \ldots, \lceil \log(k/2) \rceil\}$. At phase $i \in \{1, \ldots, \lceil \log(k/2) \rceil\}$ the algorithm maintains the invariant that a vertex $u$ holds a partial BFS tree $\widehat{T}_i(u)$ in $G_{light}$ consisting of the vertices $\widehat{N}_i(u) := V(\widehat{T}_i(u))$, such that:
(I1) For a $\gamma(i)$-sparse vertex $u$, $\widehat{N}_i(u) = \Gamma_{\gamma(i)}(u)$.
(I2) For a $\gamma(i)$-dense vertex $u$, $\widehat{N}_i(u)$ consists of the closest $n^{1/2-1/k}$ vertices to $u$ in $G_{light}$.
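The definitions above can be made concrete with a small centralized reference sketch. It is not the distributed algorithm NearestNeighbors; it only computes the radius schedule and the sets of Definitions 4 and 5 by plain BFS. The function names and the toy graph are hypothetical, the name `gamma` for the radius follows the reconstruction used here, and rounding $n^{1/2-1/k}$ up to an integer is an assumption made so that the limit can be used as a count.

```python
import math
from collections import deque


def radius_schedule(k):
    """Radii gamma(1), gamma(2), ... with gamma(i+1) = min(2*gamma(i) - 1, k/2)."""
    radii = [2]
    while radii[-1] < k // 2:
        radii.append(min(2 * radii[-1] - 1, k // 2))
    return radii


def light_subgraph(adj):
    """Definition 4: drop every vertex of degree at least sqrt(n)."""
    threshold = math.sqrt(len(adj))
    heavy = {v for v, nbrs in adj.items() if len(nbrs) >= threshold}
    light = {v: nbrs - heavy for v, nbrs in adj.items() if v not in heavy}
    return light, heavy


def nearest_vertices(light, u, radius, limit):
    """Definition 5: the `limit` closest vertices within `radius` of u, ties by ID."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] == radius:
            continue  # do not explore beyond the truncation radius
        for y in light[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return sorted(dist, key=lambda v: (dist[v], v))[:limit]


# Toy graph (hypothetical): a star centered at 0 plus the path 6-7-8-9-10.
n, k = 16, 6
adj = {v: set() for v in range(n)}
for a, b in [(0, leaf) for leaf in range(1, 6)] + [(a, a + 1) for a in range(6, 10)]:
    adj[a].add(b)
    adj[b].add(a)

light, heavy = light_subgraph(adj)
limit = math.ceil(n ** (0.5 - 1.0 / k))  # assumption: round n^(1/2-1/k) up
nearest = nearest_vertices(light, 8, k // 2 - 1, limit)
```

In this toy instance vertex 0 has degree 5, which is at least $\sqrt{16} = 4$, so it is the only heavy vertex, and the three nearest vertices of 8 within distance 2 are 8, 7 and 9. For $k = 20$ the schedule is 2, 3, 5, 9, 10, which illustrates why roughly $\log(k/2)$ phases suffice.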
Note that in order to maintain the invariant in phase $(i+1)$, it is only required that in phase $i$ the $\gamma(i)$-sparse vertices collect the relevant information, as for the $\gamma(i)$-dense vertices it already holds that $\widehat{N}_{i+1}(u) = \widehat{N}_i(u)$. In phase $i$, each vertex $v$ (regardless of being sparse or dense) sends its partial BFS tree $\widehat{T}_i(v)$ to each vertex $u$ only if (1) $u \in \widehat{N}_i(v)$ and (2) $v \in \widehat{N}_i(u)$. This condition can be easily checked in a single round, as every vertex $u$ can send a message to all the vertices in its set $\widehat{N}_i(u)$. Let $\widehat{N}'_{i+1}(u) = \bigcup_{v \in \widehat{N}_i(u) \,\mid\, u \in \widehat{N}_i(v)} \widehat{N}_i(v)$ be the union of all the $\widehat{N}_i$ sets received at vertex $u$. Vertex $u$ then uses the distances to $\widehat{N}_i(u)$, and the received distances to the vertices in the $\widehat{N}_i$ sets, to compute the shortest-path distance to each $w \in \widehat{N}_i(v)$. As a result it computes the partial tree $\widehat{T}_{i+1}(u)$. The subset $\widehat{N}_{i+1}(u) \subseteq \widehat{N}'_{i+1}(u)$ consists of the (at most $n^{1/2-1/k}$) vertices within distance $\gamma(i+1)$ from $u$. This completes the description of phase $i$. We next analyze the algorithm and show that each phase can be implemented in $O(1)$ rounds and that the invariant on the $\widehat{T}_i(u)$ trees is maintained.

Analysis. We first show that phase $i$ can be implemented in $O(1)$ rounds. Note that by definition, $|\widehat{N}_i(u)| \leq \sqrt{n}$ for every $u$ and every $i \geq 1$. Hence, by the condition of phase $i$, each vertex sends $O(n)$ messages and receives $O(n)$ messages, which can be done in $O(1)$ rounds using Lenzen's routing algorithm [18].

We show that the invariant holds by induction on $i$. Since all vertices first collected their second neighborhood, the invariant holds (footnote 7) for $i = 1$. Assume it holds up to the beginning of phase $i$; we now show that it holds in the beginning of phase $i+1$. If $u$ is $\gamma(i)$-dense, then $u$ should not collect any further information in phase $i$ and the assertion holds trivially.

Footnote 7: This is the reason why we consider only $G_{light}$, as otherwise $\gamma(1) = 0$ and we would not have any progress.
Figure 1: Shown is a path $P$ between $u$ and $w$ where $z$ is the first dense vertex on the $\gamma(i)$-length prefix of $P$. If $u \notin \widehat{N}_i(z)$ then $u, w \in \widehat{N}_i(z')$.

Consider a $\gamma(i)$-sparse vertex $u$ and let $N_{\gamma(i+1)}(u)$ be the target set of the $n^{1/2-1/k}$ closest vertices at distance at most $\gamma(i+1)$ from $u$. We fix $w \in N_{\gamma(i+1)}(u)$ and show that $w \in \widehat{N}_{i+1}(u)$ and, in addition, that $u$ has computed the shortest path to $w$ in $G_{light}$. Let $P$ be a $u$-$w$ shortest path in $G_{light}$. If all vertices $z$ on the $\gamma(i)$-length prefix of $P$ are $\gamma(i)$-sparse, then the claim holds as $z \in \widehat{N}_i(u)$, $u \in \widehat{N}_i(z)$, and $w \in \widehat{N}_i(z')$, where $z'$ is the last vertex on the $\gamma(i)$-length prefix of $P$. Hence, by the induction assumption for the $\widehat{N}_i$ sets, $u$ can compute in phase $i$ its shortest path to $w$.

We next consider the remaining case where not all the vertices on the $\gamma(i)$-length prefix are sparse. Let $z \in \widehat{N}_i(u)$ be the first $\gamma(i)$-dense vertex (closest to $u$) on the $\gamma(i)$-length prefix of $P$. Observe that $w \in \widehat{N}_i(z)$. Otherwise, $\widehat{N}_i(z)$ contains $n^{1/2-1/k}$ vertices that are closer to $z$ than $w$, which implies that these vertices are also closer to $u$ than $w$, and hence $w$ should not be in $N_{\gamma(i+1)}(u)$ (as it is not among the closest $n^{1/2-1/k}$ vertices to $u$), leading to a contradiction. Thus, if also $u \in \widehat{N}_i(z)$, then $z$ sends to $u$ in phase $i$ its shortest path to $w$. By the induction assumption for the $\widehat{N}_i(u)$ and $\widehat{N}_i(z)$ sets, we have that $u$ has the entire shortest path to $w$.

It remains to consider the case where the first $\gamma(i)$-dense vertex $z$ on $P$ does not contain $u$ in its $\widehat{N}_i(z)$ set, and hence did not send its information on $w$ to $u$ in phase $i$. Denote $x = \mathrm{dist}(u, z, G_{light})$ and $y = \mathrm{dist}(z, w, G_{light})$; thus $x + y = |P| \leq 2\gamma(i) - 1$. Since $w \in \widehat{N}_i(z)$ but $u \notin \widehat{N}_i(z)$, we have that $y \leq x$ and $2y \leq |P|$, which implies that $y \leq \gamma(i) - 1$. Let $z'$ be the vertex preceding $z$ on the path $P$; hence $z'$ also appears on the $\gamma(i)$-length prefix of $P$ and $z' \in \widehat{N}_i(u)$. By definition, $z'$ is $\gamma(i)$-sparse and it also holds that $u \in \widehat{N}_i(z')$. Since $\mathrm{dist}(z', w, G_{light}) = y + 1 \leq \gamma(i)$, it holds that $w \in \widehat{N}_i(z')$. Thus, $u$ can compute the $u$-$w$ shortest path using the $z'$-$w$ shortest path it has received from $z'$. For an illustration, see Figure 1.

2.2 Dividing $G$ into Sparse and Dense Regions

During the execution of NearestNeighbors every non-heavy vertex $v$ computes the set $N_{k/2-1}(v)$ and the corresponding tree $T_{k/2-1}(v)$. The vertices are next divided into dense vertices $V_{dense}$ and sparse vertices $V_{sparse}$. Roughly speaking, the dense vertices are those that have at least $n^{1/2-1/k}$ vertices at distance at most $k/2-1$ in $G$. Since the subsets of nearest neighbors are computed in $G_{light}$ rather than in $G$, this vertex division is more delicate.

Definition 7. A vertex $v$ is dense if either (1) it is heavy, (2) it is a neighbor of a heavy vertex, or (3) $|\Gamma_{k/2-1}(v, G_{light})| > n^{1/2-1/k}$. Otherwise, the vertex is sparse. Let $V_{dense}$, $V_{sparse}$ be the dense (resp., sparse) vertices in $V$.

Observation 8. For $k \geq 6$, for every dense vertex $v$ it holds that $|\Gamma_{k/2-1}(v, G)| \geq n^{1/2-1/k}$.

The edges of $G$ are partitioned into:
$E_{dense} = (V_{dense} \times V_{dense}) \cap E(G)$, $E_{sparse} = (V_{sparse} \times V) \cap E(G)$.
Since all the neighbors of heavy vertices are dense, it also holds that $E_{sparse} = (V_{sparse} \times (V \setminus V_{heavy})) \cap E(G_{light})$.

Overview of the Spanner Constructions. The algorithm contains two subprocedures: the first takes care of the sparse edge set by constructing a spanner $H_{sparse} \subseteq G_{sparse}$ and the second takes care of the dense edge set by constructing $H_{dense} \subseteq G$. Specifically, these spanners will satisfy that for every $e = (u, v) \in G_i$, $\mathrm{dist}(u, v, H_i) \leq 2k-1$ for $i \in \{sparse, dense\}$. We note that the spanner $H_{dense}$ is a subgraph of $G$ rather than being contained in $G_{dense}$. The reason is that the spanner $H_{dense}$ might contain edges incident to sparse vertices, as will be shown later. The computation of the spanner $H_{sparse}$ for the sparse edges, $E_{sparse}$, is done by letting each sparse vertex locally simulate a local spanner algorithm.
The computation of $H_{dense}$ is based on applying two levels of clustering as in Baswana-Sen. The selection of the cluster centers will be made by applying a hitting-set algorithm.

3 Handling the Sparse Subgraph

In this section, we construct the spanner $H_{sparse}$ that will provide a bounded stretch for the sparse edges. As we will see, the topology collected by applying Alg. NearestNeighbors allows every sparse vertex to locally simulate a deterministic spanner algorithm in its collected subgraph, and to decide which of its edges to add to the spanner based on this local view.

Recall that for every sparse vertex $v$ it holds that $|\Gamma_{k/2-1}(v, G_{light})| \leq n^{1/2-1/k}$, where $G_{light} = G[V \setminus V_{heavy}]$, and that $E_{sparse} = (V_{sparse} \times V) \cap E(G)$. Let $G_{sparse}(u) = G_{sparse}[\Gamma_{k/2-1}(u, G)]$. By applying Alg. NearestNeighbors, and letting sparse vertices send their edges to the sparse vertices in their $(k/2-1)$ neighborhoods in $G_{light}$, we have:

Claim 9. There exists an $O(\log k)$-round deterministic algorithm that computes for each sparse vertex $v$ its subgraph $G_{sparse}(v)$.

Our algorithm is based on an adaptation of the local algorithm of [10], which is shown to satisfy the following in our context. The proof is in the full version [21].

Lemma 10. There exists a deterministic algorithm LocalSpanner that constructs a $(k-3)$ spanner in the LOCAL model, such that every sparse vertex $u$ decides about its spanner edges within $k/2-1$ rounds. In particular, $u$ can simulate Alg. LocalSpanner locally on $G_{sparse}$, and for every edge $(u, z)$ not added to the spanner $H_{sparse}$, there is a path of length at most $(k-3)$ in $G_{sparse}(u) \cap H_{sparse}$.

A useful property of the algorithm (footnote 8) by Derbel et al. (Algorithm 1 in [10]) is that if a vertex $v$ did not terminate after $i$ rounds, then it must hold that $|\Gamma_i(v, G)| \geq n^{i/k}$. Thus in our context, every sparse vertex terminates after at most $k/2-1$ rounds (footnote 9).
We also show that for simulating these $(k/2-1)$ rounds of Alg. LocalSpanner by $u$, it is sufficient for $u$ to know all the neighbors of its $(k/2-2)$ neighborhood in $G_{sparse}$, and these edges are contained in $G_{sparse}(u)$. The analysis of Lemma 10 appears in the full version of the paper.

Footnote 8: This algorithm works only for unweighted graphs and hence our deterministic algorithms are for unweighted graphs. Currently, there are no local deterministic algorithms for weighted graphs.

Footnote 9: By definition we have that $|\Gamma_{k/2-1}(u, G_{light})| \leq n^{1/2-1/k}$. Moreover, since $G_{sparse} \subseteq G_{light}$ it also holds that $|\Gamma_{k/2-1}(u, G_{sparse})| \leq n^{1/2-1/k}$.

We next describe Alg. SpannerSparseRegion that computes $H_{sparse}$. Every vertex $u$ computes $G_{sparse}(u)$ in $O(\log k)$ rounds and simulates Alg. LocalSpanner in that subgraph. Let $H_{sparse}(u)$ be the edges added to the spanner in the local simulation of Alg. LocalSpanner in $G_{sparse}(u)$. A sparse vertex $u$ sends to each sparse vertex $v \in \Gamma_{k/2-1}(u, G_{sparse})$ the set of all $v$-edges in $H_{sparse}(u)$. Hence, each sparse vertex sends $O(n)$ messages (at most $\sqrt{n}$ edges to each of its at most $\sqrt{n}$ vertices in $\Gamma_{k/2-1}(u, G_{sparse})$). In a symmetric manner, every vertex receives $O(n)$ messages, and this step can be done in $O(1)$ rounds using Lenzen's algorithm. The final spanner is given by $H_{sparse} = \bigcup_{u \in V_{sparse}} H_{sparse}(u)$. The stretch argument is immediate by the correctness of Alg. LocalSpanner and the fact that all the edges added to the spanner in the local simulations are indeed added to $H_{sparse}$. The size argument is also immediate since we only add edges that Alg. LocalSpanner would have added when run on the entire graph.

Algorithm SpannerSparseRegion (Code for a sparse vertex $u$)
1. Apply Alg. NearestNeighbors to compute $G_{sparse}(u)$ for each sparse vertex $u$.
2. Locally simulate Alg. LocalSpanner in $G_{sparse}(u)$ and let $H_{sparse}(u)$ be the edges added to the spanner in $G_{sparse}(u)$.
3. Send the edges of $H_{sparse}(u)$ to the corresponding sparse endpoints.
4. Add the received edges to the spanner $H_{sparse}$.

4 Handling the Dense Subgraph

In this section, we construct the spanner $H_{dense}$ satisfying that $\mathrm{dist}(u, v, H_{dense}) \leq 2k-1$ for every $(u, v) \in E_{dense}$. In this case, since the $(k/2-1)$ neighborhood of each dense vertex is large, there exists a small hitting set that covers all these neighborhoods. The structure of our arguments is as follows. First, we describe a deterministic construction of $H_{dense}$ using a hitting-set algorithm as a black box. This would immediately imply a randomized spanner construction in $O(\log k)$ rounds. Then in Section 5, we fill in this last missing piece and show deterministic constructions of hitting sets.

Constructing spanner for the dense subgraph via hitting sets. Our goal is to cluster all dense vertices into a small number of low-depth clusters. This translates into the following hitting-set problem defined in [5, 28, 13]: Given a subset $V' \subseteq V$ and a set collection $\mathcal{S} = \{S(v) \mid v \in V'\}$ where each $|S(v)| \geq \Delta$ and $\bigcup_{v \in V'} S(v) \subseteq V''$, compute a subset $Z \subseteq V''$ of cardinality $O(|V''| \log n/\Delta)$ that intersects (i.e., hits) each subset $S \in \mathcal{S}$. A hitting set of size $O(|V''| \log n/\Delta)$ is denoted as a small hitting set. We prove the next lemma by describing the construction of the spanner $H_{dense}$ given an algorithm $\mathcal{A}$ that computes small hitting sets. In Section 5, we complement this lemma by describing several constructions of hitting sets.

Let $G = (V, E)$ be an $n$-vertex graph, and let $\Delta \in [n]$ be a parameter. Let $V'$ be a subset of nodes such that each node $u \in V'$ knows a set $S_u$ where $|S_u| \geq \Delta$. Let $\mathcal{S} = \{S_u \subseteq V : u \in V'\}$ and suppose that $V''$ is such that $\bigcup S_u \subseteq V''$.

Lemma 11. Given an algorithm $\mathcal{A}$ for computing a small hitting set in $r_{\mathcal{A}}$ rounds, there exists a deterministic algorithm SpannerDenseRegion for constructing the $(2k-1)$ spanner $H_{dense}$ within $O(\log k + r_{\mathcal{A}})$ rounds.

The next definition is useful in our context.

$\ell$-depth Clustering.
A cluster is a subset of vertices, and a clustering $\mathcal{C} = \{C_1, \ldots, C_\ell\}$ consists of vertex-disjoint subsets. For a positive integer $\ell$, a clustering $\mathcal{C}$ is an $\ell$-depth clustering if for each cluster $C \in \mathcal{C}$, the graph $G$ contains a tree of depth at most $\ell$ rooted at the cluster center of $C$ and spanning all its vertices.

4.1 Description of Algorithm SpannerDenseRegion

The algorithm is based on clustering the dense vertices in two levels of clustering, in a Baswana-Sen like manner. The first clustering $\mathcal{C}_1$ is a $(k/2-1)$-depth clustering covering all the dense vertices. The second clustering $\mathcal{C}_2$ is a $(k/2)$-depth clustering that covers only a subset of the dense vertices. For $k$ odd, let $\mathcal{C}_2$ be equal to $\mathcal{C}_1$.

Defining the first level of clustering. Recall that by running Algorithm NearestNeighbors, every non-heavy vertex $v \in G_{light}$ knows the set $N_{k/2-1}(v)$ containing its $n^{1/2-1/k}$ nearest neighbors in $\Gamma_{k/2-1}(v, G_{light})$. For every heavy vertex $v$, let $N_{k/2-1}(v) = \Gamma(v, G)$. Let $V_{nh}$ be the set of all non-heavy vertices that are neighbors of heavy vertices. By definition, $V_{nh} \subseteq V_{dense}$. Note that for every dense vertex $v \in V_{dense} \setminus V_{nh}$, it holds that $|N_{k/2-1}(v)| \ge n^{1/2-1/k}$. The vertices $u$ of $V_{nh}$ are in $G_{light}$ and hence have computed the set $N_{k/2-1}(u)$; however, there is no guarantee on the size of these sets.

To define the clustering of the dense vertices, Algorithm SpannerDenseRegion applies the hitting-set algorithm $A$ on the subsets $\mathcal{S}_1 = \{N_{k/2-1}(v) \mid v \in V_{dense} \setminus V_{nh}\}$ and the universe $V$. Since every set in $\mathcal{S}_1$ has size at least $\Delta := n^{1/2-1/k}$, the output of algorithm $A$ is a subset $Z_1$ of cardinality $O(n^{1/2+1/k} \log n)$ that hits all the sets in $\mathcal{S}_1$.

We now construct the clusters in $\mathcal{C}_1$ with $Z_1$ as the cluster centers. To make sure that the clusters are vertex-disjoint and connected, we first compute the clustering in the subgraph $G_{light}$, and then cluster the remaining dense vertices that are not yet clustered. For every $v \in G_{light}$ (either dense or sparse), we say that $v$ is clustered if $Z_1 \cap N_{k/2-1}(v) \ne \emptyset$. In particular, every dense vertex $v$ for which $|\Gamma_{k/2-1}(v, G_{light})| \ge n^{1/2-1/k}$ is clustered (the neighbors of heavy vertices are either clustered or not). For every clustered vertex $v \in G_{light}$ (i.e., even sparse ones), let $c_1(v)$, called hereafter the cluster center of $v$, be the closest vertex to $v$ in $Z_1 \cap N_{k/2-1}(v)$, breaking shortest-path ties based on IDs. Since $v$ knows the entire tree $T_{k/2-1}(v)$, it knows the distance to all the vertices in $N_{k/2-1}(v)$ and, in addition, it can compute its next hop $p(v)$ on the $v$-$c_1(v)$ shortest path in $G_{light}$. Each clustered vertex $v \in G_{light}$ adds the edge $(v, p(v))$ to the spanner $H_{dense}$. It is easy to see that this defines a $(k/2-1)$-depth clustering in $G_{light}$ that covers all dense vertices in $G_{light}$. In particular, each cluster $C$ has in the spanner a tree of depth at most $(k/2-1)$ that spans all the vertices in $C$. Note that in order for the clusters $C$ to be connected in $H_{dense}$, it was crucial that all vertices in $G_{light}$, and not only the dense ones, compute their cluster centers in $N_{k/2-1}(v)$, if such exist.

We next turn to cluster the remaining dense vertices. For every heavy vertex $v$, let $c_1(v)$ be its closest vertex in $\Gamma(v, G) \cap Z_1$. The vertex $v$ then adds the edge $(v, c_1(v))$ to the spanner $H_{dense}$ and broadcasts its cluster center $c_1(v)$ to all its neighbors. Every neighbor $u$ of a heavy vertex $v$ that is not yet clustered joins the cluster of $c_1(v)$ and adds the edge $(u, v)$ to the spanner. Overall, the clusters of $\mathcal{C}_1$ centered at the subset $Z_1$ cover all the dense vertices. In addition, all the vertices in a cluster $C$ are connected in $H_{dense}$ by a tree of depth $k/2-1$. Formally, $\mathcal{C}_1 = \{C_1(s) \mid s \in Z_1\}$ where $C_1(s) = \{v \mid c_1(v) = s\}$.

Defining the second level of clustering. Every vertex $v$ that is clustered in $\mathcal{C}_1$ broadcasts its cluster center $c_1(v)$ to all its neighbors. This allows every dense vertex $v$ to compute the subset $N_{k/2}(v) = \{s \in Z_1 \mid E(v, C_1(s), G) \ne \emptyset\}$ consisting of the centers of its adjacent clusters in $\mathcal{C}_1$.
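The first-level assignment in $G_{light}$ and the sets $N_{k/2}(v)$ can be illustrated by the following centralized sketch. It is not the distributed implementation: as a simplifying assumption, each vertex explores its full radius-$r$ ball by BFS instead of holding only its nearest-neighbor set, heavy vertices are ignored, and the names `bfs_ball`, `assign_centers` and `adjacent_centers` are hypothetical.

```python
from collections import deque

def bfs_ball(adj, src, radius):
    """Return {vertex: (distance, parent)} for all vertices within `radius` hops of src."""
    info = {src: (0, None)}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        d = info[x][0]
        if d == radius:
            continue  # do not explore beyond the ball
        for y in sorted(adj[x]):
            if y not in info:
                info[y] = (d + 1, x)
                queue.append(y)
    return info

def assign_centers(adj, centers, radius):
    """Assign each vertex its closest center within `radius` (ties broken by ID)
    and add the edge to the next hop on a shortest path towards that center."""
    center_of, spanner = {}, set()
    for v in adj:
        ball = bfs_ball(adj, v, radius)
        hits = [(ball[z][0], z) for z in centers if z in ball]
        if not hits:
            continue  # v is not clustered
        _, c = min(hits)
        center_of[v] = c
        # Walk the BFS tree from c back towards v; the last vertex before v is p(v).
        x = c
        while x != v and ball[x][1] != v:
            x = ball[x][1]
        if x != v:
            spanner.add(frozenset((v, x)))
    return center_of, spanner

def adjacent_centers(adj, center_of, v):
    """Centers of the first-level clusters that contain a neighbor of v."""
    return {center_of[u] for u in adj[v] if u in center_of}
```

On a path with centers at both ends and radius 2, for instance, the middle vertex stays unclustered but sees both centers through its clustered neighbors.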
Consider two cases depending on the cardinality of $N_{k/2}(v)$. Every vertex $v$ with $|N_{k/2}(v)| \le n^{1/k} \log n$ adds to the spanner $H_{dense}$ an arbitrary edge in $E(v, C_1(s), G)$ for every $s \in N_{k/2}(v)$. It remains to handle the remaining vertices $V'_{dense} = \{v \in V_{dense} \mid |N_{k/2}(v)| > n^{1/k} \log n\}$. These vertices are clustered in the second level of clustering $\mathcal{C}_2$. To compute the centers of the clusters in $\mathcal{C}_2$, the algorithm applies the hitting-set algorithm $A$ on the collection of subsets $\mathcal{S}_2 = \{N_{k/2}(v) \mid v \in V'_{dense}\}$ with $\Delta = n^{1/k} \log n$ and $V'' = Z_1$. The output of $A$ is a subset $Z_2$ of cardinality $O(|Z_1| \log n/\Delta) = O(\sqrt{n} \log n)$ that hits all the subsets in $\mathcal{S}_2$. The second cluster center $c_2(v)$ of a vertex $v \in V'_{dense}$ is chosen to be an arbitrary $s \in N_{k/2}(v) \cap Z_2$. The vertex $v$ then adds some edge $(v, u) \in E(v, C_1(s), G)$ to the spanner $H_{dense}$. Hence, the trees rooted at $s \in Z_2$ are now extended by one additional layer, resulting in a $(k/2)$-depth clustering.

Connecting adjacent clusters. Finally, the algorithm adds to the spanner $H_{dense}$ a single edge between each pair of adjacent clusters $(C, C') \in \mathcal{C}_1 \times \mathcal{C}_2$. This can be done in $O(1)$ rounds as follows. Each vertex broadcasts its cluster ID in $\mathcal{C}_2$. Every vertex $v \in C$, for every cluster $C \in \mathcal{C}_1$, picks one incident edge to each cluster $C' \in \mathcal{C}_2$ (if such exists) and sends this edge to the center of $C'$ in $\mathcal{C}_2$. Since a vertex sends at most one message to each cluster center in $\mathcal{C}_2$, this can be done in $O(1)$ rounds. Each cluster center $r$ of a cluster $C'$ in $\mathcal{C}_2$ picks one representative edge among the edges it has received for each cluster $C \in \mathcal{C}_1$ and sends a notification about the selected edge to the endpoint of the edge in $C$. Since the cluster center sends at most one edge to every vertex, this takes one round. Finally, the vertices in the clusters $C \in \mathcal{C}_1$ add the notified edges (that they received from the centers of $\mathcal{C}_2$) to the spanner. This completes the description of the algorithm.
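The second-level step and the connection of adjacent clusters can be sketched centrally as follows. This is a minimal illustration under stated assumptions: the cluster maps are plain dictionaries, `threshold` stands for $n^{1/k} \log n$, `z2` is assumed to already hit every large set $N_{k/2}(v)$, and the function names are hypothetical. The message-passing details of the $O(1)$-round implementation are not modeled.

```python
def second_level(adj, c1, dense, z2, threshold):
    """Case split on |N_{k/2}(v)| for every dense vertex v.

    c1 maps a vertex to its first-level center. Returns (c2, spanner_edges),
    where c2 maps each vertex of V'_dense to its second-level center.
    """
    spanner, c2 = set(), {}
    for v in sorted(dense):
        # One witness neighbor per adjacent first-level cluster.
        witness = {}
        for u in sorted(adj[v]):
            if u in c1:
                witness.setdefault(c1[u], u)
        if len(witness) <= threshold:
            # Few adjacent clusters: keep one edge to each of them.
            spanner.update(frozenset((v, u)) for u in witness.values())
        else:
            # Many adjacent clusters: join one whose center is in z2.
            # Assumes z2 hits N_{k/2}(v); min() raises ValueError otherwise.
            s = min(set(witness) & z2)
            c2[v] = s
            spanner.add(frozenset((v, witness[s])))
    return c2, spanner

def connect_clusters(adj, cluster1, cluster2):
    """Keep one edge for every pair (C, C') in C1 x C2 joined by an edge of G.

    Pairs whose clusters overlap are not filtered out; they contribute
    at most one edge each.
    """
    chosen = {}
    for v in sorted(adj):
        if v not in cluster1:
            continue
        for u in sorted(adj[v]):
            if u in cluster2:
                chosen.setdefault((cluster1[v], cluster2[u]), frozenset((v, u)))
    return set(chosen.values())
```

The dictionary keyed by cluster pairs plays the role of the representative-edge selection made by the second-level centers.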
We now complete the proof of Lemma 11.

Proof. Recall that we assume $k \ge 6$ and thus $|\Gamma_{k/2-1}(v)| \ge n^{1/2-1/k}$ for every $v \in V_{dense}$. We first show that for every $(u, v) \in E_{dense}$, $dist(u, v, H_{dense}) \le 2k-1$. The clustering $\mathcal{C}_1$ covers all the dense vertices. If $u$ and $v$ belong to the same cluster $C$ in $\mathcal{C}_1$, the claim follows as $H_{dense}$ contains a $(k/2-1)$-depth tree that spans all the vertices in $C$, thus $dist(u, v, H_{dense}) \le k-2$. From now on assume that $c_1(u) \ne c_1(v)$.

We first consider the case that both endpoints satisfy $|N_{k/2}(v)|, |N_{k/2}(u)| \le n^{1/k} \log n$. In such a case, since $v$ is adjacent to the cluster $C_1$ of $u$, the algorithm adds to $H_{dense}$ at least one edge in $E(v, C_1, G)$; let it be $(x, v)$. We have that
\[ dist(v, u, H_{dense}) \le dist(v, x, H_{dense}) + dist(x, u, H_{dense}) \le k-1, \]
where the last inequality holds as $x$ and $u$ belong to the same cluster $C_1$ in $\mathcal{C}_1$.

Finally, it remains to consider the case where for at least one endpoint, say $v$, it holds that $|N_{k/2}(v)| > n^{1/k} \log n$. In such a case, $v$ is clustered in $\mathcal{C}_2$. Let $C_1$ be the cluster of $u$ in $\mathcal{C}_1$ and let $C_2$ be the cluster of $v$ in $\mathcal{C}_2$. Since $C_1$ and $C_2$ are adjacent, the algorithm adds an edge in $E(C_1, C_2, G)$; let it be $(x, y)$ where $x, u \in C_1$ and $y, v \in C_2$. We have that
\[ dist(u, v, H_{dense}) \le dist(u, x, H_{dense}) + dist(x, y, H_{dense}) + dist(y, v, H_{dense}) \le 2k-1, \]
where the last inequality holds as $u, x$ belong to the same $(k/2-1)$-depth cluster $C_1$, and $v, y$ belong to the same $(k/2)$-depth cluster $C_2$, so the three terms are at most $k-2$, $1$ and $k$, respectively.

Finally, we bound the size of $H_{dense}$. Since the clusters in $\mathcal{C}_1$, $\mathcal{C}_2$ are vertex-disjoint, the trees spanning these clusters contain $O(n)$ edges. For each vertex that is unclustered in $\mathcal{C}_2$, we add $O(n^{1/k} \log n)$ edges. By the properties of the hitting-set algorithm $A$, it holds that $|Z_1| = O(n^{1/2+1/k} \cdot \log n)$ and $|Z_2| = O(n^{1/2} \cdot \log n)$. Thus, adding one edge between each pair of clusters adds $|Z_1| \cdot |Z_2| = O(n^{1+1/k} \cdot \log^2 n)$ edges. ∎

Putting All Together: Randomized spanners in $O(\log k)$ rounds. We now complete the proof of Theorem 1.
For an edge $(u, v) \in E_{sparse}$, the correctness follows by the correctness of Alg. LocalSpanner. We next consider the dense case. Let $A$ be the algorithm where each $v \in V''$ is added into $Z$ with probability $\log n/\Delta$. By a Chernoff bound, we get that w.h.p. $|Z| = O(|V''| \log n/\Delta)$ and $Z \cap S_i \ne \emptyset$ for every $S_i \in \mathcal{S}$. The correctness follows by applying Lemma 11. ∎

Algorithm SpannerDenseRegion
1. Compute a $(k/2-1)$-depth clustering $\mathcal{C}_1 = \{C(s) \mid s \in Z_1\}$ centered at the subset $Z_1$.
2. For every $v \in V_{dense}$, let $N_{k/2}(v) = \{s \in Z_1 \mid E(v, C_1(s), G) \ne \emptyset\}$.
3. For every $v \in V_{dense}$ with $|N_{k/2}(v)| \le n^{1/k} \log n$, add to the spanner one edge in $E(v, C(s), G)$ for every $s \in N_{k/2}(v)$.
4. Compute a $(k/2)$-depth clustering $\mathcal{C}_2$ centered at $Z_2$ to cover the remaining dense vertices.
5. Connect (in the spanner) each pair of adjacent clusters $(C, C') \in \mathcal{C}_1 \times \mathcal{C}_2$.

5 Derandomization of Hitting Sets

5.1 Hitting Sets with Short Seeds

The main technical part of the deterministic construction is to completely derandomize the randomized hitting-set algorithm using short seeds. We show two hitting-set constructions with different tradeoffs. The first construction is based on pseudorandom generators (PRG) for DNF formulas. The PRG has a seed of length $O(\log n (\log\log n)^3)$. This serves as the basis for the construction of Theorem 2. The second hitting-set construction is based on $O(1)$-wise independence; it uses a small seed of length $O(\log n)$ but yields a larger hitting set. This is the basis for the construction of Theorem 3.

We begin by setting up some notation. For a set $S$ we denote by $x \leftarrow S$ a uniform sampling from $S$. For a function PRG and an index $i$, let $PRG(s)_i$ be the $i$th bit of $PRG(s)$.

Definition 12 (Pseudorandom Generators). A generator $PRG : \{0,1\}^r \to \{0,1\}^n$ is an $\varepsilon$-pseudorandom generator (PRG) for a class $\mathcal{C}$ of Boolean functions if for every $f \in \mathcal{C}$:
\[ \Big| \mathbb{E}_{x \leftarrow \{0,1\}^n}[f(x)] - \mathbb{E}_{s \leftarrow \{0,1\}^r}[f(PRG(s))] \Big| \le \varepsilon. \]
We refer to $r$ as the seed-length of the generator and say PRG is explicit if there is an efficient algorithm to compute PRG that runs in time $poly(n, 1/\varepsilon)$.

Theorem 13. For every $\varepsilon = \varepsilon(n) > 0$, there exists an explicit pseudorandom generator $PRG : \{0,1\}^r \to \{0,1\}^n$ that fools all read-once DNFs on $n$ variables with error at most $\varepsilon$ and seed-length $r = O((\log(n/\varepsilon)) \cdot (\log\log(n/\varepsilon))^3)$.

Using the notation above and Theorem 13, we formulate and prove the following lemma:

Lemma 14. Let $S$ be a subset of $[n]$ where $|S| \ge \Delta$ for some parameter $\Delta \le n$, and let $c$ be any constant. Then, there exists a family of hash functions $H = \{h : [n] \to \{0,1\}\}$ such that choosing a random function from $H$ takes $r = O(\log n \cdot (\log\log n)^3)$ random bits and for $Z_h = \{u \in [n] : h(u) = 0\}$ it holds that: (1) $\Pr_h[|Z_h| \le \tilde{O}(n/\Delta)] \ge 2/3$, and (2) $\Pr_h[S \cap Z_h \ne \emptyset] \ge 1 - 1/n^c$.

Proof. We first describe the construction of $H$. Let $p = c' \log n/\Delta$ for some large constant $c'$ (to be set later), and let $\ell = \lfloor \log 1/p \rfloor$. Let $PRG : \{0,1\}^r \to \{0,1\}^{n\ell}$ be the PRG constructed in Theorem 13 for $r = O(\log n\ell \cdot (\log\log n\ell)^3) = O(\log n \cdot (\log\log n)^3)$ and for $\varepsilon = 1/n^{10c}$. For a string $s$ of length $r$ we define the hash function $h_s(i)$ as follows. First, it computes $y = PRG(s)$. Then, it interprets $y$ as $n$ blocks where each block is of length $\ell$ bits, and outputs 1 if and only if all the bits of the $i$th block are 1. Formally, we define
\[ h_s(i) = \bigwedge_{j=(i-1)\ell+1}^{i\ell} PRG(s)_j. \]
We show that properties 1 and 2 hold for the set $Z_{h_s}$ where $h_s \in H$. We begin with property 1. For $i \in [n]$ let $X_i = h_s(i)$ be a random variable where $s \leftarrow \{0,1\}^r$. Moreover, let $X = \sum_{i=1}^{n} X_i$. Using this notation we have that $|Z_{h_s}| = X$. Thus, to show property 1, we need to show that $\Pr_{s \leftarrow \{0,1\}^r}[X \le \tilde{O}(n/\Delta)] \ge 2/3$. Let $f_i : \{0,1\}^{n\ell} \to \{0,1\}$ be a function that outputs 1 if the $i$th block is all 1's. That is, $f_i(y) = \bigwedge_{j=(i-1)\ell+1}^{i\ell} y_j$. Since $f_i$ is a read-once DNF formula, we have that
\[ \Big| \mathbb{E}_{y \leftarrow \{0,1\}^{n\ell}}[f_i(y)] - \mathbb{E}_{s \leftarrow \{0,1\}^r}[f_i(PRG(s))] \Big| \le \varepsilon. \]
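The block structure of $h_s$ can be illustrated by the sketch below. It does not implement the PRG of Theorem 13: as a loudly stated assumption, the $n\ell$ bits are drawn from Python's `random` module seeded by an integer, which carries none of the PRG's guarantees and does not use a short seed. Following the proof, which sets $|Z_{h_s}| = \sum_i h_s(i)$, the sketch collects the indices $i$ with $h_s(i) = 1$. The names `block_length`, `h` and `sample_Z`, the base-2 logarithm, and the default constant `c_prime` are hypothetical choices; $n \ge 2$ is assumed.

```python
import math
import random

def block_length(n, delta, c_prime=2.0):
    """l = floor(log(1/p)) for p = c' * log(n) / delta (capped at 1)."""
    p = min(1.0, c_prime * math.log2(n) / delta)
    return max(0, math.floor(math.log2(1.0 / p)))

def h(bits, i, ell):
    """AND of the i-th block (1-indexed) of `ell` consecutive bits."""
    return int(all(bits[(i - 1) * ell : i * ell]))

def sample_Z(n, delta, seed, c_prime=2.0):
    """Indices in [1, n] whose block is all ones.

    Stand-in randomness: truly (pseudo)random bits from random.Random(seed),
    NOT the read-once-DNF PRG used in the paper.
    """
    ell = block_length(n, delta, c_prime)
    rng = random.Random(seed)
    bits = [rng.getrandbits(1) for _ in range(n * ell)]
    return [i for i in range(1, n + 1) if h(bits, i, ell) == 1]
```

Each index is selected with probability $2^{-\ell} \ge p$ under uniform bits, which is what the expectation and Chernoff arguments of the proof rely on.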
Since each $f_i$ is fooled by the PRG up to error $\varepsilon$, linearity of expectation gives
\[ \mathbb{E}[X] = \sum_{i=1}^{n} \mathbb{E}[X_i] = \sum_{i=1}^{n} \mathbb{E}_{s \leftarrow \{0,1\}^r}[f_i(PRG(s))] \le \sum_{i=1}^{n} \Big( \mathbb{E}_{y \leftarrow \{0,1\}^{n\ell}}[f_i(y)] + \varepsilon \Big) = n(2^{-\ell} + \varepsilon) = \tilde{O}(n/\Delta). \]
Then, by Markov's inequality we get that $\Pr_{s \leftarrow \{0,1\}^r}[X > 3\,\mathbb{E}[X]] \le 1/3$ and thus
\[ \Pr_{s \leftarrow \{0,1\}^r}\big[X \le \tilde{O}(n/\Delta)\big] \ge 1 - \Pr_{s \leftarrow \{0,1\}^r}\big[X > 3\,\mathbb{E}[X]\big] \ge 2/3. \]
We turn to show property 2. Let $S$ be any set of size at least $\Delta$ and let $g : \{0,1\}^{n\ell} \to \{0,1\}$ be an indicator function for the event that the set $S$ is covered. That is,
\[ g(y) = \bigvee_{i \in S} \bigwedge_{j=(i-1)\ell+1}^{i\ell} y_j. \]
Since $g$ is a read-once DNF formula, we have that
\[ \Big| \mathbb{E}_{y \leftarrow \{0,1\}^{n\ell}}[g(y)] - \mathbb{E}_{s \leftarrow \{0,1\}^r}[g(PRG(s))] \Big| \le \varepsilon. \]
Let $Y_i = \bigwedge_{j=(i-1)\ell+1}^{i\ell} y_j$, and let $Y = \sum_{i \in S} Y_i$. Then $\mathbb{E}[Y] = \sum_{i \in S} \mathbb{E}[Y_i] \ge \Delta 2^{-\ell} \ge \Delta p = c' \log n$. Thus, by a Chernoff bound we have that $\Pr[Y = 0] \le \Pr[\mathbb{E}[Y] - Y \ge c' \log n] \le 1/n^{2c}$, for a large enough constant $c'$ (that depends on $c$). Together, we get that
\[ \Pr_s[S \cap Z_{h_s} \ne \emptyset] = \mathbb{E}_{s \leftarrow \{0,1\}^r}[g(PRG(s))] \ge \mathbb{E}_{y \leftarrow \{0,1\}^{n\ell}}[g(y)] - \varepsilon = \Pr_{y \leftarrow \{0,1\}^{n\ell}}[Y \ge 1] - \varepsilon \ge 1 - 1/n^c. \]
∎

We turn to show the second construction of hitting sets with a short seed. In this construction the seed length is shorter, but the set is larger. By a direct application of Lemma 2.2 in [6], we get the following lemma, which becomes useful for showing Theorem 3.

Lemma 15. Let $S$ be a subset of $[n]$ where $|S| \ge \Delta$ for some parameter $\Delta \le n$, and let $c$ be any constant. Then, there exists a family of hash functions $H = \{h : [n] \to \{0,1\}\}$ such that choosing a random function from $H$ takes $r = O(\log n)$ random bits and for $Z_h = \{u \in [n] : h(u) = 0\}$ it holds that: (1) $\Pr_h[|Z_h| \le O(n^{17/16}/\sqrt{\Delta})] \ge 2/3$, and (2) $\Pr_h[S \cap Z_h \ne \emptyset] \ge 1 - 1/n^c$.

5.2 Deterministic Hitting Sets in the Congested Clique

We next present a deterministic construction of hitting sets by means of derandomization. The round complexity of the algorithm depends on the number of random bits used by the randomized algorithms.

Theorem 16. Let $G = (V, E)$ be an $n$-vertex graph, let $V' \subseteq V$, let $\mathcal{S} = \{S_u \subseteq V : u \in V'\}$ be a set of subsets such that each node $u \in V'$ knows the set $S_u$ and $|S_u| \ge \Delta$, and let $c$ be a constant. Let $H = \{h : [n] \to \{0,1\}\}$ be a family of hash functions such that choosing a random function from $H$ takes $g_A(n, \Delta)$ random bits and for $Z_h = \{u \in [n] : h(u) = 0\}$ it holds that: (1) $\Pr[|Z_h| \le f_A(n, \Delta)] \ge 2/3$ and (2) for any $u \in V'$: $\Pr[S_u \cap Z_h \ne \emptyset] \ge 1 - 1/n^c$. Then, there exists a deterministic algorithm $A_{det}$ that constructs a hitting set of size $O(f_A(n, \Delta))$ in $O(g_A(n, \Delta)/\log n)$ rounds.

Proof. Our goal is to completely derandomize the process of finding $Z_h$ by using the method of conditional expectation. We follow the scheme of [7] to achieve this, and define two bad events that can occur when using a random seed of size $g = g_A(n, \Delta)$. Let $A$ be the event where the hitting set $Z_h$ consists of more than $f_A(n, \Delta)$ vertices. Let $B$ be the event that there exists a $u \in V'$ such that $S_u \cap Z_h = \emptyset$. Let $X_A, X_B$ be the corresponding indicator random variables for the events, and let $X = X_A + X_B$. Since a random seed of $g$ bits gives $\Pr[X_A = 1] \le 1/3$ and, by a union bound over at most $n$ sets, $\Pr[X_B = 1] \le n^{1-c}$, we have that $\mathbb{E}[X] < 1$ (taking $c \ge 2$), where the expectation is taken over a seed of length $g$ bits. Thus, we can use the method of conditional expectations in order to get an assignment to our random coins such that no bad event occurs, i.e., $X = 0$. In each step of the method, we run a distributed protocol to compute the conditional expectation. Actually, we will compute a pessimistic estimator for the conditional expectation.

Letting $X_u$ be the indicator random variable for the event that $S_u$ is not hit by $Z_h$, we can write our expectation as follows:
\[ \mathbb{E}[X] = \mathbb{E}[X_A] + \mathbb{E}[X_B] = \Pr[X_A = 1] + \Pr[X_B = 1] = \Pr[X_A = 1] + \Pr[\vee_u X_u = 1]. \]
Suppose we have a partial assignment to the seed, denoted by $Y$. Our goal is to compute the conditional expectation $\mathbb{E}[X \mid Y]$, which translates to computing $\Pr[X_A = 1 \mid Y]$ and $\Pr[\vee_u X_u = 1 \mid Y]$. Notice that computing $\Pr[X_A = 1 \mid Y]$ is simple since it depends only on $Y$ (and not on the graph or the subsets $\mathcal{S}$). The difficult part is computing $\Pr[\vee_u X_u = 1 \mid Y]$.
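The method of conditional expectations can be illustrated on a toy instance with the centralized sketch below. It tracks the sum $X_A + \sum_u X_u$, which upper-bounds $X$ by a union bound, and fixes the seed chunk by chunk, always keeping the chunk value with the smallest conditional expectation. Several things are assumptions made only for illustration: the conditional expectation is computed by brute-force enumeration of all seed completions (feasible only for tiny seeds) rather than by a distributed protocol, the toy family `bits_to_set` spends one seed bit per element and is therefore not a short-seed family, and all function names are hypothetical.

```python
from itertools import product

def bits_to_set(seed):
    """Toy family: element u is selected iff seed bit u is 1."""
    return {u for u, bit in enumerate(seed) if bit}

def estimator(prefix, g, build_Z, sets, max_size):
    """Exact E[X_A + sum_u X_u | prefix], averaging over all completions of the seed."""
    rest = g - len(prefix)
    total = 0
    for tail in product((0, 1), repeat=rest):
        Z = build_Z(prefix + tail)
        too_big = int(len(Z) > max_size)            # X_A
        missed = sum(1 for S in sets if not (S & Z))  # sum of the X_u
        total += too_big + missed
    return total / 2 ** rest

def derandomize(g, chunk, build_Z, sets, max_size):
    """Fix the g seed bits in chunks, never increasing the estimator."""
    prefix = ()
    while len(prefix) < g:
        width = min(chunk, g - len(prefix))
        candidates = (prefix + cand for cand in product((0, 1), repeat=width))
        prefix = min(candidates,
                     key=lambda y: estimator(y, g, build_Z, sets, max_size))
    return prefix
```

Because the best chunk value is at most the average over all chunk values, the estimator never increases; if it starts below 1, the fully fixed seed has integer value 0, so every set is hit and the size bound holds.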
Rather than computing $\Pr[\vee_u X_u = 1 \mid Y]$ exactly, we use a pessimistic estimator of $\mathbb{E}[X]$ that avoids this difficult computation. Specifically, we define the estimator $\Psi = X_A + \sum_{u \in V'} X_u$. Recall that for any $u \in V'$, for a random $g$-bit seed, it holds that $\Pr[X_u = 1] \le 1/n^c$, and thus by applying a union bound over all $n$ sets, it also holds that $\mathbb{E}[\Psi] = \Pr[X_A = 1] + \sum_u \Pr[X_u = 1] < 1$.

We describe how to compute the desired seed using the method of conditional expectation. We reveal the assignment of the seed in chunks of $\ell = \lfloor \log n \rfloor$ bits. In particular, we show how to compute the assignment of $\ell$ bits in the seed in $O(1)$ rounds. Since the seed has $g$ bits, this yields an $O(g/\log n)$-round algorithm.

Consider the $i$th chunk of the seed $Y_i = (y_1, \ldots, y_\ell)$ and assume that the assignments for the first $i-1$ chunks $Y_1, \ldots, Y_{i-1}$ have been computed. For each of the at most $n$ possible assignments to $Y_i$, we assign a node $v$ that receives the conditional probability values $\Pr[X_u = 1 \mid Y_1, \ldots, Y_i]$ from all nodes $u \in V'$. Notice that a node $u$ can compute the conditional probability values $\Pr[X_u = 1 \mid Y_1, \ldots, Y_i]$, since $u$ knows the IDs of the vertices in $S_u$ and thus has all the information for this computation. The node $v$ then sums up all these values and sends the sum to a global leader $w$. The leader $w$ can easily compute the conditional probability $\Pr[X_A = 1 \mid Y]$, and thus, using the values it received from all the nodes, it can compute $\mathbb{E}[\Psi \mid Y]$ for each of the possible assignments to $Y_i$. Finally, $w$ selects the assignment $(y_1^*, \ldots, y_\ell^*)$ that minimizes the pessimistic estimator $\Psi$ and broadcasts it to all nodes in the graph. After $O(g/\log n)$ rounds, $Y$ has been completely fixed such that $X \le \Psi < 1$. Since $X_A$ and $X_B$ get binary values, it must be the case that $X_A = X_B = 0$, and a hitting set has been found. ∎

Combining Lemma 14 and Lemma 15 with Theorem 16 yields:

Corollary 17. Let $G = (V, E)$ be an $n$-vertex graph, let $V', V'' \subseteq V$, and let $\mathcal{S} = \{S_u \subseteq V : u \in V'\}$ be a set of subsets such that each node $u \in V'$ knows the set $S_u$, $|S_u| \ge \Delta$, and $\bigcup S_u \subseteq V''$. Then, there exist deterministic algorithms $A_{det}$, $A'_{det}$ in the congested clique model that construct a hitting set $Z$ for $\mathcal{S}$ such that: (1) $|Z| = \tilde{O}(|V''|/\Delta)$ and $A_{det}$ runs in $O((\log\log n)^3)$ rounds. (2) $|Z| = O(|V''|^{17/16}/\sqrt{\Delta})$ and $A'_{det}$ runs in $O(1)$ rounds.

Deterministic construction in $O(\log k + (\log\log n)^3)$ rounds. Theorem 2 follows by plugging Corollary 17(1) into Lemma 11.

Deterministic $O(k)$-Spanners in $O(\log k)$ Rounds

In this subsection, we provide a proof sketch of Theorem 3. The complete proof appears in the full version. Let $k \ge 10$. According to Section 3, it remains to consider the construction of $H_{dense}$ for the dense edge set $E_{dense}$. Recall that for every dense vertex $v$, it holds that $|\Gamma_{k/2}(v, G)| \ge n^{1/2-1/k}$. Similarly to the proof of Lemma 11, we construct a $(k/2-1)$ dominating set $Z$ for the dense vertices. However, to achieve the desired round complexity, we use the $O(1)$-round hitting-set construction of Corollary 17(2) with parameters $\Delta = n^{1/2-1/k}$ and $V' = V$. The output is then a hitting set $Z$ of cardinality $O(n^{17/16}/\sqrt{\Delta}) = O(n^{13/16+1/(2k)})$ that hits all the $(k/2-1)$ neighborhoods of the dense vertices. Then, as in Alg. SpannerDenseRegion, we compute a $(k/2-1)$-depth clustering $\mathcal{C}_1$ centered at $Z$. The key difference from Alg. SpannerDenseRegion is that $|Z|$ is too large to allow us to add an edge between each pair of adjacent clusters, as this would result in a spanner of size $O(|Z|^2)$. Instead, we essentially contract the clusters of $\mathcal{C}_1$ (i.e., contract the intra-cluster edges) and construct the spanner recursively in the resulting contracted graph $G''$. Every contracted node in $G''$ corresponds to a cluster with a small strong diameter in the spanner. Specifically, $G''$ is decomposed into sparse and dense regions. Handling the sparse part is done deterministically by applying Alg. SpannerSparseRegion.
To handle the dense case, we apply the hitting-set algorithm of Corollary 17(2) to cluster the dense nodes (which are, in fact, contracted nodes) into $|V(G'')|/\sqrt{\Delta}$ clusters for $\Delta = n^{1/2-1/k}$. After $O(1)$ repetitions of the above, we are left with a contracted graph with $o(\sqrt{n})$ vertices. At this point, we connect each pair of clusters (corresponding to these contracted nodes) in the spanner. A naïve implementation of such an approach would yield a spanner with stretch $k^{O(1)}$, as the diameter of the clusters induced by the contracted nodes is increased by a $k$-factor in each of the phases. To avoid this blow-up in the stretch, we use the fact that already after the first phase, the contracted graph $G'$ has $O(n^{13/16+o(1)})$ nodes, and hence we can afford to compute a $(2k'-1)$-spanner for $G'$ with $k' = 8$, as this would add $O(n)$ edges to the final spanner. Since in each of the phases (except for the first one) the stretch parameter is constant, the stretch is bounded by $O(k)$, and the number of edges by $O(k \cdot n^{1+1/k})$.

References

[1] Leonid Barenboim and Victor Khazanov. Distributed symmetry-breaking algorithms for congested cliques. arXiv preprint arXiv:1802.07209, 2018.
[2] Surender Baswana and Sandeep Sen. A simple and linear time randomized algorithm for computing sparse spanners in weighted graphs. Random Structures and Algorithms, 30(4):532-563, 2007.
[3] [...] In DISC, 2017.
[4] Soheil Behnezhad, Mahsa Derakhshan, and MohammadTaghi Hajiaghayi. Brief announcement: Semi-mapreduce meets congested clique. arXiv preprint arXiv:1802.10297, 2018.
[5] Arnab Bhattacharyya, Elena Grigorescu, Kyomin Jung, Sofya Raskhodnikova, and David P. Woodruff. Transitive-closure spanners. SIAM Journal on Computing, 41(6):1380-1425, 2012.
[6] L. Elisa Celis, Omer Reingold, Gil Segev, and Udi Wieder. Balls and bins: Smaller hash families and faster evaluation. SIAM Journal on Computing, 42(3):1030-1050, 2013.
[7] Keren Censor-Hillel, Merav Parter, and Gregory Schwartzman. Derandomizing local distributed algorithms under bandwidth restrictions. In 31st International Symposium on Distributed Computing, 2017.
[8] Bilel Derbel and Cyril Gavoille. Fast deterministic distributed algorithms for sparse spanners. Theoretical Computer Science, 2008.
[9] Bilel Derbel, Cyril Gavoille, and David Peleg. Deterministic distributed construction of linear stretch spanners in polylogarithmic time. In DISC, pages 179-192. Springer, 2007.
[10] Bilel Derbel, Cyril Gavoille, David Peleg, and Laurent Viennot. On the locality of distributed sparse spanner construction. In PODC, pages 273-282, 2008.
[11] Bilel Derbel, Cyril Gavoille, David Peleg, and Laurent Viennot. Local computation of nearly additive spanners. In DISC, 2009.
[12] Bilel Derbel, Mohamed Mosbah, and Akka Zemmari. Sublinear fully distributed partition with applications. Theory of Computing Systems, 47(2):368-404, 2010.
[13] [...] Manuscript, 2018.
[14] Mohsen Ghaffari, Themis Gouleakis, Slobodan Mitrović, and Ronitt Rubinfeld. Improved massively parallel computation algorithms for MIS, matching, and vertex cover. In PODC, 2018.
[15] [...] Better pseudorandom generators from milder pseudorandom restrictions. In 53rd Annual IEEE Symposium on Foundations of Computer Science, FOCS 2012, New Brunswick, NJ, USA, October 20-23, 2012, pages 120-129, 2012.
[16] Ofer Grossman and Merav Parter. Improved deterministic distributed construction of spanners. In DISC, 2017.
[17] James W. Hegeman and Sriram V. Pemmaraju. Lessons from the congested clique applied to MapReduce. Theoretical Computer Science, 608:268-281, 2015.
[18] Christoph Lenzen. Optimal deterministic routing and sorting on the congested clique. In Proc. of the Int'l Symp. on Princ. of Dist. Comp. (PODC), pages 42-50, 2013.
[19] Christoph Lenzen and Roger Wattenhofer. Brief announcement: exponential speed-up of local algorithms using non-local communication. In Proceedings of the 29th Annual ACM Symposium on Principles of Distributed Computing, PODC 2010, Zurich, Switzerland, July 25-28, 2010, pages 295-296, 2010.
[20] Zvi Lotker, Elan Pavlov, Boaz Patt-Shamir, and David Peleg. MST construction in O(log log n) communication rounds. In Proceedings of the Symposium on Parallel Algorithms and Architectures, pages 94-100. ACM, 2003.
[21] Merav Parter and Eylon Yogev. Congested clique algorithms for graph spanners. arXiv preprint arXiv:1805.05404, 2018.
[22] David Peleg. Distributed Computing: A Locality-sensitive Approach. SIAM, 2000.
[23] David Peleg and Alejandro A. Schäffer. Graph spanners. Journal of Graph Theory, 13(1):99-116, 1989.
[24] David Peleg and Jeffrey D. Ullman. An optimal synchronizer for the hypercube. SIAM Journal on Computing, 18(4):740-747, 1989.
[25] Seth Pettie. Distributed algorithms for ultrasparse spanners and linear size skeletons. Distributed Computing, 22(3):147-166, 2010.
[26] Liam Roditty, Mikkel Thorup, and Uri Zwick. Deterministic constructions of approximate distance oracles and spanners. In International Colloquium on Automata, Languages, and Programming, pages 261-272. Springer, 2005.
[27] Mikkel Thorup and Uri Zwick. Compact routing schemes. In Proceedings of the Thirteenth Annual ACM Symposium on Parallel Algorithms and Architectures, pages 1-10. ACM, 2001.
[28] Virginia Vassilevska Williams. Graph algorithms - Fall 2016, MIT, lecture notes 5, 2016. URL:

Merav Parter, Eylon Yogev. Congested Clique Algorithms for Graph Spanners, LIPICS - Leibniz International Proceedings in Informatics, 2018, 40:1-40:18, DOI: 10.4230/LIPIcs.DISC.2018.40