Preprocessing Imprecise Points for Delaunay Triangulation: Simplified and Extended

Algorithmica, Nov 2011

Suppose we want to compute the Delaunay triangulation of a set P whose points are restricted to a collection ℛ of input regions known in advance. Building on recent work by Löffler and Snoeyink, we show how to leverage our knowledge of ℛ for faster Delaunay computation. Our approach needs no fancy machinery and optimally handles a wide variety of inputs, e.g., overlapping disks of different sizes and fat regions.


Kevin Buchin · Maarten Löffler · Pat Morin · Wolfgang Mulzer

K. Buchin, Dept. of Mathematics and Computer Science, TU Eindhoven, Eindhoven, The Netherlands
M. Löffler, Computer Science Department, University of California, Irvine, CA 92697, USA
P. Morin, School of Computer Science, Carleton University, Ottawa, Canada
W. Mulzer (corresponding author), Department of Computer Science, Princeton University, Princeton, NJ 08540, USA

1 Introduction

Data imprecision is a fact of life that is often ignored in the design of geometric algorithms. The input for a typical computational geometry problem is a finite point set P in R², or more generally R^d. Traditionally, one assumes that P is known exactly, and indeed, in the 1980s and 1990s this was often justified, as much of the input data was hand-constructed for computer graphics or simulations. Nowadays, however, the input is often sensed from the real world, and thus inherently imprecise. This leads to a growing need to deal with imprecision.

An early model for imprecise geometric data, motivated by finite precision of coordinates, is called ε-geometry [22]. Here, the input is a traditional point set P and a parameter ε. The true point set is unknown, but each point is guaranteed to lie in a disk of radius ε. Even though this model has proven fruitful and remains popular due to its simplicity [2, 23], it may often be too restrictive: imprecision regions could be more complicated than disks, and their shapes and sizes may even differ from point to point, e.g., to model imprecision from different sources, independent imprecision in different input dimensions, etc. In these settings, the extra freedom in modeling leads to more involved algorithms, but still many results are available [26, 28, 30, 31].

1.1 Preprocessing

The above results assume that the imprecise input is given once and simply has to be dealt with. While this holds in many applications, it is also often possible to get (more) precise estimates of the points, but they only become available later, when there is less time, or they come at a higher cost. For example, in the update complexity model [10, 21], each data point is given imprecisely at the beginning but can always be found precisely at a certain price.

One model that has received attention lately is that of preprocessing an imprecise point set so that some structure can be computed faster when the exact points become available later. Here, we consider triangulations: let R be a collection of n planar regions, and suppose we know that the input has exactly one point from each region. The question is whether we can exploit our knowledge of R to quickly triangulate the exact input, once it is known. More precisely, we want to preprocess R into a data structure for the following problem: given a point pi from each region Ri ∈ R, compute a triangulation of P = {p1, ..., pn}.¹ For reasons that will become clear below, we call this data structure a scaffolding, and we will refer to the process of finding a triangulation for P as collapsing the scaffolding for a concrete point set P.

¹ We will assume that all inputs are valid, i.e., that pi ∈ Ri for all i. The complexity of checking this depends on the complexity and the representation of the regions R1, ..., Rn.
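To make the two-phase setting concrete, the following minimal Python sketch shows only the interface such a data structure would expose. The class and method names are hypothetical and merely illustrate the usage pattern (preprocess the regions once, then collapse for each concrete input); they do not stand for any particular construction from this paper.

    # Hypothetical two-phase interface for the scaffolding idea (illustration only).
    class Scaffolding:
        def __init__(self, regions):
            # Preprocessing phase: regions is the collection R, known in advance.
            # A concrete implementation would build DT(Q) or a quadtree here (Sects. 2-4).
            self.regions = regions

        def collapse(self, points):
            # Collapsing phase: points[i] is the precise point from regions[i].
            # A concrete implementation would return the Delaunay triangulation of points.
            raise NotImplementedError("see Sections 2-4 for concrete constructions")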
There are many parameters to consider; not only do we want the resources required to build, store, and collapse the scaffolding to be small, but we would also like to support general classes of input regions and obtain nice (i.e., Delaunay) triangulations. In the latter case, we say that we collapse the scaffolding into a Delaunay triangulation for a concrete point set P.

Held and Mitchell [24] show that if R consists of n disjoint unit disks, it can be preprocessed in O(n log n) time into a linear-space data structure such that a triangulation for a point set with exactly one point from every disk can be found in linear time. This is improved by Löffler and Snoeyink [29], who show that it is possible to construct a linear-size data structure such that it takes linear time to find the Delaunay triangulation for a set with one point from every disk. Both results generalize to regions with limited overlap and limited difference in shape and size: as long as these parameters are bounded by a constant, the same results hold. However, no attempt is made to optimize the dependence on the parameters. In contrast, van Kreveld, Löffler, and Mitchell [27] study the case where R consists of n disjoint polygons with a total of m vertices, and they obtain an O(m)-space data structure with O(m log m) preprocessing time so that a triangulation for a point set with one point in each polygon can be found in linear time. There is no restriction on the shapes and sizes of the individual regions (they do not even strictly have to be polygonal), only on the overlap. As these works already mention, a similar result for Delaunay triangulations is impossible. Djidjev and Lingas [18] show that if the points are sorted in any one direction, it still takes Ω(n log n) time to compute their Delaunay triangulation. If R consists of vertical lines, the only information we could precompute is exactly this order (and the distances, but they can be found from the order in linear time anyway). All the algorithms above are deterministic.

1.2 Contribution

Our main concern will be Delaunay triangulations. First, we show that the algorithm by Löffler and Snoeyink [29] can be simplified considerably if we are happy with randomization and expected running time guarantees. In particular, we avoid the need for linear-time polygon triangulation [14], which was the main tool in the previous algorithm. Second, although there can be no data structure for finding a Delaunay triangulation for points from arbitrary regions in o(n log n) time, we show that for realistic input we can get a better dependence on the realism parameters than in [29]. In particular, we consider k, the largest depth in the arrangement of R, and f, the smallest fatness of any region in R (the fatness of a region R is the largest β such that for any disk D with center in R that does not completely contain R, we have area(R ∩ D) ≥ β · area(D)). We can build in O(n log n) time a scaffolding for R of linear size that can be collapsed to a Delaunay triangulation in time O(n log(k/f)). We also consider t, the smallest thickness (defined as the fraction of the smallest enclosing circle of a region occupied by it) of any region in R, and r, the ratio between the diameters of the largest and the smallest region in R.
With the same preprocessing time and space, we can obtain a scaffolding that can be collapsed to a Delaunay triangulation in O(n(log(k/t) + log log r)) time. For comparison, the previous bound is O(nkr²/t²) [29]. Finally, we achieve similar results in various other realistic input models.

We describe two different approaches. The first, which gives the same result as [29], is extremely simple and illustrates the general idea. The second approach relies on quadtrees [5, Chap. 14] and is a bit more complicated, but generalizes easily. As hinted above, we use a technique that has emerged just recently in the literature [16, 17, 27] and which we call scaffolding: in order to compute many related structures quickly, we first compute a typical structure, the scaffolding Q, in a preprocessing phase. To process a given point set, we insert the points into Q and then remove Q. In the first approach, the scaffolding is a Delaunay triangulation for a suitable point set, and we will use an algorithm for hereditary Delaunay triangulations [16] to collapse it efficiently. In particular, we need the following result:

Theorem 1.1 (Chazelle et al. [15], see also [16]) Let P, Q ⊆ R² be two planar point sets with |P ∪ Q| = m, and suppose that DT(P ∪ Q) is available. Then DT(P) can be computed in expected time O(m).

In the second approach, the scaffolding is an appropriate quadtree. To collapse it, we employ a recent result that connects quadtrees with Delaunay triangulations.

Theorem 1.2 (Buchin and Mulzer [11]) Let P be a planar n-point set and suppose that a quadtree T for P is available.² Then DT(P) can be computed in expected time O(n).

² In Sect. 3, we will define precisely what it means that T is a quadtree for P.

We will say more about this theorem in Appendix B, where we also prove a slight generalization that is needed by our algorithm.

2 Unit Disks: Simplified Algorithm

We begin with a very simple randomized algorithm for the original setting: given a sequence of n disjoint unit disks R = R1, ..., Rn, we show how to preprocess R in O(n log n) time into a linear-space scaffolding that can be collapsed to a Delaunay triangulation in O(n) expected time.

Let ci denote the center of Ri, and for r > 0 let Ri^r be the disk centered at ci with radius r. The preprocessing algorithm creates a point set Q that for each Ri contains ci and 7 points equally spaced on ∂Ri², the boundary of Ri². Then it computes DT(Q), the Delaunay triangulation of Q, and stores it. Since Q has 8n points, this takes O(n log n) time (e.g., [5, Sect. 9]). We will need the following useful lemma about Q.

Lemma 2.1 Let X be a point set with at most one point from each Ri. Any disk D of radius r contains at most 9(r + 3)² points of Q ∪ X.

Proof Let c be the center of D. Any point of Q ∪ X in D comes from some Ri with ‖c − ci‖ ≤ r + 2. The number of such Ri is at most the number of disjoint unit disks that fit into a disk of radius r + 3. A simple volume argument bounds this by (r + 3)². As each Ri contributes up to 9 points, the claim follows.

Given the sequence P = p1, ..., pn of precise points, we construct DT(P) by first inserting P into DT(Q) to obtain DT(Q ∪ P) and then applying Theorem 1.1 to remove Q. To compute DT(Q ∪ P), we proceed as follows: for each point pi we perform a (breadth-first or depth-first) search among the triangles of DT(Q ∪ {p1, ..., pi−1}) that starts at some triangle incident to ci and never leaves Ri, until we find the triangle ti that contains pi. Then we insert pi into DT(Q ∪ {p1, ..., pi−1}) by making it adjacent to the three vertices of ti and performing Delaunay flipping [5, Sect. 9.3]. This takes time proportional to the number of triangles visited plus the degree of pi in DT(Q ∪ {p1, ..., pi}).
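The following Python sketch illustrates the preprocessing step for unit disks, i.e., the construction of the 8n-point scaffolding set Q described above. The function name is ours, and scipy's Delaunay routine merely stands in for any O(n log n) Delaunay construction; the search-and-flip insertion of the precise points and the final removal of Q via Theorem 1.1 are not reproduced here.

    import math
    import numpy as np
    from scipy.spatial import Delaunay  # stand-in for any O(n log n) Delaunay algorithm

    def build_scaffolding(centers):
        """Build Q: for each unit disk R_i with center c_i, take c_i plus
        7 points equally spaced on the circle of radius 2 around c_i."""
        Q = []
        for (x, y) in centers:
            Q.append((x, y))
            for j in range(7):
                a = 2.0 * math.pi * j / 7.0
                Q.append((x + 2.0 * math.cos(a), y + 2.0 * math.sin(a)))
        return np.array(Q)

    # Usage sketch: centers of disjoint unit disks, known in advance.
    centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
    DT_Q = Delaunay(build_scaffolding(centers))   # the scaffolding DT(Q), stored for later

When the precise points arrive, each pi would be located by walking from a triangle incident to ci (Lemmas 2.5 and 2.6 below bound the work per point by a constant), inserted by flipping, and finally Q is removed with the hereditary Delaunay algorithm of Theorem 1.1.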
The next lemma allows us to bound these quantities.

Lemma 2.2 Let Y be a point set with Q ⊆ Y, and let C be a disk whose interior contains no point of Y. If C intersects Ri, then C ⊆ Ri³.

Proof We prove the contrapositive. Suppose C is a disk that intersects both Ri and R² \ Ri³. Then C contains a (generally smaller) disk C′ tangent to ∂Ri¹ and to ∂Ri³; see Fig. 1. The intersection of C′ with ∂Ri² is a circular arc of length 8 arcsin(1/4) > 4π/7. But this means that one of the 7 points on ∂Ri² must lie inside C′ ⊆ C, so the interior of C contains a point of Y.

Fig. 1 C covers a constant fraction of the boundary of Ri² and hence meets Q

Lemma 2.2 immediately implies the following:

Lemma 2.3 Let Y be a point set with Q ⊆ Y, and let p ∈ Y ∩ Ri. Then every neighbor q of p in DT(Y) lies in Ri³.

Proof If pq is an edge of DT(Y), there exists a circle C that has p and q on its boundary and whose interior does not meet Y. Since p ∈ Ri, Lemma 2.2 implies that C is contained in Ri³, so q ∈ Ri³, as claimed.

The next two lemmas bound the number of triangles visited while inserting pi.

Lemma 2.4 Let Y be a point set with Q ⊆ Y. Every triangle of DT(Y) that intersects Ri is contained in Ri³.

Proof Let t be a triangle that intersects Ri, and let C be its circumcircle. Since C intersects Ri and its interior contains no point of Y, Lemma 2.2 implies t ⊆ C ⊆ Ri³, as claimed.

Lemma 2.5 At most 644 triangles of DT(Q ∪ {p1, ..., pi}) intersect Ri.

Proof The triangles that intersect Ri form the faces of a planar graph G. By Lemma 2.4, every vertex of G lies inside Ri³, so by Lemma 2.1 there are at most v = 9(3 + 3)² = 324 vertices, and thus at most 2v − 4 = 644 faces.

The final lemma bounds the degree of pi at the time it is inserted.

Lemma 2.6 The degree of pi in DT(Q ∪ {p1, ..., pi}) is at most 324.

Proof By Lemma 2.3, all the neighbors of pi are inside Ri³, and by Lemma 2.1 there are at most 9(3 + 3)² = 324 points of Q ∪ {p1, ..., pi} in Ri³.

Thus, by Lemmas 2.5 and 2.6, each point of P can be inserted in constant time, so we require O(n) time to construct DT(Q ∪ P). A further O(n) expected time is then needed to obtain DT(P) using Theorem 1.1. This yields the desired result.

Theorem 2.7 Let R = R1, ..., Rn be a sequence of disjoint planar unit disks. In O(n log n) time and using O(n) space we can build a scaffolding for R that can be collapsed to a Delaunay triangulation for a precise input in O(n) expected time.

3 Disks of Different Sizes: Quadtree Approach

We now extend Theorem 2.7 to differently-sized disks using a somewhat more involved approach. The main idea is to find a quadtree T such that each cell of T meets a bounded number of regions of R. To process an input P we locate the points of P in T, compute a quadtree for P, and then apply Theorem 1.2. With the additional structure of the quadtree we can handle disks of varying sizes.

We define a free quadtree T to be an ordered rooted tree that corresponds to a hierarchical decomposition of the plane into closed axis-aligned square boxes. Each node v of T has a square box Bv associated to it, according to the following rules:

1. If w is a descendant of v in T, then Bw is contained in Bv.
2. If v and w are not related, then the interiors of Bv and Bw are disjoint.

In other words, the boxes Bv constitute a laminar family of squares in R². We say the size of a node is the side length of its box.
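As a concrete illustration of the box hierarchy just defined, here is a minimal Python representation of free-quadtree nodes; the class is ours and only encodes the two laminar-family rules, not any construction algorithm.

    class QuadtreeNode:
        """A node v with its closed axis-aligned square box B_v = [x, x+s] x [y, y+s]."""
        def __init__(self, x, y, size):
            self.x, self.y, self.size = x, y, size   # lower-left corner and side length
            self.children = []                       # descendants' boxes must nest inside B_v

        def contains_box(self, other):
            # Rule 1: a descendant's box lies inside its ancestor's box.
            return (self.x <= other.x and self.y <= other.y
                    and other.x + other.size <= self.x + self.size
                    and other.y + other.size <= self.y + self.size)

        def interior_disjoint(self, other):
            # Rule 2: boxes of unrelated nodes have disjoint interiors.
            return (self.x + self.size <= other.x or other.x + other.size <= self.x
                    or self.y + self.size <= other.y or other.y + other.size <= self.y)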
With each node v, we associate the cell Cv that consists of the part of Bv that is not covered by the children of v.³ Any two distinct cells Cv and Cw have disjoint interiors, and the union of the cells of all nodes of T covers the root square.

³ Note that Cv could be disconnected or empty.

A standard quadtree [20] is a special case of a free quadtree, and in particular has only two types of nodes: internal nodes v with exactly four children of half the size of v, and leaf nodes without any children. In this section we define a (compressed) quadtree as a standard quadtree, but with the addition of a third type of node, which we call a cluster node: a node v with just one child w, whose size is smaller than that of its parent by at least a large constant factor 2^c. We require that the horizontal and vertical distances between the boundary of Bw and the boundary of Bv are either zero or at least the size of Bw. Figure 2(a) shows an example of a quadtree of this type. Cluster nodes ensure that the complexity of T can be kept linear [8, Sect. 3.2].

Given a planar point set P, we say that T is a quadtree for P if the following properties hold:

1. Each leaf v of T contains at most one point of P inside Cv.
2. Other nodes v of T contain no points of P inside Cv.
3. The root box of T contains all points of P.
4. The complexity of T is O(|P|).

Figure 2(b) shows a set of points for which the quadtree is valid.

To apply Theorem 1.2 we need to preprocess R into a quadtree T such that any cell Cv in T intersects only constantly many disks. While we could consider the disks directly, we will instead use a quadtree T for a point set Q representing the disks. For each disk we include in Q its center and its top-, bottom-, left- and rightmost point. Then, T can be constructed in O(n log n) time (see Appendix A).

Lemma 3.1 Every cell Cv of T is intersected by O(1) disks in R.

Proof There are three types of nodes to consider. First, if v is an internal node with four children, then Cv is empty and the condition holds trivially.

Next, suppose that v is a leaf node, so Cv = Bv. If a disk D intersects Bv and does not contain a corner of Bv, then Bv must contain either D's center or one of its four extreme points [6]. Thus, Bv intersects at most 5 disks: one for each corner and one for the point of Q it contains. See Fig. 3(a) for an example.

Fig. 3 (a) At most 4 disjoint disks can intersect any given box B belonging to a leaf of the quadtree without one of their points being inside B. (b) For a box B belonging to a parent of a cluster node with box C, slightly more disks can intersect the interior of B \ C, but not more than 4 can cover any of the four neighboring boxes, so a crude upper bound is 20

Now suppose v is a cluster node with child w. Then Cv = Bv \ Bw. We know there are no points of Q in Cv, and there are at most four disks that have all their representative points outside Bv. So it remains to count the disks that intersect Cv, do not cover a corner of Bv, and have an extreme point or their center in Bw. For this, consider the at most four orthogonal neighbors of Bw in Bv (i.e., copies of Bw directly to the left, to the right, above, and below Bw) that lie inside Bv.
Using the same argument as above, each of these neighbors meets at most four disks, and every disk D with an extreme point or center in Bw that intersects Cv also meets one of the orthogonal neighbors (if D has no extreme point or center in an orthogonal neighbor and does not cover any of its corners, it has to cover its center). This implies the claim, because by our assumption about cluster nodes all the orthogonal neighbors are completely contained in Bv. Figure 3(b) shows an example involving a cluster node.

Theorem 3.2 Let R = R1, ..., Rn be a sequence of disjoint planar disks (not necessarily all of the same size). In O(n log n) time and using O(n) space we can build a scaffolding for R that can be collapsed into a Delaunay triangulation for a concrete point set in O(n) expected time.

Proof We construct Q and the quadtree T for Q as described above. For each Ri we store a list with the leaves in T that intersect it. By Lemma 3.1, the total size of these lists, and hence the complexity of the data structure, is linear.

Now we describe how to process an input: let P = p1, ..., pn be the input sequence. For each pi we find the node v of T such that pi ∈ Cv by traversing the list for Ri. This takes linear time. Since each leaf of T contains constantly many input points, we can turn T into a quadtree for P in linear time. We now compute DT(P) via Theorem 1.2.⁴

⁴ We are slightly cheating here, because when turning T into a quadtree for P we need to be careful about handling cluster nodes that contain an input point. We will explain this in the next section, where we consider a more general setting.

4 Overlapping Disks: Deflated Quadtrees

We extend the approach to disks with limited overlap. Now R contains n planar disks such that no point is covered by more than k disks. The parameter k is called the depth of R, and Aronov and Har-Peled [1] showed that k can be approximated up to a constant factor in O(n log n) time. It is easily seen that finding a Delaunay triangulation for points from these regions takes Ω(n log k) time in the worst case in the algebraic decision tree model, and we show that this bound can be achieved.

Fig. 4 (a) A set of points and a quadtree for it. (b) A 3-deflated version of the quadtree

The general strategy is the same as in Sect. 3. Let Q be the 5n representative points for R, and let T be a quadtree for Q. As before, T can be found in time O(n log n) and has complexity O(n). However, the cells of T can now be intersected by O(k) regions, rather than O(1). Since our data structure stores all the cell-disk incidences, this means that the space requirement would become O(nk), which is too large. However, we can avoid this by reducing the complexity of T until it only has O(n/k) cells, while maintaining the intersection property. Then, we share the more detailed structures between the regions in R, thus saving space.

For this we introduce the notion of λ-deflated quadtrees. For a positive integer λ, a λ-deflated quadtree T′ for a point set Q has the same general structure as the quadtrees from the previous section, but it has lower complexity: each node of T′ can contain up to λ points of Q in its cell, and there are only O(n/λ) nodes.
We distinguish four different types of nodes: (i) leaves are nodes v without children, with up to λ points in Cv; (ii) internal nodes v have four children of half their size covering their parent, and Cv = ∅; (iii) cluster nodes are, as before, nodes v with a single, much smaller, child and with no points in Cv; (iv) finally, a deflated node v has only one child w, possibly much smaller than its parent, and additionally Cv may contain up to λ points. Cluster nodes and deflated nodes are very similar, but they play slightly different roles while collapsing the scaffolding. An example of a quadtree and a 3-deflated version of it is shown in Fig. 4.

Given a quadtree T for Q, a λ-deflated quadtree T′ can be found in linear time: for every node v in T, compute nv = |Bv ∩ Q|. This takes O(n) time with a postorder traversal. Then, T′ is obtained by applying DeflateTree to the root of T (Algorithm 1). Since DeflateTree performs a simple top-down traversal of T, it takes O(n) time.

Algorithm 1 Turning a quadtree into a λ-deflated quadtree
Algorithm DeflateTree(v)
1. If nv ≤ λ, return the tree consisting of v.
2. Let Tv be the subtree rooted at v, and let z be a node in Tv with the smallest value nz such that nz ≥ nv − λ. Note that z could be v.
3. For all children w of z, let Tw = DeflateTree(w).
4. Build a tree T′v by picking v as the root, z as the only child of v, and linking the trees Tw to z. If v ≠ z and Cv is not empty, then v is a deflated node; if v ≠ z and Cv is empty, v is a cluster node. Return T′v as the result.
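A compact Python rendering of DeflateTree may help. The node representation (children list, point count nv, node kind) is ours, and the sketch follows the recursion of Algorithm 1 under the assumption that nv = |Bv ∩ Q| has already been computed for every node by a postorder pass; it is meant to show the structure of the recursion, not the linear-time implementation.

    class Node:
        def __init__(self, box, children=(), nv=0):
            self.box = box                # the square B_v
            self.children = list(children)
            self.nv = nv                  # number of points of Q inside B_v (precomputed)
            self.kind = "leaf"            # becomes "internal", "cluster", or "deflated"

    def deflate_tree(v, lam):
        """Turn the quadtree rooted at v into a lam-deflated quadtree (Algorithm 1 sketch)."""
        if v.nv <= lam:                   # step 1: few enough points, keep v as a leaf
            v.children = []
            return v
        # step 2: among all descendants u with n_u >= n_v - lam (counts only shrink along
        # a root-to-leaf path), pick the one with the smallest count; z could be v itself
        def candidates(u):
            yield u
            for w in u.children:
                if w.nv >= v.nv - lam:
                    yield from candidates(w)
        z = min(candidates(v), key=lambda u: u.nv)
        # step 3: recurse on the children of z
        subtrees = [deflate_tree(w, lam) for w in z.children]
        # step 4: v keeps z as its only child (or keeps the subtrees directly if z == v)
        if z is v:
            v.children = subtrees
            v.kind = "internal"
        else:
            z.children = subtrees
            z.kind = "internal" if z.children else "leaf"
            v.children = [z]
            v.kind = "deflated" if v.nv - z.nv > 0 else "cluster"
        return v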
Lemma 4.1 The λ-deflated quadtree T′ produced by DeflateTree has O(n/λ) nodes.

Proof Let T″ be the subtree of T′ that contains all nodes v with nv > λ, and suppose that every cluster node in T″ has been contracted with its child. We will show that T″ has O(n/λ) nodes, which implies the claim, since no two cluster nodes are adjacent, and because all the non-cluster nodes in T′ which are not in T″ must be leaves. We count the nodes in T″ as follows: (i) since the leaves of T″ correspond to disjoint subsets of Q of size at least λ, there are at most n/λ of them; (ii) the bound on the leaves also implies that T″ contains at most n/λ nodes with at least two children; (iii) the number of nodes in T″ with a single child that has at least two children is likewise bounded; (iv) when an internal node v has a single child w that also has only a single child, then by construction v and w together must contain at least λ points in their cells, otherwise they would not have been two separate nodes. Thus, we can charge λ/2 points from Q to v, and the total number of such nodes is at most 2n/λ.

Lemma 4.2 Let T′ be a k-deflated quadtree for Q. Then every cell of T′ is intersected by O(k) disks in R.

Proof Treat deflated nodes like cluster nodes and note that the center and corners of every box of T′ can be covered by at most k disks. Now the lemma follows from the same arguments as we used in the proof of Lemma 3.1.

Theorem 4.3 Let R = R1, ..., Rn be a sequence of planar disks such that no point is covered by more than k disks. In O(n log n) time and using O(n) space we can build a scaffolding for R that can be collapsed to the Delaunay triangulation for a concrete point set in O(n log k) expected time.

Proof It remains to show how to preprocess T′, the k-deflated quadtree for Q, so that it can be collapsed in time O(n log k). By Lemmas 4.1 and 4.2, the total number of disk-cell incidences in T′ is O(n). Thus, in O(n) total time we can find for each R ∈ R the list of nodes in T′ whose cells it intersects. Next, we determine for each node v in T′ the portion Xv of the original quadtree T inside the cell Cv and build a point location data structure for Xv. Since Xv is a partial quadtree for at most k points, it has complexity O(k), and since the Xv are disjoint, the total space requirement and construction time are linear. This finishes the preprocessing.

To process a point set, we first locate the input points P in the cells of T′ just as in Theorem 3.2. This takes O(n) time. Then we use the point location structures for the Xv to locate P in T in total time O(n log k). Finally we turn T′ into a quadtree for P in time O(n log k), and find the Delaunay triangulation in time O(n), as before.

We now explain how T′ is turned into a quadtree for P: for cells corresponding to leaf nodes, we can just use a standard algorithm for computing quadtrees, which takes O(m log k) time for a cell that contains m input points, since m = O(k) by Lemma 4.2. For cells that correspond to cluster nodes, we must work harder in order to avoid the need for the floor function: suppose v is a cluster node with child w, such that Cv contains the subset P′ of P, with |P′| = m. We sort P′ according to x- and y-coordinates, and then determine a bounding box for Bw and P′. Let ℓ be the side length of this bounding box. Now we find in O(log k) time an integer α ≥ 0 such that |Bw| · 2^{cα} ≤ ℓ and either |Bw| · 2^{cα} = ℓ or |Bw| · 2^{c(α+1)} > ℓ, where |Bw| denotes the size of Bw. Then we set ℓ′ = 2^{c(α+1)} · |Bw| if ℓ > |Bw| · 2^{cα}, and ℓ′ = ℓ otherwise. Align four boxes of side length ℓ′ with Bw (see Fig. 5) and use the result as the bounding box for the quadtree T″ that contains Bw and P′. This ensures that no edge of T″ intersects Bw, because all non-cluster nodes of T″ have size at least 2^c · |Bw|. If during the construction of T″ the box Bw is again contained in a cluster node, we repeat the procedure (without the sorting step), which takes another O(log k) time. However, this cluster node will contain strictly fewer than m points from P′ (because it is significantly smaller than the bounding box of T″), so the total cost for the bounding box computations cannot exceed O(m log k).

There is one subtlety: the bounding box of T″ might intersect the boundary of Bv. If this happens, we clip T″ to Bv. The resulting quadtree is skewed, which means that its cells can be shifted relative to their parent, and some of them may even be clipped. In Appendix B we argue that Theorem 1.2 still holds in this case.

5 Realistic Input Models

For unconstrained planar input regions any algorithm for our problem requires time Ω(n log n) even after preprocessing. This already happens for disks with arbitrary overlap. In the following we show how to obtain better bounds for realistic input models [7]. The results of the previous sections readily generalize, because a point set representing the regions, like the set Q for the disks, often exists in such models, e.g., if the regions are fat. Thus, we immediately get an algorithm for fat regions, for which we also provide a matching lower bound. We then demonstrate how to handle situations where a set like Q cannot be easily constructed by presenting an algorithm for thick regions. While fatness and thickness characterize individual regions, there are also models for sets of regions, in particular low density, for which the same algorithm as for fat regions applies.

5.1 Models

In this section, we consider the following notions of realistic input regions.

Fatness. Let β be a constant with 0 < β ≤ 1. A region R ⊆ R^d is β-fat if for any d-ball D with center in R that does not completely contain R, we have volume(R ∩ D) ≥ β · volume(D), where volume(·) denotes the d-dimensional volume.

Thickness. A region R ⊆ R^d is δ-thick if volume(R) ≥ δ · volume(Dmin(R)), where Dmin(R) denotes the smallest d-ball enclosing R.

Density. Let λ ≥ 1. A set of regions R in R^d is λ-dense if any d-ball D intersects at most λ regions R ∈ R with diam(R) ≥ diam(D), where diam(·) denotes the diameter.
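As a quick sanity check on the planar versions of these definitions, the following short Python snippet computes the thickness of axis-aligned rectangles (area divided by the area of the smallest enclosing circle); the helper is ours and only illustrates the definition, e.g., a square is fairly thick while a long thin rectangle is not.

    import math

    def rectangle_thickness(w, h):
        """Thickness of a w-by-h rectangle in the plane: area(R) / area(Dmin(R)),
        where Dmin(R) is the circumscribed circle with diameter sqrt(w^2 + h^2)."""
        area = w * h
        r_min = math.sqrt(w * w + h * h) / 2.0
        return area / (math.pi * r_min * r_min)

    print(rectangle_thickness(1.0, 1.0))    # square: 2/pi, roughly 0.64
    print(rectangle_thickness(1.0, 10.0))   # elongated rectangle: roughly 0.13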
There is a close connection between fatness and density [32]:

Lemma 5.1 Let R be a set of β-fat regions in R^d such that no point is covered by more than k regions. Then R is (2^d k/β)-dense.

Proof This is a direct generalization of the proof of Theorem 3.1 in [7]: given a d-ball D with radius r, consider the d-ball D′ with the same center and radius 2r. Let R be a β-fat region with diam(R) ≥ diam(D) that intersects D without covering it completely. We can place a d-ball D″ of radius r in D′ with center in R that does not completely contain R. By fatness, R covers a β-fraction of D″ and therefore a (β/2^d)-fraction of D′. There can be at most 2^d k/β such regions, and therefore at most 2^d k/β regions intersect D, as required for low density.

Strong guarding sets. We will use the fact that dense sets of regions can be guarded. Let κ ∈ N, and let R be a set of regions in R^d. A point set Q ⊆ R^d is called a κ-guarding set (against axis-aligned d-cubes) for R if any axis-aligned d-cube not containing a point from Q intersects at most κ regions from R. For instance, the point set Q from the previous sections is a 4-guarding set for disjoint disks [6]. It is also a 4k-guarding set for disks which do not cover any point more than k times. One can show that if an axis-aligned d-cube contains m points of Q, then it intersects O(2^d m) regions in R [6, Theorem 2.8]. If R is λ-dense, this bound can be improved as follows: assume each point in Q is assigned to a region in R. We call Q a κ-strong-guarding set for R if any axis-aligned d-cube containing m points of Q intersects at most κ regions plus the regions assigned to the m points. This definition is motivated by the following relation between density and guardability.

Lemma 5.2 For a λ-dense set of regions R in R^d, the corners of the bounding boxes of R constitute a (λ⌈√d⌉^d)-strong-guarding set (with corners assigned to the corresponding region).

Proof Let Q(R) denote the corners of the bounding boxes of R, and let B be an axis-aligned d-cube. Consider the largest subset R′ ⊆ R such that Q(R′) ∩ B = ∅. Since R′ is still λ-dense, we can use the known relation between density and guardability, namely, that a λ-dense set of regions R′ is (λ⌈√d⌉^d)-guarded by Q(R′) [4, Lemma 2.8], [6, Theorem 3.3]. Thus, B meets at most λ⌈√d⌉^d regions in R′, as claimed.

5.2 Building and Collapsing the Scaffolding for Realistic Input

Since the argument in the proof of Lemma 3.1 (and of Lemma 4.2) is based on axis-aligned boxes, it directly generalizes to the quadtree for a guarding set.

Lemma 5.3 Let R be a set of planar regions and Q a κ-strong-guarding set for R. Let T′ be a κ-deflated quadtree for Q. Then any cell of T′ intersects O(κ) regions.

We say that R is traceable if we can find the m incidences between the n regions in R and the l cells of a deflated quadtree T′ in O(l + m + n) time. For example, this holds for polygonal regions of total complexity O(|R|).

Theorem 5.4 Let R = R1, ..., Rn be a sequence of traceable planar regions with a linear-size κ-strong-guarding set Q, where κ is not necessarily known, but Q is. In O(n log n) time and using O(n) space we can preprocess R into a scaffolding that can be collapsed to a Delaunay triangulation for a concrete point set in O(n log κ) expected time.

Proof If κ were known, we could construct a κ-deflated quadtree for Q, and the theorem would follow directly from Lemma 5.3 and the techniques from Sect. 4. Fortunately, even if we do not know κ, we can find a suitable κ′-deflated tree with κ′ = O(κ) by an exponential search on κ′: for κ′ = 2, 4, 8, ..., we build a κ′-deflated tree and trace the regions of R in the tree. We abort and continue with the next κ′ if there is a cell that intersects more than cκ′ regions, where c is the constant hidden in the O(κ) bound of Lemma 5.3, or if the tracing process takes too long. Otherwise, we have found a suitable deflated quadtree. Since it takes linear time to compute a deflated quadtree from a quadtree, this search needs O(n log n) time.
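The doubling search in the proof of Theorem 5.4 can be sketched as follows in Python. The functions build_deflated_tree and trace_regions stand for the linear-time routines from Sect. 4 and the traceability assumption, so this outlines only the control flow, not those subroutines.

    def find_deflated_tree(quadtree, regions, c):
        """Exponential search over the deflation parameter (proof sketch of Theorem 5.4).
        build_deflated_tree and trace_regions are assumed linear-time primitives; trace_regions
        returns a map cell -> intersecting regions, or None if it exceeds its time budget."""
        kappa = 2
        while True:
            T = build_deflated_tree(quadtree, kappa)              # Sect. 4, O(n) per attempt
            cell_to_regions = trace_regions(T, regions, budget=c * kappa * len(regions))
            if cell_to_regions is not None and \
               all(len(rs) <= c * kappa for rs in cell_to_regions.values()):
                return T, cell_to_regions                          # suitable deflated quadtree found
            kappa *= 2                                             # some cell too crowded: double and retry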
Corollary 5.5 Let R = R1, ..., Rn be a sequence of planar traceable λ-dense regions. In O(n log n) time and using O(n) space we can build a scaffolding for R that can be collapsed to a Delaunay triangulation for a concrete point set in O(n log λ) expected time.

Proof Combine Lemma 5.2 with Theorem 5.4.

Corollary 5.6 Let R = R1, ..., Rn be a sequence of planar traceable β-fat regions such that no point is covered by more than k regions. In O(n log n) time and using O(n) space we can build a scaffolding for R that can be collapsed to a Delaunay triangulation for a concrete point set in O(n log(k/β)) expected time.

Proof This follows from Lemma 5.1 and Corollary 5.5.

Finally, we consider δ-thick regions. Although thickness does not give us a guarding set, we can still obtain results for a set of regions R if the ratio between the largest and smallest region in R is bounded (we measure the size by the radius of the smallest enclosing circle).

Theorem 5.7 Let R be a sequence of n δ-thick k-overlapping regions such that the ratio of the largest and the smallest region in R is r. In O(n log n) time we can preprocess R into a linear-space scaffolding that can be collapsed to the Delaunay triangulation of a concrete point set in time O(n(log(k/δ) + log log r)).

Proof Subdivide the regions into log r groups such that in each group the radii of the minimum enclosing circles differ by at most a factor of 2. For each group Ri, let ρi be the largest radius of a minimum enclosing circle for a region in Ri. We replace every region in Ri by a disk of radius ρi that contains it. This set of disks is O(k/δ)-overlapping, so we can build a data structure for Ri in O(ni log ni) time by Theorem 4.3. To process an input, we handle each group in O(ni log(k/δ)) time and then use Kirkpatrick's algorithm [25] to combine the triangulations in time O(n log log r).
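A minimal sketch of the grouping step in the proof of Theorem 5.7: regions are bucketed by the radius of their smallest enclosing circle so that radii within a bucket differ by at most a factor of 2. The region interface and the bucketing via math.log2 are our own illustration; the per-group scaffoldings and the final merge via Kirkpatrick's algorithm are not shown.

    import math
    from collections import defaultdict

    def group_by_size(regions, radius_of):
        """Bucket regions so that within a group the smallest-enclosing-circle radii
        differ by at most a factor of 2 (grouping step of Theorem 5.7)."""
        groups = defaultdict(list)
        for R in regions:
            groups[math.floor(math.log2(radius_of(R)))].append(R)
        return list(groups.values())

    # Usage sketch: radii 1.0 and 1.9 share a group, 4.0 ends up in another one.
    radii = {"a": 1.0, "b": 1.9, "c": 4.0}
    print(group_by_size(list(radii), radii.get))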
5.3 Lower Bounds

We shall now see that most of the results from the previous section cannot be improved. First, we show that the O(n log(1/β)) bound for disjoint fat regions from Corollary 5.6 is optimal.

Theorem 5.8 For any n and b ∈ {2, ..., n}, there exists a set R of n disjoint planar rectangles of fatness Θ(1/b²) such that it takes Ω(n log b) time to find a Delaunay triangulation for a point set with exactly one point from each region in R in the algebraic computation tree model.

Proof We adapt a lower bound by Djidjev and Lingas [18, Sect. 4]. For this we consider the problem (b, 1)-CLOSENESS: given k = n/b sequences x1, ..., xk, each containing b real numbers in [0, 2b], we need to decide whether any xi contains two numbers with difference at most 1. To see that any algebraic decision tree for (b, 1)-CLOSENESS has depth Ω(n log b), let W ⊆ R^n be defined as

W = {(x1, ..., xk) | |xij − xil| > 1 for 1 ≤ i ≤ k and 1 ≤ j < l ≤ b},

where xij is the jth coordinate of xi. Let W′ = W ∩ [0, 2b]^n. Since W′ has at least (b!)^{n/b} connected components, Ben-Or's lower bound [3, Theorem 5] implies that we need depth at least Ω((n/b) log(b!)) = Ω(n log b), as claimed.

Now, we construct R. Let ε = 1/(3b) and consider the b intervals on the x-axis given by Bε[1/b], Bε[2/b], ..., Bε[1], where Bε[x] denotes the one-dimensional closed ε-ball around x. Extend the intervals into rectangles of fatness Θ(1/b²) with side lengths 2ε and 2b. These rectangles constitute a group; see Fig. 6.

Fig. 6 A group consists of b rectangles of fatness Θ(1/b²)

Now, R consists of k congruent groups G1, ..., Gk, translated in y-direction so that they are sufficiently far away from each other. Let x1, ..., xk be an instance of (b, 1)-CLOSENESS. The input P consists of k sets P1, ..., Pk, one for each xi. Each Pi contains b points, one from every rectangle in Gi:

Pi = {(1/b, xi1), (2/b, xi2), ..., (1, xib)} + (0, yi),

where (0, yi) denotes the translation vector for Gi. Clearly, P can be computed in O(n) time. The argument by Djidjev and Lingas [18, Lemma 2] shows that if the Voronoi cells for Pi do not intersect the y-axis according to the sorted order of xi, then xi contains two numbers with difference less than 1. Thus, by examining the intersections between the y-axis and the Voronoi cells (which can be found in linear time given DT(P)), we can decide (b, 1)-CLOSENESS in linear time from DT(P): either they represent the sorted order of each xi, in which case the answer is easily determined; or they do not, in which case the answer is yes.

Second, we report an example that was shown to us by Mark de Berg and that indicates that the factor of log log r in Theorem 5.7 cannot be improved.

Theorem 5.9 For any n there exists a set R of n disjoint 0.1-thick regions such that it takes Ω(n log log r) time to find the Delaunay triangulation for a point set with exactly one point from each region in R, in the algebraic computation tree model. Here, r is the ratio between the largest and smallest region in R.

Proof Consider the situation in Fig. 7. For k = 1, ..., n, region Rk consists of a 2^k × 2^k square extended by a line segment of length 2^k. Clearly, all the regions Rk are 0.1-thick, and the ratio r is Θ(2^n). Now, if the points in the input all lie in the region bounded by the dashed rectangle, we only know that the inputs are contained in equally spaced line segments. For this case, the lower bound by Djidjev and Lingas [18] that we encountered in the proof of Theorem 5.8 implies that an input needs Ω(n log n) = Ω(n log log r) time in the worst case to be processed, as claimed.

Fig. 7 A lower bound for thick regions

6 Higher Dimensions

Finally, we discuss how our results generalize to higher dimensions. Of course we cannot expect linear worst-case processing time, since in general the complexity of a d-dimensional Delaunay triangulation might be Θ(n^⌈d/2⌉). However, in many typical situations this worst-case behavior does not occur, and we therefore express the processing time in terms of the expected structural change C(P) of a randomized incremental construction of DT(P). The generalization of Theorem 1.2 to higher dimensions says that DT(P) can be computed from the quadtree in time O(C(P)). Again, we need to extend the theorem to skewed quadtrees, which works just like in the planar case (Appendix B).

Now, all of the realistic input models above are defined for any dimension. Consider a sequence of n traceable regions R in any fixed dimension with a linear-size κ-strong-guarding set Q. We can find a κ-deflated quadtree T′ for Q in O(n log n) time. To process an input we can again use T′ to obtain a skewed quadtree for the points in O(n log κ) time. Adding the time for constructing the Delaunay triangulation from the quadtree, the total expected time for a point set P is O(n log κ + C(P)). Thus, while the techniques generalize to higher dimensions, the approach is only useful if C(P) is expected to be linear or near-linear.
7 Conclusions

We give an alternative proof of the result by Löffler and Snoeyink [29] with a much simpler, albeit randomized, algorithm that avoids heavy machinery. For our simplified approach, we need randomization only when we apply Theorem 1.1 to remove the scaffold, and finding a deterministic algorithm for hereditary Delaunay triangulations remains an intriguing open problem. Using quadtrees, we also obtain optimal results for overlapping disks of different sizes and fat regions. Furthermore, we are able to leverage known facts about guarding sets to handle many other realistic input models. Our techniques seem quite specific to Delaunay triangulations, and it is an interesting question whether similar results can be proven for other geometric structures.

Acknowledgements The results in Sect. 2 were obtained at the NICTA Workshop on Computational Geometry for Imprecise Data, December 18-22, 2008, at the NICTA University of Sydney Campus. We would like to thank the other workshop participants, namely Hee-Kap Ahn, Sang Won Bae, Dan Chen, Otfried Cheong, Joachim Gudmundsson, Allan Jørgensen, Stefan Langerman, Marc Scherfenberg, Michiel Smid, Tasos Viglas and Thomas Wolle, for helpful discussions and for providing a stimulating working environment. We would also like to thank David Eppstein for answering our questions about [8] and pointing us to [9]. We would like to thank Mark de Berg for discussing realistic input models with us and for providing Theorem 5.9. Finally, we would like to thank an anonymous reviewer for valuable comments that helped improve the readability of the paper.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Appendix A: Computing Compressed Quadtrees in O(n log n) Time

It is well known that given a planar n-point set P, we can compute a compressed quadtree T for P in O(n log n) time. However, the details of the construction are a bit involved [19], and for the reader's convenience we describe here a possible implementation that achieves the claimed running time on a pointer machine.

The basic algorithm is as follows: sort the points in x- and y-direction, find a bounding box for P, and repeatedly split the boxes in the obvious way. If after c splits of the current box the size of the point set P′ inside it has not decreased, we find a bounding box for P′, create a cluster node, and recursively compute a quadtree for P′.

In order to perform the splitting efficiently, we need to maintain the x- and y-orderings of the points contained in the current box. To do this, we do the splitting in two steps: first in x-direction, then in y-direction. To split in x-direction, we traverse the x-list simultaneously from both ends to find the split sets in time proportional to the size of the smaller set. Let P1 and P2 be the two resulting sets, and let n1 and n2 be their sizes, with n1 ≤ n2. Using appropriate pointers, we find the points of P1 in the y-list and remove them. Now we need to create the sorted y-list for P1: if n1 ≤ √n2, we just sort P1 according to y-coordinate in O(n1 log n1) time. Otherwise, we use a pointer-based radix sort for P1 that takes O(n1) time. For this, we need to maintain an appropriate data structure with each of the y-lists. This data structure is created when a y-list is split off and updated every time the size of a y-list halves, which leads to a linear time overhead.

The total time needed for the splitting in x-direction obeys the recursion

T(n) = T(n1) + T(n2) + O(n1 log n1)   if n1 ≤ √n2,
T(n) = T(n1) + T(n2) + O(n1)          if √n2 ≤ n1 ≤ n2.

This solves to T(n) = O(n log n). The splitting in y-direction is done in the same way, and the total time needed for the construction of the quadtree is O(n log n), as claimed.
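The case distinction that drives this recursion can be illustrated with the following Python fragment. The list handling is deliberately naive (real pointer-based linked lists and the radix-sort bookkeeping are what make the bound work on a pointer machine); the fragment only mirrors the choice between the O(n1 log n1) comparison sort and the O(n1) radix sort on bounded ranks.

    import math

    def radix_sort_by_rank(P1, y_rank):
        """Two-pass LSD radix sort of P1 by y_rank, base |P1|+1; linear in |P1|
        provided all ranks are below base**2, which holds here since |P1| > sqrt(n2)."""
        base = len(P1) + 1
        items = list(P1)
        for digit in (0, 1):
            buckets = [[] for _ in range(base)]
            for p in items:
                buckets[(y_rank[p] // (base ** digit)) % base].append(p)
            items = [p for b in buckets for p in b]
        return items

    def resort_smaller_half(P1, n2, y_rank):
        """Rebuild the y-order of the smaller split set P1 (with |P1| <= n2):
        comparison sort when |P1| <= sqrt(n2), radix sort otherwise."""
        if len(P1) <= math.sqrt(n2):
            return sorted(P1, key=y_rank.get)     # O(n1 log n1)
        return radix_sort_by_rank(P1, y_rank)     # O(n1)

    # Tiny usage example: ranks in the parent's y-order drive the sort.
    ranks = {"p%d" % i: i for i in range(20)}
    print(resort_smaller_half(["p7", "p3", "p15"], 400, ranks))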
Appendix B: From Quadtrees to Delaunay Triangulations

Theorem 1.2 is proven in [11]. However, because of a technicality that is described in Sect. 4, we need a slight generalization of the theorem. Therefore, we first briefly sketch the structure of that proof, and then show how it can be adapted to our needs.

The proof is based on a chain of reductions from Delaunay triangulations to nearest-neighbor graphs to well-separated pair decompositions to quadtrees. The authors show that to compute DT(P), it suffices to find the nearest-neighbor graphs NN(P1), ..., NN(Pt) for each level of an appropriate gradation ∅ = P0 ⊆ P1 ⊆ ··· ⊆ Pt = P with Σ_{i=0}^t |Pi| = O(n). By a classic reduction [12], NN(P1), ..., NN(Pt) can be found in linear time, once we know ε-well-separated pair decompositions (ε-WSPDs) for P1, ..., Pt of linear size and for an appropriate ε. These ε-WSPDs can in turn be derived from compressed quadtrees for these sets. By the assumption of the theorem, we have a quadtree T for P = Pt, and by successively pruning T, we can get quadtrees for P1, ..., Pt−1 in linear time.

Now we come to the adaptation of the proof. In Sect. 4, we defined a skewed quadtree as a standard compressed quadtree, but one where clusters can be shifted relative to their parents and parts of the cluster cells might be clipped. We need to prove that Theorem 1.2 still holds in the case where we are given a skewed quadtree for P, rather than a regular quadtree. Note that all steps outlined above go through unchanged, except that now we need to go from skewed quadtrees to ε-WSPDs. For this, we take the algorithm that computes an ε-WSPD based on a standard compressed quadtree [12, 13] and check that it still works if the quadtree is skewed. We make a key observation about skewed quadtrees first.

Observation B.1 Let v be a node of a skewed quadtree T, and let d be the size its box would have without clipping. Then there is a box B adjacent to Bv (possibly diagonally) such that the area of B is at least cd² for some constant c.

Also, recall the definition of an ε-WSPD for a point set P. It is a collection of pairs {(P1, Q1), ..., (Pm, Qm)} with Pi, Qi ⊆ P such that each pair (Pi, Qi) is well-separated, which means that the diameters of Pi and Qi are both at most ε times the minimum distance between Pi and Qi. Furthermore, we require that every pair (p, q) ∈ P × P of distinct points is in Pi × Qi or Qi × Pi for exactly one i. We call m the size of the ε-WSPD.

Now we are ready to follow the argument. Let T be a skewed quadtree for P, and let us see how to find a well-separated pair decomposition for P of linear size in time O(|P|), given T. Our presentation follows Chan [13], with some small changes to adapt the argument to skewed quadtrees. The WSPD is computed by calling the function wspd(v) on the root v of T. Refer to Algorithm 2.

Algorithm 2 Finding a well-separated pair decomposition
Algorithm wspd(v)
1. If v is a leaf, return ∅.
2. Return the union of wspd(r) for all children r of v and wspd(r1, r2) for all pairs of distinct children r1, r2 of v.
Algorithm wspd(v1, v2)
1. If Bv1 and Bv2 are ε-well-separated, return {(Pv1, Pv2)}.
2. If |Bv1| < |Bv2|, exchange v1 and v2.
3. Otherwise, return the union of wspd(r, v2) for all children r of v1.
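For illustration, here is a direct Python transcription of Algorithm 2. The node interface (children, stored points, box size, and a well-separation test on boxes) is assumed rather than implemented, and no attempt is made to handle the clipped cluster boxes that the subsequent analysis is about.

    def wspd(v, eps):
        """ε-WSPD of the points stored below node v (Algorithm 2, unary case)."""
        if not v.children:                       # leaf
            return []
        pairs = []
        for r in v.children:
            pairs += wspd(r, eps)
        for i, r1 in enumerate(v.children):
            for r2 in v.children[i + 1:]:
                pairs += wspd_pair(r1, r2, eps)
        return pairs

    def wspd_pair(v1, v2, eps):
        """Algorithm 2, binary case: subdivide the larger box until well-separated."""
        if well_separated(v1.box, v2.box, eps):  # assumed predicate on boxes
            return [(v1.points, v2.points)]
        if v1.box.size < v2.box.size:            # always subdivide the larger node
            v1, v2 = v2, v1
        pairs = []
        for r in v1.children:
            pairs += wspd_pair(r, v2, eps)
        return pairs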
Here, Pv denotes the points contained in Bv, the box corresponding to v, and |Bv| is the size of Bv. Clearly, wspd computes an ε-WSPD for P. We need to argue that it has linear size and that the computation takes linear time. For this, it suffices to count the number of calls to wspd(v1, v2) such that Bv1 and Bv2 are not ε-well-separated. By induction, we see that whenever wspd(v1, v2) is called, the size of the box corresponding to the parent of v1 is at least |Bv2|, and the size of the parent box for v2 is at least |Bv1|. Without loss of generality, we assume |Bv1| ≤ |Bv2|, and let r be the parent of v1. As we noted above, we have |Br| ≥ |Bv2|, and by successively subdividing Br in a quadtree fashion (and shifting if necessary), we obtain a box B such that Bv1 ⊆ B ⊆ Br, and such that |Bv2|/2 ≤ |B| ≤ |Bv2|. Since B and Bv2 are not ε-well-separated, B must be within distance O(|B|/ε) of Bv2. To see that there are only constantly many such boxes, we can use a volume argument, but we need to be careful about boxes which are clipped because they are contained in a cluster that is shifted relative to its parent. However, by Observation B.1, every clipped box has an adjacent box which is only cut by a constant fraction c (if at all). Therefore, the number of boxes that are clipped too much can be at most 8 times the number of boxes that are clipped by at most c, so the total number of nearby boxes is still constant. Hence, the total size of the WSPD is linear.



Kevin Buchin, Maarten Löffler, Pat Morin, Wolfgang Mulzer. Preprocessing Imprecise Points for Delaunay Triangulation: Simplified and Extended. Algorithmica, 2011, 674-693. DOI: 10.1007/s00453-010-9430-0