#### Preprocessing Imprecise Points for Delaunay Triangulation: Simplified and Extended

Kevin Buchin · Maarten Löffler · Pat Morin · Wolfgang Mulzer

K. Buchin, Dept. of Mathematics and Computer Science, TU Eindhoven, Eindhoven, The Netherlands
M. Löffler, Computer Science Department, University of California, Irvine, CA 92697, USA
P. Morin, School of Computer Science, Carleton University, Ottawa, Canada
W. Mulzer, Department of Computer Science, Princeton University, Princeton, NJ 08540, USA
points are restricted to a collection R of input regions known in advance. Building on recent work by Löffler and Snoeyink, we show how to leverage our knowledge of R for faster Delaunay computation. Our approach needs no fancy machinery and optimally handles a wide variety of inputs, e.g., overlapping disks of different sizes and fat regions.
1 Introduction
Data imprecision is a fact of life that is often ignored in the design of geometric
algorithms. The input for a typical computational geometry problem is a finite point set
P in R^2, or more generally R^d. Traditionally, one assumes that P is known exactly,
and indeed, in the 1980s and 1990s this was often justified, as much of the input data
was hand-constructed for computer graphics or simulations. Nowadays, however, the
input is often sensed from the real world, and thus inherently imprecise. This leads to
a growing need to deal with imprecision.
An early model for imprecise geometric data, motivated by finite precision of
coordinates, is called ε-geometry [22]. Here, the input is a traditional point set P
and a parameter ε. The true point set is unknown, but each point is guaranteed to lie
in a disk of radius ε. Even though this model has proven fruitful and remains popular
due to its simplicity [2, 23], it may often be too restrictive: imprecision regions could
be more complicated than disks, and their shapes and sizes may even differ from point
to point, e.g., to model imprecision from different sources, independent imprecision
in different input dimensions, etc. In these settings, the extra freedom in modeling
leads to more involved algorithms, but still many results are available [26, 28, 30,
31].
1.1 Preprocessing
The above results assume that the imprecise input is given once and simply has to
be dealt with. While this holds in many applications, it is also often possible to get
(more) precise estimates of the points, but they will only become available later when
there is less time, or they come at a higher cost. For example, in the update complexity
model [10, 21], each data point is given imprecisely at the beginning but can always
be found precisely at a certain price.
One model that has received attention lately is that of preprocessing an imprecise
point set so that some structure can be computed faster when the exact points become
available later. Here, we consider triangulations: let R be a collection of n planar
regions, and suppose we know that the input has exactly one point from each region.
The question is whether we can exploit our knowledge of R to quickly triangulate
the exact input, once it is known. More precisely, we want to preprocess R into a
data structure for the following problem: given a point pi from each region Ri ∈ R,
compute a triangulation of P = {p1, . . . , pn}.1 For reasons that will become clear
below, we call this data structure a scaffolding, and we will refer to the process of
finding a triangulation for P as collapsing the scaffolding for a concrete point set P .
There are many parameters to consider; not only do we want the resources required to
build, store, and collapse the scaffolding to be small, but we would also like to support
general classes of input regions and obtain nice (i.e., Delaunay) triangulations. In
the latter case, we say that we collapse the scaffolding into a Delaunay triangulation
for a concrete point set P .
1We will assume that all inputs are valid, i.e., that pi is in Ri for all i. The complexity of checking this
depends on the complexity and the representation of the regions R1, . . . , Rn.
Held and Mitchell [24] show that if R consists of n disjoint unit disks, it can be
preprocessed in O(n log n) time into a linear-space data structure such that a
triangulation for a point set with exactly one point from every disk can be found in linear
time. This is improved by Löffler and Snoeyink [29], who show that it is possible to
construct a linear-size data structure such that it takes linear time to find the Delaunay
triangulation for a set with one point from every disk. Both results generalize to
regions with limited overlap and limited difference in shape and size: as long as these
parameters are bounded by a constant, the same results hold. However, no attempt is
made to optimize the dependence on the parameters.
In contrast, van Kreveld, Löffler, and Mitchell [27] study the case when R consists
of n disjoint polygons with a total of m vertices, and they obtain an O(m)-space data
structure with O(m log m) preprocessing time so that a triangulation for a point set
with one point in each polygon can be found in linear time. There is no restriction
on the shapes and sizes of the individual regions (they do not even strictly have to
be polygonal), only on the overlap. As these works already mention, a similar result
for Delaunay triangulations is impossible. Djidjev and Lingas [18] show that if the
points are sorted in any one direction, it still takes Ω(n log n) time to compute their
Delaunay triangulation. If R consists of vertical lines, the only information we could
precompute is exactly this order (and the distances, but they can be found from the
order in linear time anyway). All the algorithms above are deterministic.
1.2 Contribution
Our main concern will be Delaunay triangulations. First, we show that the algorithm
by Löffler and Snoeyink [29] can be simplified considerably if we are happy with
randomization and expected running time guarantees. In particular, we avoid the need
for linear-time polygon triangulation [14], which was the main tool in the previous
algorithm.
Second, although there can be no data structure for finding a Delaunay
triangulation for points from arbitrary regions in o(n log n) time, we show that for realistic
input we can get a better dependence on the realism parameters than in [29]. In
particular, we consider k, the largest depth in the arrangement of R, and f, the smallest
fatness of any region in R (defined for a region R as the largest δ such that for any
disk D with center in R that is not fully contained in R, area(R ∩ D) ≥ δ · area(D)). We can
build in O(n log n) time a scaffolding for R of linear size that can be collapsed to
a Delaunay triangulation in time O(n log(k/f)). We also consider t, the smallest
thickness (defined as the fraction of the outcircle of a region occupied by it) of any
region in R, and r , the ratio between the diameters of the largest and the smallest
region in R. With the same preprocessing time and space, we can obtain a
scaffolding that can be collapsed to a Delaunay triangulation in O(n(log(k/t ) + log log r))
time. For comparison, the previous bound is O(nkr^2/t^2) [29]. Finally, we achieve
similar results in various other realistic input models.
We describe two different approaches. The first, which gives the same result
as [29], is extremely simple and illustrates the general idea. The second approach
relies on quadtrees [5, Chap. 14] and is a bit more complicated, but generalizes
easily. As hinted above, we use a technique that has emerged just recently in the
literature [16, 17, 27] and which we call scaffolding: in order to compute many related
structures quickly, we first compute a typical structure, the scaffolding Q, in a
preprocessing phase. To process a given point set, we insert the points into Q and
then remove Q. In the first approach, the scaffolding is a Delaunay triangulation for
a suitable point set, and we will use an algorithm for hereditary Delaunay
triangulations [16] to collapse it efficiently. In particular, we need the following result:
Theorem 1.1 (Chazelle et al. [15], see also [16]) Let P, Q ⊆ R^2 be two planar point
sets with |P ∪ Q| = m, and suppose that DT(P ∪ Q) is available. Then DT(P) can
be computed in expected time O(m).
In the second approach, the scaffolding is an appropriate quadtree. To collapse it,
we employ a recent result that connects quadtrees with Delaunay triangulations:
Theorem 1.2 (Buchin and Mulzer [11]) Let P be a planar n-point set and suppose
that a quadtree T for P is available.2 Then DT(P ) can be computed in expected time
O(n).
We will say more about this theorem in Appendix B, where we also prove a slight
generalization that is needed by our algorithm.
2 Unit Disks: Simplified Algorithm
We begin with a very simple randomized algorithm for the original setting: given a
sequence of n disjoint unit disks R = R1, . . . , Rn , we show how to preprocess R
in O(n log n) time into a linear-space scaffolding that can be collapsed to a Delaunay
triangulation in O(n) expected time.
Let ci denote the center of Ri and for r > 0 let Ri^r be the disk centered at ci
with radius r. The preprocessing algorithm creates a point set Q that for each Ri
contains ci and 7 points equally spaced on ∂Ri^2, the boundary of Ri^2. Then it computes
DT(Q), the Delaunay triangulation of Q, and stores it. Since Q has 8n points, this
takes O(n log n) time (e.g., [5, Sect. 9]). We will need the following useful lemma
about Q.
Lemma 2.1 Let X be a point set with at most one point from each Ri. Any disk D of
radius r contains at most 9(r + 3)^2 points of Q ∪ X.
Proof Let c be the center of D. Any point of Q ∪ X in D comes from some Ri with
‖c − ci‖ ≤ r + 2. The number of such Ri is at most the number of disjoint unit disks
that fit into a disk of radius r + 3. A simple volume argument bounds this by (r + 3)^2.
As each Ri contributes up to 9 points, the claim follows.
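The construction of the scaffolding point set Q is easy to make concrete. The following is a minimal sketch (the function name scaffold_points is ours, not the paper's): for each unit-disk center we emit the center itself plus 7 points equally spaced on the circle of radius 2 around it, giving |Q| = 8n.

```python
import math

def scaffold_points(centers):
    """Scaffolding point set Q: for each unit disk R_i with center c_i,
    take c_i itself plus 7 points equally spaced on the circle of
    radius 2 around c_i (the boundary of R_i^2)."""
    q = []
    for (cx, cy) in centers:
        q.append((cx, cy))
        for j in range(7):
            a = 2 * math.pi * j / 7
            q.append((cx + 2 * math.cos(a), cy + 2 * math.sin(a)))
    return q

Q = scaffold_points([(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)])
# |Q| = 8n; every non-center point lies at distance exactly 2 from its center
```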
Given the sequence P = p1, . . . , pn of precise points, we construct DT(P) by
first inserting P into DT(Q) to obtain DT(Q ∪ P) and then applying Theorem 1.1
2In Sect. 3, we will define precisely what it means for T to be a quadtree for P.
Fig. 1 C′ covers a constant fraction of the boundary of Ri^2 and hence meets Q
to remove Q. To compute DT(Q ∪ P), we proceed as follows: for each point pi
we perform a (breadth-first or depth-first) search among the triangles of DT(Q ∪
{p1, . . . , pi−1}) that starts at some triangle incident to ci and never leaves Ri, until we
find the triangle ti that contains pi. Then we insert pi into DT(Q ∪ {p1, . . . , pi−1})
by making it adjacent to the three vertices of ti and performing Delaunay flipping
[5, Sect. 9.3]. This takes time proportional to the number of triangles visited plus
the degree of pi in DT(Q ∪ {p1, . . . , pi}). The next lemma allows us to bound these
quantities.
Lemma 2.2 Let X be a point set with at most one point from each Rj, and set
Y = Q ∪ X. Let C be a disk whose interior contains no point of Y. If C intersects Ri,
then C is contained in Ri^3.
Proof We prove the contrapositive. Suppose C is a disk that intersects both Ri and
R^2 \ Ri^3. Then C contains a (generally smaller) disk C′ tangent to Ri^1 and to Ri^3;
see Fig. 1. The intersection of C′ with ∂Ri^2 is a circular arc of length 8 arcsin(1/4) >
4π/7. But this means that one of the 7 points on ∂Ri^2 must lie inside C′ ⊆ C, so the
interior of C contains a point of Y.
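The numerical inequality used in this argument can be checked directly: a unit disk whose center is at distance 2 from ci cuts an arc of length 8 arcsin(1/4) ≈ 2.02 out of the circle of radius 2, while the seven equally spaced points leave gaps of only 4π/7 ≈ 1.80.

```python
import math

# Arc of the radius-2 circle around c_i lying inside a unit disk whose
# center is at distance 2 from c_i.  By the law of cosines, the half-angle
# theta at c_i satisfies cos(theta) = 7/8, and acos(7/8) = 2*arcsin(1/4).
half_angle = math.acos(7 / 8)
arc_cut = 2 * 2 * half_angle        # radius 2, full angle 2*theta
gap = 4 * math.pi / 7               # gap between the 7 equally spaced points

# arc_cut > gap, so the cut-off arc must contain one of the 7 points
```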
Lemma 2.2 immediately implies the following:
Lemma 2.3 Let p ∈ Ri be a point of Y. If pq is an edge of DT(Y), then q ∈ Ri^3.
Proof If pq is an edge of DT(Y), there exists a circle C that has p and q on its
boundary and whose interior does not meet Y. Since p ∈ Ri, Lemma 2.2 implies that
C is contained in Ri^3, so q ∈ Ri^3, as claimed.
The next two lemmas bound the number of triangles visited while inserting pi.
Lemma 2.4 Every triangle of DT(Y) that intersects Ri is contained in Ri^3.
Proof Let t be a triangle of DT(Y) that intersects Ri, and let C be its circumcircle. Since C
intersects Ri and its interior contains no point of Y, Lemma 2.2 implies t ⊆ C ⊆ Ri^3, as claimed.
Lemma 2.5 At most 644 triangles of DT(Q ∪ {p1, . . . , pi}) intersect Ri.
Proof The triangles that intersect Ri form a set of faces of a planar graph G. By
Lemma 2.4, every vertex of G lies inside Ri^3, so by Lemma 2.1, there are at most
v = 9(3 + 3)^2 = 324 vertices, and thus at most 2v − 4 = 644 faces.
The final lemma bounds the degree of pi at the time it is inserted.
Lemma 2.6 The degree of pi in DT(Q ∪ {p1, . . . , pi}) is at most 324.
Proof By Lemma 2.3, all the neighbors of pi are inside Ri^3, and by Lemma 2.1 there
are at most 9(3 + 3)^2 = 324 points of Q ∪ {p1, . . . , pi} in Ri^3.
Thus, by Lemmas 2.5 and 2.6, each point of P can be inserted in constant time, so
we require O(n) time to construct DT(Q ∪ P). A further O(n) expected time is then
needed to obtain DT(P) using Theorem 1.1. This yields the desired result.
Theorem 2.7 Let R = R1, . . . , Rn be a sequence of disjoint planar unit disks. In
O(n log n) time and using O(n) space we can build a scaffolding for R that can be
collapsed to a Delaunay triangulation for a precise input in O(n) expected time.
3 Disks of Different Sizes: A Quadtree Approach
We now extend Theorem 2.7 to differently-sized disks using a somewhat more
involved approach. The main idea is to find a quadtree T such that each cell of T meets
a bounded number of regions of R. To process an input P we locate the points of P
in T, compute a quadtree T′ for P, and then apply Theorem 1.2. With the additional
structure of the quadtree we can handle disks of varying sizes.
We define a free quadtree T to be an ordered rooted tree that corresponds to a
hierarchical decomposition of the plane into closed axis-aligned square boxes. Each
node v of T has a square box Bv associated to it, according to the following rules:
1. If w is a descendant of v in T , then Bw is contained in Bv .
2. If v and w are not related, then the interiors of Bv and Bw are disjoint.
In other words, the boxes Bv constitute a laminar family of squares in R2. We say the
size of a node is the side length of its box. With each node v, we associate the cell
Cv that consists of the part of Bv that is not covered by the children of v.3 Any two
distinct cells Cv and Cw have disjoint interiors, and the union of the cells of all nodes
of T covers the root square.
3Note that Cv could be disconnected or empty.
A standard quadtree [20] is a special case of a free quadtree, and in particular has
only two types of nodes: internal nodes v with exactly four children half the size
of v, and leaf nodes without any children. In this section we define a (compressed)
quadtree as a standard quadtree, but with the addition of a third type of node, which
we call cluster node: a node v with just one child w, whose size is smaller than its
parent's by at least a large constant factor 2^c. We require that the horizontal and vertical
distances between the boundary of Bw and the boundary of Bv are either zero or at least
the size of Bw. Figure 2(a) shows an example of a quadtree of this type. Cluster nodes
ensure that the complexity of T can be kept linear [8, Sect. 3.2].
Given a planar point set P , we say that T is a quadtree for P if the following
properties hold:
1. Each leaf v of T contains at most one point of P inside Cv .
2. Other nodes v of T contain no points of P inside Cv .
3. The root box of T contains all points of P .
4. The complexity of T is O(|P |).
Figure 2(b) shows a set of points for which the quadtree is valid.
To apply Theorem 1.2 we need to preprocess R into a quadtree T such that any
cell Cv in T intersects only constantly many disks. While we could consider the disks
directly, we will instead use a quadtree T for a point set Q representing the disks. For
each disk we include in Q its center and top-, bottom-, left- and rightmost point.
Then, a quadtree T for Q can be constructed in O(n log n) time (see Appendix A).
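As an illustration, the following sketch builds the representative set Q (five points per disk) and a plain, uncompressed quadtree over it. It ignores compression, cluster nodes, and the O(n log n) construction of Appendix A; all names are ours.

```python
def representatives(disks):
    """Five representative points per disk (cx, cy, r): the center
    and the top-, bottom-, left- and rightmost points."""
    pts = []
    for (cx, cy, r) in disks:
        pts += [(cx, cy), (cx + r, cy), (cx - r, cy),
                (cx, cy + r), (cx, cy - r)]
    return pts

def build_quadtree(pts, x, y, s, depth=0, max_depth=32):
    """Plain (uncompressed) quadtree over [x, x+s) x [y, y+s):
    split until every leaf holds at most one point."""
    if len(pts) <= 1 or depth == max_depth:
        return {"box": (x, y, s), "points": pts, "children": []}
    h = s / 2
    kids = []
    for (qx, qy) in ((x, y), (x + h, y), (x, y + h), (x + h, y + h)):
        sub = [p for p in pts if qx <= p[0] < qx + h and qy <= p[1] < qy + h]
        kids.append(build_quadtree(sub, qx, qy, h, depth + 1, max_depth))
    return {"box": (x, y, s), "points": [], "children": kids}

def leaf_point_counts(node):
    """Number of points stored in each leaf, in traversal order."""
    if not node["children"]:
        return [len(node["points"])]
    return [c for ch in node["children"] for c in leaf_point_counts(ch)]

disks = [(2.0, 2.0, 1.0), (5.0, 5.0, 1.0)]
Q = representatives(disks)
T = build_quadtree(Q, 0.0, 0.0, 8.0)
```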
Lemma 3.1 Every cell Cv of T is intersected by O(1) disks in R.
Proof There are three types of nodes to consider. First, if v is an internal node with
four children, then Cv is empty and the condition holds trivially. Next, suppose that
v is a leaf node, so Cv = Bv. If a disk D intersects Bv and does not contain a corner
of Bv, then Bv must contain either D's center or one of its four extreme points [6].
Thus, Bv intersects at most 5 disks, one for each corner and one for the point of Q it
contains. See Fig. 3(a) for an example.
Fig. 3 (a) At most 4 disjoint
disks can intersect any given
box B belonging to a leaf of the
quadtree, without one of their
points being inside B. (b) For a
box B belonging to a parent of a
cluster node with box C, slightly
more disks can intersect the
interior of B \ C, but not more
than 4 can cover any of the four
neighboring boxes, so a crude
upper bound is 20
Now suppose v is a cluster node, with child w. Then Cv = Bv \ Bw . We know
there are no points of Q in Cv , and there are at most four disks that have all their
representative points outside Bv . So it remains to count the disks that intersect Bv ,
do not cover a corner of Bv , and have an extreme point or their center in Bw . For
this, consider the at most four orthogonal neighbors of Bw in Bv (i.e., copies of Bw
directly to the left, to the right, above and below Bw ) that lie inside Bv . Using the
same argument as above, each of these neighbors meets at most four disks, and every
disk D with an extreme point or center in Bw that intersects Cv also meets one of the
orthogonal neighbors (if D has no extreme point or center in an orthogonal neighbor
and does not cover any of its corners, it has to cover its center), which implies the
claim, because by our assumption about cluster nodes all the orthogonal neighbors
are completely contained in Bv . Figure 3(b) shows an example involving a cluster
node.
Theorem 3.2 Let R = R1, . . . , Rn be a sequence of disjoint planar disks (not
necessarily all of the same size). In O(n log n) time and using O(n) space we can build a
scaffolding for R that can be collapsed into a Delaunay triangulation for a concrete
point set in O(n) expected time.
Proof We construct Q and the quadtree T for Q as described above. For each Ri
we store a list with the leaves in T that intersect it. By Lemma 3.1, the total size of
these lists, and hence the complexity of the data structure, is linear. Now we describe
how to process an input: let P = p1, . . . , pn be the input sequence. For each pi we
find the node v of T such that pi ∈ Cv by traversing the list for Ri. This takes linear
time. Since each leaf of T contains constantly many input points, we can turn T into
a quadtree for P in linear time. We now compute DT(P ) via Theorem 1.2.4
4 Overlapping Disks: Deflated Quadtrees
We extend the approach to disks with limited overlap. Now R contains n planar
disks such that no point is covered by more than k disks. The parameter k is called
4We are slightly cheating here, because when turning T into a quadtree for P we need to be careful about
handling cluster nodes that contain an input point. We will explain this in the next section, where we
consider a more general setting.
Fig. 4 (a) A set of points and a quadtree for it. (b) A 3-deflated version of the quadtree
the depth of R, and Aronov and Har-Peled [1] showed that k can be approximated
up to a constant factor in O(n log n) time. It is easily seen that finding a Delaunay
triangulation for points from these regions takes Ω(n log k) time in the worst case in
the algebraic decision tree model, and we show that this bound can be achieved.
The general strategy is the same as in Sect. 3. Let Q be the 5n representative
points for R, and let T be a quadtree for Q. As before, T can be found in time
O(n log n) and has complexity O(n). However, the cells of T can now be intersected
by O(k) regions, rather than O(1). Since our data structure stores all the cell-disk
incidences, this means that the space requirement would become O(nk), which is
too large. However, we can avoid this by reducing the complexity of T until it only
has O(n/k) cells, while maintaining the intersection property. Then, we share the
more detailed structures between the regions in R, thus saving space. For this we
introduce the notion of λ-deflated quadtrees.
For a positive integer λ ∈ N, a λ-deflated quadtree T′ for a point set Q has the
same general structure as the quadtrees from the previous section, but it has lower
complexity: each node of T′ can contain up to λ points of Q in its cell, and there are
only O(n/λ) nodes. We distinguish four different types of nodes: (i) leaves are nodes
v without children, with up to λ points in Cv; (ii) internal nodes v have four children
of half their size covering their parent, and Cv = ∅; (iii) cluster nodes are, as before,
nodes v with a single, much smaller, child, and with no points in Cv; (iv) finally,
a deflated node v has only one child w, possibly much smaller than its parent, and
additionally Cv may contain up to λ points. Cluster nodes and deflated nodes are
very similar, but they play slightly different roles while collapsing the scaffolding.
An example of a quadtree and a 3-deflated version of it is shown in Fig. 4.
Given a quadtree T for Q, a λ-deflated quadtree T′ can be found in linear time:
for every node v in T, compute nv = |Bv ∩ Q|. This takes O(n) time with a postorder
traversal. Then, T′ is obtained by applying DeflateTree to the root of T
(Algorithm 1). Since DeflateTree performs a simple top-down traversal of T, it takes
O(n) time.
Algorithm 1 Turning a quadtree into a λ-deflated quadtree
Algorithm DeflateTree(v)
1. If nv ≤ λ, return the tree consisting of v.
2. Let Tv be the subtree rooted in v, and let z be a node in Tv with the smallest value
nz such that nz ≥ nv − λ. Note that z could be v.
3. For all children w of z, let T′w = DeflateTree(w).
4. Build a tree T′v by picking v as the root, z as the only child of v, and linking the
trees T′w to z. If v ≠ z and Cv is not empty, then v is a deflated node; if v ≠ z and
Cv is empty, v is a cluster node. Return T′v as the result.
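A compact sketch of DeflateTree on an abstract tree follows; nodes carry only their point count n and their children, box geometry is omitted, and the dict representation and names are ours. Step numbers refer to Algorithm 1.

```python
def deflate(v, lam):
    """Sketch of DeflateTree.  A node is {'n': number of points of Q in
    its box, 'children': [...]}."""
    if v['n'] <= lam:                       # step 1: v becomes a leaf
        return {'n': v['n'], 'children': []}
    # step 2: among the nodes z in v's subtree with n_z >= n_v - lam,
    # pick one with smallest n_z (z may be v itself); descendants of a
    # node violating the condition violate it too, so we may prune
    best, stack = v, [v]
    while stack:
        w = stack.pop()
        if w['n'] >= v['n'] - lam:
            if w['n'] < best['n']:
                best = w
            stack.extend(w['children'])
    z = best
    kids = [deflate(w, lam) for w in z['children']]   # step 3
    if z is v:                              # v stays an internal node
        return {'n': v['n'], 'children': kids}
    # step 4: z becomes the only child of v (deflated or cluster node);
    # the points of the discarded siblings live in v's cell (at most lam)
    return {'n': v['n'], 'children': [{'n': z['n'], 'children': kids}]}

leaf = lambda: {'n': 1, 'children': []}
root = {'n': 4, 'children': [
    {'n': 2, 'children': [leaf(), leaf()]},
    {'n': 2, 'children': [leaf(), leaf()]},
    {'n': 0, 'children': []}, {'n': 0, 'children': []}]}
T2 = deflate(root, 2)   # 2-deflated version of a tiny quadtree
```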
Lemma 4.1 The tree T′ returned by DeflateTree is a λ-deflated quadtree for Q;
in particular, it has O(n/λ) nodes.
Proof Let T′′ be the subtree of T that contains all nodes v with nv > λ, and suppose
that every cluster node in T′′ has been contracted with its child. We will show that T′′
has O(n/λ) nodes, which implies the claim, since no two cluster nodes are adjacent,
and because all the non-cluster nodes in T′ which are not in T′′ must be leaves. We
count the nodes in T′′ as follows: (i) since the leaves of T′′ correspond to disjoint
subsets of Q of size at least λ, there are at most n/λ of them; (ii) the bound on the
leaves also implies that T′′ contains at most n/λ nodes with at least two children;
(iii) the number of nodes in T′′ with a single child that has at least two children is
likewise bounded; (iv) when an internal node v has a single child w that also has only
a single child, then by construction v and w together must contain at least λ points
in their cells, as otherwise they would not have been two separate nodes. Thus, we can
charge λ/2 points from Q to v, and the total number of such nodes is 2n/λ.
Lemma 4.2 Every cell Cv of T′ is intersected by O(λ + k) disks in R.
Proof Treat deflated nodes like cluster nodes, and note that the center and corners of
every box of T′ can each be covered by at most k disks. Now the lemma follows from the
same arguments as we used in the proof of Lemma 3.1.
Theorem 4.3 Let R = R1, . . . , Rn be a sequence of planar disks such that no point
is covered by more than k disks. In O(n log n) time and using O(n) space we can
build a scaffolding for R that can be collapsed to the Delaunay triangulation for a
concrete point set in O(n log k) expected time.
Proof Let T′ be a k-deflated quadtree for Q. It remains to show how to preprocess T′ so that it can be collapsed in time
O(n log k). By Lemmas 4.1 and 4.2, the total number of disk-cell incidences in T′ is
O(n). Thus, in O(n) total time we can find for each R ∈ R the list of nodes in T′
whose cells it intersects. Next, we determine for each node v in T′ the portion Xv of
the original quadtree T inside the cell Cv and build a point location data structure for
Xv. Since Xv is a partial quadtree for at most k points, it has complexity O(k), and
since the Xv are disjoint, the total space requirement and construction time are linear.
This finishes the preprocessing.
To process a point set, we first locate the input points P in the cells of T′ just as in
Theorem 3.2. This takes O(n) time. Then we use the point location structures for the
Xv to locate P in T in total time O(n log k). Finally, we turn T into a quadtree for P
in time O(n log k), and find the Delaunay triangulation in time O(n), as before.
We now explain how T is turned into a quadtree for P: for cells corresponding
to leaf nodes, we can just use a standard algorithm for computing quadtrees, which
takes O(m log k) time for a cell that contains m points, since m = O(k) by Lemma 4.2.
For cells that correspond to cluster nodes, we must work harder in order to avoid the
need for the floor function: suppose v is a cluster node with child w, such that Cv
contains a subset P′ of the points of P. We sort P′ according to x- and y-coordinates, and then
determine a bounding box for Bw and P′. Let ℓ be the side length of this bounding
box. Now we find in O(log k) time an integer α ≥ 0 such that 2^(cα)|Bw| ≤ ℓ
and either ℓ = 2^(cα)|Bw| or ℓ < 2^(c(α+1))|Bw|, where |Bw| denotes the size of Bw. Then, we
set ℓ′ = 2^(c(α+1))|Bw| if ℓ ≠ 2^(cα)|Bw|, and ℓ′ = ℓ otherwise. Align four boxes of side length ℓ′
with Bw (see Fig. 5) and use the result as the bounding box for the quadtree T′′
that contains Bw and P′. This ensures that no edge of T′′ intersects Bw, because all
non-cluster nodes of T′′ have size at least 2^c|Bw|. If during the construction of T′′
the box Bw is again contained in a cluster node, we repeat the procedure (without
the sorting step), which takes another O(log k) time. However, this cluster node will
contain strictly fewer points from P′ (because it is significantly smaller than
the bounding box of T′′), so the total cost for the bounding box computations cannot
exceed O(m log k). There is one subtlety: the bounding box of T′′ might intersect the
boundary of Bv. If this happens, we clip T′′ to Bv. The resulting quadtree is skewed,
which means that its cells can be shifted relative to their parent, and some of them
may even be clipped. In Appendix B we argue that Theorem 1.2 still holds in this
case.
5 Realistic Input Models
For unconstrained planar input regions, any algorithm for our problem requires time
Ω(n log n), even after preprocessing. This already happens for disks with arbitrary
overlap. In the following we show how to obtain better bounds for realistic input
models [7]. The results of the previous sections readily generalize, because a point set
representing the regions, like the set Q for the disks, often exists in such models,
e.g., if the regions are fat. Thus, we immediately get an algorithm for fat regions, for
which we also provide a matching lower bound. We then demonstrate how to handle
situations where a set like Q cannot be easily constructed by presenting an algorithm
for thick regions. While fatness and thickness characterize individual regions, there
are also models for sets of regions, in particular low density, for which the same
algorithm as for fat regions applies.
5.1 Models
In this section, we consider the following notions of realistic input regions.
Fatness. Let δ be a constant with 0 < δ ≤ 1. A region R ⊆ R^d is δ-fat if for any
d-ball D with center in R and with D \ R ≠ ∅, we have volume(R ∩ D) ≥ δ ·
volume(D), where volume(·) denotes the d-dimensional volume.
Thickness. A region R ⊆ R^d is θ-thick if volume(R) ≥ θ · volume(Dmin(R)),
where Dmin(R) denotes the smallest d-ball enclosing R.
Density. Let ρ ≥ 1. A set of regions R in R^d is ρ-dense if any d-ball D intersects
at most ρ regions R ∈ R with diam(R) ≥ diam(D), where diam(·) denotes the
diameter.
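To make the thickness definition concrete in the plane (d = 2, volume = area), here are two closed-form examples; the helper names are ours.

```python
import math

# Thickness: area(R) divided by the area of the smallest enclosing disk.

def square_thickness(s=1.0):
    """A side-s square has circumradius s*sqrt(2)/2, so its thickness is
    s^2 / (pi s^2 / 2) = 2/pi, independent of s."""
    r = s * math.sqrt(2) / 2
    return s * s / (math.pi * r * r)

def rect_thickness(L):
    """A 1-by-L rectangle: thickness L / (pi (1 + L^2) / 4), which
    tends to 0 as the rectangle becomes more elongated."""
    return L / (math.pi * (1 + L * L) / 4)
```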
There is a close connection between fatness and density [32]:
Lemma 5.1 Let R be a set of δ-fat regions in R^d such that no point is covered by
more than k regions. Then R is (2^d k/δ)-dense.
Proof This is a direct generalization of the proof of Theorem 3.1 in [7]: given a d-ball
D with radius r, consider the d-ball D′ with the same center and radius 2r. Let R
be a δ-fat region with diam(R) ≥ r that intersects D without covering it completely.
We can place a d-ball D′′ of radius r in D′ with center in R and not fully contained in R. By
fatness, R covers a δ-fraction of D′′ and therefore a (δ/2^d)-fraction of D′. There can
be at most 2^d k/δ such regions, and therefore at most 2^d k/δ regions intersect D, as
required for low density.
Strong guarding sets. We will use the fact that ρ-dense sets of regions can be guarded.
Let κ ∈ N, and let R be a set of regions in R^d. A point set Q ⊆ R^d is called a κ-guarding
set (against axis-aligned d-cubes) for R if any axis-aligned d-cube not containing a
point from Q intersects at most κ regions from R. For instance, the point set Q from
the previous sections is a 4-guarding set for disjoint disks [6]. It is also a 4k-guarding
set for disks which do not cover any point more than k times.
One can show that if an axis-aligned d-cube contains m points of a κ-guarding set Q, then it
intersects O(2^d κ m) regions in R [6, Theorem 2.8]. If R is ρ-dense, this bound can be
improved as follows: assume each point in Q is assigned to a region in R. We call Q
a κ-strong-guarding set for R if any axis-aligned d-cube containing m points of Q
intersects at most κ regions plus the regions assigned to the m points. This definition
is motivated by the following relation between density and guardability.
Lemma 5.2 For a ρ-dense set of regions R in R^d, the corners of the bounding boxes
of R constitute a (ρ⌈√d⌉^d)-strong-guarding set (with corners assigned to the
corresponding region).
Proof Let Q(R) denote the corners of the bounding boxes of R, and let B be an
axis-aligned d-cube. Consider the largest subset R′ ⊆ R such that Q(R′) ∩ B = ∅.
Since R′ is still ρ-dense, we can use the known relation between density and
guardability, namely, that a ρ-dense set of regions R′ is (ρ⌈√d⌉^d)-guarded by Q(R′) [4,
Lemma 2.8], [6, Theorem 3.3]. Thus, B meets at most ρ⌈√d⌉^d regions in R′, as
claimed.
5.2 Building and Collapsing the Scaffolding for Realistic Input
Since the argument in the proof of Lemma 3.1 (and of Lemma 4.2) is based on
axis-aligned boxes, it directly generalizes to the quadtree for a guarding set.
Lemma 5.3 Let R be a set of planar regions and Q a κ-strong-guarding set for R.
Let T′ be a λ-deflated quadtree of Q. Then any cell of T′ intersects O(λ + κ) regions.
We say that R is traceable if we can find the m incidences between the n regions
in R and the l cells of a deflated quadtree T′ in O(l + m + n) time. For example, this
holds for polygonal regions of total complexity O(|R|).
Theorem 5.4 Let R = R1, . . . , Rn be a sequence of traceable planar regions with a
linear-size κ-strong-guarding set Q, where κ is not necessarily known, but Q is. In
O(n log n) time and using O(n) space we can preprocess R into a scaffolding that
can be collapsed to a Delaunay triangulation for a concrete point set in O(n log κ) expected time.
Proof If κ were known, we could construct a κ-deflated quadtree for Q, and the
theorem would follow directly from Lemma 5.3 and the techniques from Sect. 4.
Fortunately, even if we do not know κ, we can find a suitable λ-deflated tree with
λ = O(κ) by an exponential search on λ, i.e., for λ = 2, 4, 8, . . . we build a λ-deflated
tree and trace the regions of R in the tree. We abort and continue with the next λ
if there is a box that intersects more than cλ regions for a suitable constant c, or if the
tracing process takes too long. Otherwise, we have found a suitable deflated quadtree.
Since it takes linear time to compute a deflated quadtree from a quadtree, this search
needs O(n log n) time.
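The exponential search in this proof can be sketched generically; build_and_check is a hypothetical stand-in for "build a λ-deflated tree, trace R in it, and verify that no box meets too many regions within the time budget".

```python
def exponential_search(build_and_check, n):
    """Doubling search for the unknown parameter: try lam = 2, 4, 8, ...
    until build_and_check(lam) accepts.  Each attempt is linear-time in
    the paper's setting, so the total cost is O(n log kappa)."""
    lam = 2
    while lam <= 2 * n:
        if build_and_check(lam):
            return lam
        lam *= 2
    return 2 * n   # fallback: every input passes at lam = Theta(n)

# toy stand-in: the (hidden) guarding parameter is 24, so the search
# stops at the first power of two at or above it
found = exponential_search(lambda lam: lam >= 24, 1000)
```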
Corollary 5.5 Let R = R1, . . . , Rn be a sequence of planar traceable ρ-dense
regions. In O(n log n) time and using O(n) space we can build a scaffolding for R that
can be collapsed to a Delaunay triangulation for a concrete point set in O(n log ρ)
expected time.
Proof Combine Lemma 5.2 with Theorem 5.4.
Corollary 5.6 Let R = R1, . . . , Rn be a sequence of planar traceable δ-fat regions
such that no point is covered by more than k regions. In O(n log n) time and using
O(n) space we can build a scaffolding for R that can be collapsed to a Delaunay
triangulation for a concrete point set in O(n log(k/δ)) expected time.
Proof This follows from Lemma 5.1 and Corollary 5.5.
Finally, we consider δ-thick regions. Although thickness does not give us a
guarding set, we can still obtain results for a set of regions R if the ratio between the largest
and smallest region in R is bounded (we measure the size of a region by the radius
of its smallest enclosing circle).
Theorem 5.7 Let R be a sequence of n δ-thick k-overlapping regions such that the
ratio of the largest and the smallest region in R is r. In O(n log n) time we can
preprocess R into a linear-space scaffolding that can be collapsed to the Delaunay
triangulation of a concrete point set in time O(n(log(k/δ) + log log r)).
Proof Subdivide the regions into ⌈log r⌉ groups such that in each group the radii of
the minimum enclosing circles differ by at most a factor of 2. For each group Ri,
let ρi be the largest radius of a minimum enclosing circle for a region in Ri. We
replace every region in Ri by a disk of radius ρi that contains it. This set of disks is
at most (2k/δ)-overlapping, so we can build a data structure for Ri in O(ni log ni)
time by Theorem 4.3. To process an input, we handle each group in O(ni log(k/δ))
time and then use Kirkpatrick's algorithm [25] to combine the triangulations in time
O(n log log r).
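The grouping step in this proof is easy to make concrete: bucketing the regions by the dyadic interval containing their enclosing-circle radius yields groups within which radii differ by at most a factor of 2. A small sketch (function name and representation by radii alone are ours):

```python
import math

def group_by_radius(radii):
    """Partition regions, given by the radii of their minimum enclosing
    circles, into groups whose radii differ by at most a factor of 2.
    Region i goes into bucket floor(log2(radius_i)), i.e., the dyadic
    interval [2^j, 2^(j+1)) containing its radius.  With ratio r between
    the largest and smallest radius, O(log r) buckets are non-empty.
    """
    groups = {}
    for i, rad in enumerate(radii):
        groups.setdefault(math.floor(math.log2(rad)), []).append(i)
    return groups

radii = [1.0, 1.5, 3.0, 3.9, 8.0, 100.0]
g = group_by_radius(radii)
print(sorted(g.keys()))  # -> [0, 1, 3, 6]
print(g[0])              # -> [0, 1]
```

Each group would then be processed by the k-overlapping-disks structure, and the resulting triangulations merged with Kirkpatrick's algorithm, as in the proof.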
5.3 Lower Bounds
We shall now see that most of the results from the previous section cannot be
improved. First, we show that the O(n log(1/β)) bound for disjoint fat regions from
Corollary 5.6 is optimal.
Theorem 5.8 For any n and b ∈ {2, . . . , n}, there exists a set R of n planar disjoint
b⁻²-fat rectangles such that it takes Ω(n log b) time to find a Delaunay
triangulation for a point set with exactly one point from each region in R in the algebraic
computation tree model.
Proof We adapt a lower bound by Djidjev and Lingas [18, Sect. 4]. For this we
consider the problem (b, 1)-CLOSENESS: given k = n/b sequences x1, . . . , xk, each
containing b real numbers in [0, 2b], we need to decide whether any xi contains two
numbers with difference at most 1. To see that any algebraic decision tree for
(b, 1)-CLOSENESS has depth Ω(n log b), let W ⊆ R^n be defined as
W = {(x1, . . . , xk) | |xij − xil| > 1 for 1 ≤ i ≤ k; 1 ≤ j < l ≤ b},
where xij is the jth coordinate of xi. Let W′ = W ∩ [0, 2b]^n. Since W′ has at least
(b!)^{n/b} connected components, Ben-Or's lower bound [3, Theorem 5] implies that we
need depth at least Ω((n/b) log(b!)) = Ω(n log b), as claimed.
Now, we construct R. Let ε = 1/(3b) and consider the b intervals on the x-axis
given by B_ε[1/b], B_ε[2/b], . . . , B_ε[1], where B_ε[x] denotes the one-dimensional
closed ε-ball around x. Extend the intervals into b⁻²-fat rectangles with side
lengths 2ε and 2b. These rectangles constitute a group; see Fig. 6. Now, R consists
Fig. 6 A group consists of b rectangles of fatness b⁻²
of k congruent groups G1, . . . , Gk , translated in y-direction so that they are
sufficiently far away from each other. Let x1, . . . , xk be an instance of (b, 1)-CLOSENESS.
The input P consists of k sets P1, . . . , Pk, one for each xi. Each Pi contains b points,
one from every rectangle in Gi: Pi = {(1/b, xi1), (2/b, xi2), . . . , (1, xib)} + (0, yi),
where (0, yi) denotes the translation vector for Gi. Clearly, P can be computed in
O(n) time. The argument by Djidjev and Lingas [18, Lemma 2] shows that if the
Voronoi cells for Pi do not intersect the y-axis according to the sorted order of xi,
then xi contains two numbers with difference less than 1. Thus, by examining the
intersections between the y-axis and the Voronoi cells (which can be found in linear
time given DT(P)), we can decide (b, 1)-CLOSENESS in linear time from DT(P):
either they represent the sorted order of each xi , in which case the answer is easily
determined; or they do not, in which case the answer is yes.
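For concreteness, the point set P of this reduction can be generated as follows; the vertical spacing of 10·b between groups is an arbitrary stand-in for "sufficiently far away", and the function name is ours.

```python
def build_point_set(xs, b):
    """Build the point set P from an instance x_1, ..., x_k of
    (b, 1)-CLOSENESS, as in the proof of Theorem 5.8: the j-th point of
    group i is (j/b, x_ij), shifted by the group's translation (0, y_i).
    """
    P = []
    for i, x in enumerate(xs):
        yi = i * 10 * b  # translation keeping the groups far apart
        P.extend(((j + 1) / b, xij + yi) for j, xij in enumerate(x))
    return P

# Two sequences with b = 3 entries each, all values in [0, 2b] = [0, 6]:
P = build_point_set([[0.0, 2.5, 5.0], [1.0, 1.4, 4.0]], b=3)
print(len(P))  # -> 6
```

Deciding the instance from DT(P) then amounts to reading off, for each group, the order in which its Voronoi cells meet the y-axis and comparing it with the sorted order of xi, exactly as described above.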
Second, we report an example that was shown to us by Mark de Berg and that
indicates that the factor of log log r in Theorem 5.7 cannot be improved.
Theorem 5.9 For any n there exists a set R of n disjoint 0.1-thick regions such that
it takes Ω(n log log r) time to find the Delaunay triangulation for a point set with
exactly one point from each region in R, in the algebraic computation tree model.
Here, r is the ratio between the largest and smallest region in R.
Proof Consider the situation in Fig. 7. For i = 1, . . . , n, region Ri consists of a
2^i × 2^i square extended by a line segment of length 2^i. Clearly, all the regions Ri
are 0.1-thick, and the ratio r is Θ(2^n). Now, if the points in the input all lie in the
region bounded by the dashed rectangle, we only know that the inputs are contained
in equally spaced line segments. For this case, the lower bound by Djidjev and
Lingas [18] that we encountered in the proof of Theorem 5.8 implies that an input needs
Ω(n log n) = Ω(n log log r) time in the worst case to be processed, as claimed.
6 Higher Dimensions
Finally, we discuss how our results generalize to higher dimensions. Of course, we
cannot expect linear worst-case processing time, since in general the complexity of a
d-dimensional Delaunay triangulation might be Θ(n^⌈d/2⌉). However, in many
typical situations this worst-case behavior does not occur, and we therefore express the
Fig. 7 A lower bound for thick regions
processing time in terms of the expected structural change C(P ) of a randomized
incremental construction of DT(P ). The generalization of Theorem 1.2 to higher
dimensions says that DT(P ) can be computed from the quadtree in time O(C(P )).
Again, we need to extend the theorem to skewed quadtrees, which works just like in
the planar case (Appendix B). Now, all of the realistic input models above are defined
for any dimension. Consider a sequence of n traceable regions R in any fixed
dimension with a linear-size λ-strong-guarding set Q. We can find a λ-deflated quadtree T
for Q in O(n log n) time. To process an input, we can again use T to obtain a skewed
quadtree for the points in O(n log λ) time. Adding the time for constructing the
Delaunay triangulation from the quadtree, the total expected time for a point set P is
O(n log λ + C(P)). Thus, while the techniques generalize to higher dimensions, the
approach is only useful if C(P ) is expected to be linear or near-linear.
7 Conclusions
We give an alternative proof of the result by Löffler and Snoeyink [29] with a much
simpler, albeit randomized, algorithm that avoids heavy machinery. For our simplified
approach, we need randomization only when we apply Theorem 1.1 to remove the
scaffold, and finding a deterministic algorithm for hereditary Delaunay triangulations
remains an intriguing open problem.
Using quadtrees, we also obtain optimal results for overlapping disks of
different sizes and fat regions. Furthermore, we are able to leverage known facts about
guarding sets to handle many other realistic input models. Our techniques seem quite
specific to Delaunay triangulations, and it is an interesting question whether similar
results can be proven for other geometric structures.
Acknowledgements The results in Sect. 2 were obtained at the NICTA Workshop on Computational
Geometry for Imprecise Data, December 18–22, 2008, at the NICTA University of Sydney Campus. We
would like to thank the other workshop participants, namely Hee-Kap Ahn, Sang Won Bae, Dan Chen,
Otfried Cheong, Joachim Gudmundsson, Allan Jørgensen, Stefan Langerman, Marc Scherfenberg, Michiel
Smid, Tasos Viglas and Thomas Wolle, for helpful discussions and for providing a stimulating working
environment. We would also like to thank David Eppstein for answering our questions about [8] and
pointing us to [9]. We would like to thank Mark de Berg for discussing realistic input models with us and
for providing Theorem 5.9. Finally, we would like to thank an anonymous reviewer for valuable comments
that helped improve the readability of the paper.
Open Access This article is distributed under the terms of the Creative Commons Attribution
Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium,
provided the original author(s) and source are credited.
Appendix A: Computing Compressed Quadtrees in O(n log n) Time
It is well known that given a planar n-point set P , we can compute a compressed
quadtree T for P in O(n log n) time. However, the details of the construction are
a bit involved [19], and for the reader's convenience we describe here a possible
implementation that achieves the claimed running time on a pointer machine. The
basic algorithm is as follows: sort the points in x- and y-direction, find a bounding
box for P, and repeatedly split the boxes in the obvious way. If after c splits of the
current box the size of the point set P′ inside it has not decreased, we find a bounding
box for P′, create a cluster node, and recursively compute a quadtree for P′.
In order to perform the splitting efficiently, we need to maintain the x- and
y-orderings of the points contained in the current box. To do this, we do the splitting in
two steps: first in x-direction, then in y-direction. To split in x-direction, we traverse
the x-list simultaneously from both ends to find the split sets in time proportional to
the size of the smaller set. Let P1 and P2 be the two resulting sets, and let n1 and n2 be
their sizes, with n1 ≤ n2. Using appropriate pointers, we find the points of P1 in the
y-list and remove them. Now we need to create the sorted y-list for P1: if n1 ≤ n2^{1/2},
we just sort P1 according to y-coordinate in O(n1 log n1) time. Otherwise, we use a
pointer-based radix sort for P1 that takes O(n1) time. For this, we need to maintain
an appropriate data structure with each of the y-lists. This data structure is created
when a y-list is split off and updated every time the size of a y-list halves, which
leads to a linear time overhead. The total time needed for the splitting in x-direction
obeys the recursion

T(n) = T(n1) + T(n2) + O(n1 log n1), if n1 ≤ n2^{1/2},
T(n) = T(n1) + T(n2) + O(n1), if n2^{1/2} < n1 ≤ n2.

This solves to T(n) = O(n log n). The splitting in y-direction is done in the same
way, and the total time needed for the construction of the quadtree is O(n log n), as
claimed.
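The both-ends scan behind the O(min(n1, n2)) split cost can be illustrated in isolation (our own sketch; the actual construction also maintains cross-pointers between the x- and y-lists, which we omit here):

```python
def split_sorted(xs, pivot):
    """Find the split position of the sorted list xs at pivot by scanning
    from both ends in lockstep.  The loop stops as soon as either scan
    crosses the boundary, so the number of steps is proportional to the
    size of the SMALLER part -- the charging scheme behind the recursion
    T(n) = T(n1) + T(n2) + O(n1) with n1 <= n2.
    Returns (split_index, steps): xs[:split_index] < pivot <= xs[split_index:].
    """
    n = len(xs)
    i, j, steps = 0, n - 1, 0
    while i < n and xs[i] < pivot and j >= 0 and xs[j] >= pivot:
        i += 1   # left scan consumes one element of the left part
        j -= 1   # right scan consumes one element of the right part
        steps += 1
    if i < n and xs[i] < pivot:
        return j + 1, steps   # right part was smaller: boundary found from the right
    return i, steps           # left part was smaller: boundary found from the left

print(split_sorted(list(range(1, 11)), 3))  # -> (2, 2): left part has 2 elements
print(split_sorted(list(range(1, 11)), 9))  # -> (8, 2): right part has 2 elements
```

In both example calls the scan pays only for the two-element side, regardless of which side it is, which is exactly what the recursion requires.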
Appendix B: From Quadtrees to Delaunay Triangulations
Theorem 1.2 is proven in [11]. However, because of a technicality that is described
in Sect. 4, we need a slight generalization of the theorem. Therefore, here we first
briefly sketch the structure of that proof, and then show how it can be adapted to our
needs. The proof is based on a chain of reductions from Delaunay triangulations to
nearest-neighbor graphs to well-separated pair decompositions to quadtrees.
The authors show that to compute DT(P), it suffices to find the nearest-neighbor
graphs NN(P1), . . . , NN(Pt) for each level of an appropriate gradation
∅ = P0 ⊆ P1 ⊆ · · · ⊆ Pt = P with ∑_{i=0}^{t} |Pi| = O(n). By a classic reduction [12],
NN(P1), . . . , NN(Pt) can be found in linear time, once we know ε-well-separated
pair decompositions (ε-WSPDs) for P1, . . . , Pt of linear size and for appropriate ε.
These ε-WSPDs can in turn be derived from compressed quadtrees for these sets. By
assumption of the theorem, we have a quadtree T for P = Pt, and by successively
pruning T , we can get quadtrees for P1, . . . , Pt−1 in linear time.
Now we come to the adaptation of the proof. In Sect. 4, we defined a skewed
quadtree as a standard compressed quadtree, but one where clusters can be shifted
relative to their parents and parts of the cluster cells might be clipped. We need to
prove that Theorem 1.2 still holds in the case where we are given a skewed quadtree
for P , rather than a regular quadtree. Note that all steps outlined above can go through
unchanged, except that now we need to go from skewed quadtrees to ε-WSPDs. For
this, we take the algorithm that can compute an ε-WSPD based on a standard
compressed quadtree [12, 13] and check that it still works if the quadtree is skewed. We
make a key observation about skewed quadtrees first.
Observation B.1 Let v be a node of a skewed quadtree T , and let d be the size that
its box would have without clipping. Then there is a box B adjacent to Bv (possibly
diagonally) such that the volume of B is at least cd² for some constant c.
Also, recall the definition of an ε-WSPD for a point set P . It is a collection of
pairs {(P1, Q1), . . . , (Pm, Qm)} with Pi, Qi ⊆ P such that each pair (Pi, Qi) is
well-separated, which means that the diameters of Pi and Qi are both at most ε
times the minimum distance between Pi and Qi. Furthermore, we require that every
pair (p, q) ∈ P × P of distinct points is in Pi × Qi or Qi × Pi for exactly one i. We
call m the size of the ε-WSPD.
Now we are ready to follow the argument. Let T be a skewed quadtree for P , and
let us see how to find a well-separated pair decomposition for P of linear size in time
O(|P |), given T . Our presentation follows Chan [13], with some small changes to
adapt the argument to skewed quadtrees. The WSPD is computed by calling the
function wspd(v) on the root v of T . Refer to Algorithm 2.
Here, Pv denotes the points contained in Bv, the box corresponding to v, and |Bv|
is the size of Bv. Clearly, wspd computes an ε-WSPD for P . We need to argue that
it has linear size and that the computation takes linear time. For this, it suffices to
count the number of calls to wspd(v1, v2) such that Bv1 and Bv2 are not ε-well-separated.
By induction, we see that whenever wspd(v1, v2) is called, the size of the
box corresponding to the parent of v1 is at least |Bv2| and the size of the parent box
for v2 is at least |Bv1|. Without loss of generality, we assume |Bv1| ≥ |Bv2|, and let
r be the parent of v1. As we noted above, we have |Br| ≥ |Bv2|, and by successively
subdividing Br in a quadtree fashion (and shifting if necessary), we obtain a box B
such that Bv1 ⊆ B ⊆ Br, and such that |Bv2|/2 ≤ |B| ≤ |Bv2|. Since B and Bv2 are
Algorithm 2 Finding a well-separated pair decomposition
wspd(v):
1. If v is a leaf, return ∅.
2. Return the union of wspd(r) and wspd(r1, r2) for all children r and pairs of distinct children r1, r2 of v.
wspd(v1, v2):
1. If Bv1 and Bv2 are ε-well-separated, return {(Pv1, Pv2)}.
2. If |Bv1| < |Bv2|, exchange v1 and v2.
3. Otherwise, return the union of wspd(r, v2) for all children r of v1.
not ε-well-separated, B must be within distance O(|B|/ε) of Bv2. To see that there
are only constantly many such boxes, we can use a volume argument, but we need
to be careful about boxes which are clipped because they are contained in a cluster
that is shifted relative to its parent. However, by Observation B.1, every clipped box
has an adjacent box which is only cut by a constant fraction c (if at all). Therefore,
the number of boxes that are clipped too much can be at most 8 times the number
of boxes that are clipped by at most c, so the total number of nearby boxes is still
constant. Hence, the total size of the WSPD is linear.
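As a concrete companion to Algorithm 2, here is a minimal, brute-force sketch of the wspd recursion on a plain (uncompressed, unskewed) quadtree for distinct points in [0, 1)². The Node class, the quadratic-time well-separation test, and the exact-cover check are our own simplifications; they illustrate the recursion and the exact-cover property, not the linear-time machinery discussed above.

```python
import itertools
import math

class Node:
    """A plain quadtree node over the square [x, x+size) x [y, y+size);
    cells are split until they contain at most one point."""
    def __init__(self, pts, x, y, size):
        self.pts, self.size = pts, size
        self.children = []
        if len(pts) > 1:
            h = size / 2
            for dx, dy in itertools.product((0.0, h), repeat=2):
                sub = [p for p in pts
                       if x + dx <= p[0] < x + dx + h
                       and y + dy <= p[1] < y + dy + h]
                if sub:
                    self.children.append(Node(sub, x + dx, y + dy, h))

def well_separated(A, B, eps):
    # Brute force: diam(A), diam(B) <= eps * dist(A, B).
    diam = lambda S: max((math.dist(p, q) for p in S for q in S), default=0.0)
    d = min(math.dist(p, q) for p in A for q in B)
    return max(diam(A), diam(B)) <= eps * d

def wspd(v, eps):  # wspd(v) of Algorithm 2
    pairs = []
    for r in v.children:
        pairs += wspd(r, eps)
    for r1, r2 in itertools.combinations(v.children, 2):
        pairs += wspd_pair(r1, r2, eps)
    return pairs

def wspd_pair(v1, v2, eps):  # wspd(v1, v2) of Algorithm 2
    if well_separated(v1.pts, v2.pts, eps):
        return [(v1.pts, v2.pts)]
    # Split the larger box; distinct singleton leaves are always
    # well-separated, so the recursion terminates.
    if not v1.children or (v2.children and v1.size < v2.size):
        v1, v2 = v2, v1
    return [p for r in v1.children for p in wspd_pair(r, v2, eps)]

pts = [(0.1, 0.2), (0.15, 0.21), (0.8, 0.75), (0.4, 0.6)]
pairs = wspd(Node(pts, 0.0, 0.0, 1.0), eps=1.0)
# Exact-cover property: each of the 6 unordered point pairs lies in
# exactly one P_i x Q_i.
cover = [frozenset((p, q)) for A, B in pairs for p in A for q in B]
assert len(cover) == len(set(cover)) == 6
```

The exact-cover assertion holds for any ε > 0: at every node, pairs of distinct children partition the point pairs by lowest common ancestor, and the wspd(v1, v2) recursion preserves this partition.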