Clustering Motion

Discrete & Computational Geometry, Mar 2004

Sariel Har-Peled

Sariel Har-Peled, Department of Computer Science, DCL 2111, University of Illinois, 1304 West Springfield Avenue, Urbana, IL 61801, USA

Given a set of moving points in R^d, we show how to cluster them in advance, using a small number of clusters, so that at any time this static clustering is competitive with the optimal k-center clustering at that time. The advantage of this approach is that it avoids updating the clustering as time passes. We also show how to maintain this static clustering efficiently under insertions and deletions. To implement this static clustering efficiently, we describe a simple technique for speeding up clustering algorithms and apply it to achieve faster clustering algorithms for several problems. In particular, we present a linear time algorithm for computing a 2-approximation to the k-center clustering of a set of n points in R^d. This slightly improves the algorithm of Feder and Greene, which runs in Θ(n log k) time (and is optimal in the algebraic decision tree model).

(A preliminary version of this paper appeared in Proceedings of the 42nd Annual IEEE Symposium on Foundations of Computer Science, pages 84-93, 2001.)

1. Introduction

Clustering is a central problem in computer science. It is related to unsupervised learning, classification, databases, spatial range-searching, data-mining, etc. As such, it has received much attention in the last 20 years. There is a large literature on this topic with numerous variants; see [BE] and [DHS].

k-Center Clustering. One of the most natural definitions of clustering is the min-max radius clustering, or k-center clustering. Here, given a set of n points in some metric space, one wishes to find k special points, called centers, such that the maximum distance of a point to a center is minimized. In the continuous k-center clustering the centers can be located anywhere in the underlying space, while in the second variant, known as the discrete k-center, the k centers must be input points. An alternative interpretation of k-center clustering is that we would like to cover the points by k balls, where the radius of the largest ball is minimized. Intuitively, if such a tight clustering exists, it provides a partition of the point set into k classes, where the points inside each class are "similar."

There is a very simple and natural algorithm that achieves a 2-approximation for the discrete k-center clustering [G]: repeatedly pick the point furthest away from the current set of centers as the next center to be added. This algorithm can easily be implemented in O(nk) time [G], and, in fact, it is also a 2-approximation for the continuous case. Feder and Greene [FG] showed that if the points are taken from R^d, one can compute a 2-approximation to the optimal k-center clustering in Θ(n log k) time, by a different implementation of the greedy algorithm mentioned above, and this algorithm is optimal in the algebraic decision tree model. They also showed that computing a c-approximation is NP-hard for c < 1.822.
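To make the greedy algorithm of Gonzalez [G] discussed above concrete, here is a minimal sketch; it assumes the points are rows of a numpy array, and the function name is ours, not the paper's.

```python
import numpy as np

def gonzalez_k_center(points, k):
    """Greedy 2-approximation for k-center [G]: repeatedly promote the point
    furthest from the current centers; runs in O(nk) time."""
    dist = np.linalg.norm(points - points[0], axis=1)   # distances to the first center
    centers = [0]                                        # start from an arbitrary point
    for _ in range(1, k):
        far = int(np.argmax(dist))                       # furthest point becomes a center
        centers.append(far)
        dist = np.minimum(dist, np.linalg.norm(points - points[far], axis=1))
    radius = float(dist.max())                           # clustering radius (<= 2 * ropt)
    labels = np.argmin(np.linalg.norm(
        points[:, None, :] - points[centers][None, :, :], axis=2), axis=1)
    return centers, labels, radius
```

The same routine can serve as the clustering subroutine plugged into the speedup technique of Section 2 and into the linear-time algorithm of Section 6.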
Clustering Motion. Let P[t] be a set of n moving points in R^d, with degree of motion µ; namely, for a point p[t] ∈ P[t] we have p(t) = (p_1(t), . . . , p_d(t)), where p_j(t) is a polynomial of degree µ, and t is the time parameter, for j = 1, . . . , d. If one wishes to answer spatial queries on the moving point set P[t], one needs to construct a data structure for that purpose. Most such data structures for stationary points rely on space-partition schemes, and while such partitions and clusterings are well understood for the case of stationary points, considerably less is known for moving points (see [DG], [AAE], [HV1], [GGH+], and [AH] for recent relevant results). The difficulty in maintaining and computing such a clustering is the one underlying most kinetic data structures [BGH]: once the clusters are computed at a certain time and time progresses, the clustering changes and deteriorates. To remain a competitive clustering (i.e., one whose cluster sizes are small compared with the size of the optimal clustering), one needs to maintain the clustering, by either reclustering the points every once in a while or, alternatively, moving points from one cluster to another. The number of such "maintenance" events dominates the overall running time of the algorithm, and usually the number of such events is prohibitively large. For example, it is easy to verify that a kinetic clustering of n linearly moving points must handle Ω(n^2) events in the worst case to remain competitive (up to any factor) with the optimal k-center clustering at any given time. See Fig. 1. (However, there are indications that such worst-case inputs might not be that common in practice [BGSZ], and that they do not happen for random inputs [BDIZ].)

Our Results. We demonstrate, in Section 2, that if one is willing to compromise on the number of clusters used, then clustering becomes considerably easier (computationally) and can be done quickly. Furthermore, we can trade off between the quality of the clustering and the number of clusters used. As such, one can quickly compute a clustering with a large number of clusters, and cluster those clusters in a second stage, to get a reasonable k-clustering. This relies on the observation that clustering a random sample results in a partial clustering of almost all input points (this is well known; see [VC], [ADPR], and [MOP]). Thus, by clustering the remaining uncovered points, we get a clustering of the whole point set. Furthermore, by using a point-location data structure, one can quickly find the points not covered by the clustering of the random sample. Putting this together results in a generic technique for speeding up clustering. The new approach can be viewed as a reduction of geometric clustering to the task of carrying out point-location queries quickly, together with approximate clustering [I], [ADPR], [MOP] and sampling techniques [VC], [HW]. We provide several applications of the new technique.

This naturally raises the question of whether one can get a similar tradeoff between the computational cost of maintaining a k-center clustering for moving points and the number of centers used. In particular, in Section 3 we show that, for a point set moving with polynomial motion of degree µ, if one is allowed to use k^{µ+1} clusters (instead of k), then one can compute a clustering which is (constant factor) competitive with the optimal k-clustering of the point set at any time, where µ is the algebraic degree of motion of the points. This clustering is static and thus is not required to handle any event. We also show that a static clustering requires k^{µ+1} clusters in the worst case. In Section 4 we present an algorithm to compute this clustering in O(nk) time. In Section 5 an algorithm for picking a "small" subset of the moving points is described.
This is done by computing a very fine clustering and picking a representative from each cluster. The size of this subset, known as a coreset, is independent of n, and it represents the k-center clustering of the moving points at any time, up to a factor of 1 + ε. Namely, instead of clustering the points, we can cluster the representative points. This implies that one can construct a data structure that can report the approximate clustering at any time, and that supports insertions, deletions, and motion updates in poly(k, 1/ε, log n) time, where poly(·) denotes a fixed-degree polynomial in the input parameters. Note that this performance is quite attractive, as the clustering can change completely even if only a few points change their motion, requiring a prohibitive price if one were to just kinetize the algorithm of Gonzalez [G] naively.

Finally, in Section 6 we show how one can implement the greedy clustering algorithm of Gonzalez [G] in expected linear time, for all k = O(n^{1/3}/log n). (In fact, all the expected time bounds in our paper also hold with high probability.) Our algorithm relies on a considerably stronger computation model than that of Feder and Greene [FG]. In particular, the algorithm uses constant-time hashing, the floor function, and randomization. Concluding remarks are given in Section 7.

2. Fast Clustering Using Point-Location

Given a set P of n points in R^d, we are interested in covering them with a set C ⊂ R of k objects (i.e., clusters) that minimizes the target function r(C) = max_{c∈C} r(c), where r(c) is a function that quantifies the size of a cluster, and R is the set of all permissible clusters. For example, in the case of the k-center problem, the clusters are balls, and we wish to minimize the maximum radius of the k balls used to cover the points. Let Copt(P, k) denote the optimal continuous k-clustering of P (i.e., the centers of the clusters can be any points of R^d), and let ropt(P, k) = r(Copt(P, k)) denote the radius of this clustering. Let A = (P, R) denote the underlying clustering space. For example, for the continuous k-center clustering, R is the set of all balls in R^d.

Definition 2.1. A set of clusters C is a partial k-clustering, or just partial clustering, of a point set P if it covers some of the points of P and |C| = k. It is a k-clustering if it covers all the points of P.

Let c be a cluster in a partial clustering C of P. We replace c by the canonical cluster c'; that is, the cluster that realizes min_{c'∈R, c∩P⊆c'} r(c'). For k-center clustering, the canonical cluster of a ball c is the smallest ball that contains P ∩ c. In the following we assume that we work with canonical clusters. In geometric settings a canonical cluster c' cannot shrink any further because it is "locked" in place by a set of points of P, and in fact c' is defined by this set of points. (Imagine a ball that cannot shrink any further: it contains at most d + 1 points of P on its boundary, and it is the smallest ball that contains this set of points.) Let D(c') denote this set of points, called the defining set of c'. In the following we assume that every defining set defines only a constant number of different clusters. Let µ = dim(A) = max_{c∈R} |D(c)| be the dimension of the space A = (P, R). (This is a restricted variant of the concept of VC-dimension, and in fact all our arguments can be carried out using the VC-dimension. However, avoiding VC-dimension arguments slightly simplifies the presentation.)
A canonical k-clustering is a clustering of P by k clusters which are all canonical.

Example 2.2. Consider the clustering space for the discrete k-center clustering. Here a canonical cluster c is defined by its center p ∈ P and the point q ∈ P furthest away from the center (i.e., the length of pq is the radius of c). Thus, a canonical cluster in this setting is defined by a set of two points {p, q}. However, this set corresponds to another canonical cluster as well, the one that has q as its center and whose radius is the length of pq. Observe that the size of the defining set in the discrete case is independent of the dimension of the underlying space R^d.

Lemma 2.3. If the dimension of A = (P, R) is µ = dim(A), then the number of partial canonical k-clusterings of P is at most cρn^{µk}, where n = |P|, c is a constant, and ρ = ρ(A) is a constant that depends on the underlying space A.

Proof. The number of different defining sets of A is at most Σ_{i=0}^{µ} C(n, i) = O(n^µ). Each such set defines at most a constant number ρ = ρ(A) of canonical clusters (usually one, but this constant depends on the space A; see Example 2.2). Thus, the number of canonical clusters is U = O(ρn^µ), and a canonical k-clustering is defined by k such clusters. It follows that the number of canonical k-clusterings of P is at most U^k = O(n^{µk}).

Definition 2.4. For a partial clustering C, let ∪C denote the region covered by the clusters of C. For a parameter ε, a partial canonical clustering C is ε-good for P if the number of points of P not covered by C is no larger than εn. Formally, C is ε-good if |P \ ∪C| < εn.

The following lemma shows that if one clusters a large enough sample of points from P, then this clustering covers almost all the points of P. A similar observation (in a slightly different form) was originally made in [ADPR] and [MOP] (in essence, it can be traced back to the work of Vapnik and Chervonenkis [VC]), and we include it here for the sake of completeness.

Lemma 2.5. Let S be a set of points computed by picking randomly and uniformly m points (with repetition) from P. If

    m ≥ (log ρ + µk log n + β log n) / ε,

then any canonical k-clustering of S is ε-good with probability at least 1 − n^{−β}, where µ = dim(A), ρ = ρ(A), A = (P, R), and ε, β are parameters.

Proof. Let C be a canonical partial k-clustering of P which is not ε-good. If S contains a point that C does not cover, then C cannot be a clustering of S. However,

    Pr[S ⊆ ∪C] ≤ (|P ∩ ∪C| / |P|)^m ≤ (1 − ε)^m ≤ e^{−εm}.

By Lemma 2.3, the number of different canonical k-clusterings of P is at most ρn^{µk}. Thus, the probability that some canonical partial k-clustering of P which is not ε-good is a clustering of S is bounded by ρn^{µk} e^{−εm}. In particular, if we want this probability to be smaller than n^{−β}, it suffices that log ρ + µk log n − εm ≤ −β log n. Solving for m, we have m ≥ (log ρ + µk log n + β log n)/ε.

In the following we are interested in problems where the clustering price is monotone; that is, ropt(P', k) ≤ ropt(P, k) for P' ⊆ P. If we wish to perform k-clustering (or approximate k-clustering) using Lemma 2.5, and we are willing to compromise on the number of clusters used, then we can use the algorithm FastCluster described in Fig. 2:

    Algorithm FastCluster(P, k, ε)
    (P is a set of n points, k is the number of clusters, ε > 0 is a parameter.)
      (i)   Compute a sample S ⊆ P of size O((k log n)/ε).
      (ii)  Compute an (approximately) optimal k-clustering C' of S.
      (iii) Compute P' = P \ ∪C'.
      (iv)  Compute an (approximately) optimal k-clustering C'' of P'.
      (v)   Return C' ∪ C''.
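Here is a minimal sketch of FastCluster as listed above. The routine cluster_algo is a user-supplied stand-in for the (approximate) k-clustering subroutine (for instance, a wrapper around the Gonzalez sketch from the introduction that returns (center, radius) balls), and the point-location of stage (iii) is done by brute force rather than with the data structure of Definition 2.6; all names are illustrative.

```python
import numpy as np

def fast_cluster(points, k, eps, cluster_algo, rng=None):
    """Two-round clustering (Fig. 2): cluster a random sample, find the points
    it misses, cluster the leftovers, and return both sets of clusters."""
    rng = rng or np.random.default_rng()
    n = len(points)
    m = min(n, int(np.ceil(k * np.log(max(n, 2)) / eps)))   # sample size O((k log n)/eps)
    sample = points[rng.integers(0, n, size=m)]              # picked with repetition
    first = cluster_algo(sample, k)                          # stage (ii): balls (center, radius)
    centers = np.array([c for c, _ in first])
    radii = np.array([r for _, r in first])
    # stage (iii): points not covered by any ball of the first clustering
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    leftovers = points[(d > radii[None, :]).all(axis=1)]     # roughly eps*n points (Lemma 2.5)
    second = cluster_algo(leftovers, k) if len(leftovers) else []   # stage (iv)
    return first + second                                    # stage (v): at most 2*phi(k) clusters
```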
The advantage of FastCluster is that it calls a clustering algorithm twice, but in both cases the sets involved are small (one of size O((k log n)/ε), the other of size εn). In fact, FastCluster uses two subroutines: (a) given a (small) subset of the points, compute a k-clustering of it (stages (ii) and (iv)); and (b) given a clustering and a set of points, compute all the points that lie outside the clustering (stage (iii)).

Definition 2.6. A point-location data structure is a data structure that receives a set C of u clusters as input and preprocesses them in TP(u) time, so that given a point p it can decide in O(log u) time whether p lies inside ∪C.

Using such a point-location data structure, it is now possible to compute P' in stage (iii) of FastCluster in O(n log k + TP(k)) time. Indeed, we preprocess C', the set of k clusters computed in stage (ii), for point-location queries; this takes TP(k) time. Then, for each point of P, we can decide in O(log k) time whether it is contained in ∪C'.

Definition 2.7. A clustering C is a (c, m, k)-clustering of P if C is made out of m clusters and C is c-competitive with the optimal k-clustering; namely, r(C) ≤ c · ropt(P, k).

Theorem 2.8. Assume that we have a clustering algorithm that, given a set of m points, can compute an (α, ϕ(k), k)-clustering in TC(m) time, where α and ϕ(·) (the number of clusters generated) depend on the algorithm at hand. Using FastCluster, one can compute an (α, 2ϕ(k), k)-clustering of P in

    O(TC(O((k log n)/ε)) + TC(εn) + TP(ϕ(k)) + n log k)

expected time, where ε > 0 is a parameter that can be fine-tuned in advance to optimize the running time.

The algorithm FastCluster can be viewed as a two-round clustering, and it can be extended to an η-round clustering. Indeed, we cluster a sample from the points which are not currently covered, remove all the points which are covered (using a point-location data structure), and repeat this η times; finally, we cluster the remaining points. The advantage of such an η-round approach is that we can use considerably smaller samples. The disadvantage is that instead of an (α, 2ϕ(k), k)-clustering we get an (α, ηϕ(k), k)-clustering. We summarize the result in the following:

Theorem 2.9. Using FastCluster, one can compute an (α, ηϕ(k), k)-clustering of P in

    O(η·TC(O((k log n)/ε)) + TC(ε^η n) + η·TP(ϕ(k)) + ηn log k)

expected time, where ε > 0 and η are parameters that can be fine-tuned in advance to optimize the running time.

FastCluster needs a subroutine for performing (α, ϕ(k), k)-clustering. However, this can be replaced by an algorithm that extracts a single cluster which has radius smaller than the optimal clustering but contains at least (say) n/4k of the points. Such a subroutine can easily be implemented using random sampling and a brute-force search over all possible canonical clusters of the sample. Using point-location data structures, this results in a greedy clustering with O(k log n) clusters, and in running time O(n log k) (ignoring terms which are polynomial in log n and k). Although this seems to be a stronger algorithm than FastCluster, as it requires a weaker subroutine (i.e., a subroutine to extract one heavy cluster, instead of a subroutine for performing k-clustering), we are currently unaware of a case where this results in a faster algorithm.

For an example of how to use Theorem 2.9, consider the following clustering problem: We are given a set P of n points in the plane, and we would like to cover them with k strips, minimizing the maximum width of a strip. It is known that this problem is NP-hard to approximate within any constant factor [MT].
The fastest approximation algorithm currently known is due to Agarwal and Procopiuc [AP]. It runs in O(nk^2 log^4 n) time and covers the points with O(k log k) strips of width at most six times the width of the optimal cover by k strips. Using Theorem 2.8 we get the following improved result.

Lemma 2.10. Given a set P of n points in the plane, one can compute a cover of P by O(k log k) strips of width at most ropt(P, k), where ropt(P, k) denotes the minimal width of a cover of P by k strips, and k ≤ n^{1/6}. The expected running time is O(n log k).

Proof. Note that a canonical strip is defined by three points of P, and the dimension of the space (P, R) is µ = 3, where R is the set of all strips in the plane. We set ε = √(k/n), and we use the algorithm of Agarwal and Procopiuc [AP], with a sample of size m = O((k log n)/ε) = O(√(nk) log n), as the clustering subroutine; it works in TC(m) = O(mk^2 log^4 m) time (for k^2 log k ≤ m). The algorithm of Agarwal and Procopiuc [AP] returns a clustering by O(k log k) strips of width at most 6·ropt(P, k), and by splitting each strip into six equal strips we get the required clustering by O(k log k) strips of width at most ropt(P, k). Given those strips, preprocessing them for point-location takes TP(k log k) = O(k^2 log^2 k) time using standard techniques [dBvKOS]. Overall, the algorithm computes a clustering by O(k log k) strips of width at most ropt(P, k), and the running time is O(TC(m) + TC(εn) + TP(O(k log k)) + n log k) = O(n log k).

3. Static Clustering of Moving Points

For a moving point set P in R^d, let ropt(P[t], k) denote the radius of the optimal (continuous) k-center clustering of P[t], where P[t] is the set of points of P at time t. Formally,

    ropt(P[t], k) = min { r(C) | C ∈ B^k and P[t] ⊆ ∪C },

where B is the set of all balls in R^d and r(C) = max_{c∈C} r(c). In the following we omit P and k when they are understood from the context. Let tmin = tmin(P, k) be the time at which ropt(P[t], k) is minimized, and let rmin = rmin(P, k) = ropt(P[tmin], k). In the following we assume that the points of P[t] move according to a polynomial motion of degree µ; namely, the position of a point p[t] ∈ P[t] at time t is (p_1(t), . . . , p_d(t)), where p_1(·), . . . , p_d(·) are polynomials of degree at most µ.

Definition 3.1. Let P be a moving point set as above. A partition of P into m sets U = {U_1, . . . , U_m} is a (c, m, k)-static clustering if

    r(U[t]) = max_i r(U_i[t]) ≤ c · ropt(P[t], k)

for any t ∈ R, where r(U_i[t]) is the radius of the smallest ball that contains the points of U_i[t].

In this section we prove the existence of a small (c, m, k)-static clustering for points with a bounded degree of motion, where c and m depend on the dimension and on the degree of motion of the points. To appreciate this result, note that the motion of the points changes the distances between the points continuously: points which are far at a certain time become close, and then far again. Although the motion is not chaotic, the underlying structure of the optimal clustering of the moving points is unstable, changing discretely (and considerably) even if the points move only a short distance. See Fig. 1.

We map each moving point p[t] = (p_1(t), . . . , p_d(t)) to the curve γ = {(p_1(t), . . . , p_d(t), t) | t ∈ R} in d + 1 dimensions, so that the first d coordinates of the intersection point of γ with the axis-parallel hyperplane x_{d+1} = t form the point p[t]. Abusing notation, we refer interchangeably to P as a set of curves in R^{d+1} or as a set of moving points in R^d. In the following we apply a sequence of transformations to each point of P.
As we transform a point, we remember its index in the original set.

Lemma 3.2. Let P' be the set of curves resulting from translating each curve of P horizontally by a vector of length ≤ δ = ρ·rmin(P, k), where ρ ≤ 1. Then, for any (c, m, k)-static clustering of P', the corresponding clustering of P is a ((1 + c)ρ + c, m, k)-static clustering.

Proof. Let U' = {U'_1, . . . , U'_m} be a (c, m, k)-static clustering of P', and let U = {U_1, . . . , U_m} be the corresponding clustering of P. We have by definition that r(U'[t]) = max_{U'∈U'} r(U'[t]) ≤ c·ropt(P'[t], k). Thus,

    r(U[t]) ≤ r(U'[t]) + δ ≤ c·ropt(P'[t], k) + δ ≤ c(ropt(P[t], k) + δ) + δ
            = c·ropt(P[t], k) + (c + 1)δ ≤ c·ropt(P[t], k) + (c + 1)ρ·ropt(P[t], k)
            ≤ ((1 + c)ρ + c)·ropt(P[t], k).

Definition 3.3. For a ball B in R^d and x ∈ R, we refer to the set B × {x} ⊆ R^{d+1} as a horizontal disk. For a moving point set P in R^d, a set D of horizontal disks D_1, . . . , D_m in R^{d+1} is a (c, m, k)-stabbing of P if each of the curves of P passes through one of the disks of D, and r(D) = max_{D∈D} r(D) ≤ c·rmin(P, k). Similarly, a set of points X ⊆ R^{d+1} stabs P if each curve of P contains at least one point of X.

Example 3.4. For a moving point set P, let D = {D_1, . . . , D_k} denote the set of k horizontal d-dimensional disks in R^{d+1} realizing the optimal k-clustering of P at time tmin(P, k). Clearly, D is a (1, k, k)-stabbing of P.

Lemma 3.5. Let P be a set of moving points in R^d, where the algebraic degree of motion is µ, and let D be a (ρ, m, k)-stabbing of P. Then one can translate each curve of P horizontally by a vector of length at most δ = ρ·rmin(P, k), resulting in a set of curves P', such that ropt(P[t], k) ≤ ropt(P'[t], k) + δ ≤ ropt(P[t], k) + 2δ, and P' is stabbed by a set of points X ⊆ R^{d+1} with |X| = m. Moreover, for any (c, m, k)-static clustering of P', the corresponding clustering of P is a ((1 + c)ρ + c, m, k)-static clustering.

Proof. Let D_1, . . . , D_m denote the m horizontal d-dimensional disks of D, and let A_1, . . . , A_m denote the partition of P into sets induced by D_1, . . . , D_m; namely, p is in A_i if p stabs D_i. If a moving point (i.e., a curve) stabs several such disks, we assign it to one of the candidate sets arbitrarily. Let ξ_i denote the center of D_i, and let B_i denote the set of curves resulting from translating each curve of A_i horizontally so that it passes through ξ_i. Let P' = ∪_i B_i. Observe that ropt(P'[t], k) ≤ ropt(P[t], k) + δ and ropt(P[t], k) ≤ ropt(P'[t], k) + δ, as the distance between a point of P and its translated image in P' is at most δ at any time. The last part of the claim now follows from Lemma 3.2.

A natural interpretation of Lemma 3.5 is that it partitions the set of curves P into m families, where each family of curves passes through a common point.

Let P be a set of moving points in R^d, all passing through a common point ξ at time t'. We can consider P to be a set of curves in R^{d+1}, and we are interested in finding a set of horizontal disks that stabs all the curves. Note that this measure is insensitive to translation. In particular, we can translate the curves of P so that (ξ, t') is mapped to the origin. Let P' denote the resulting set. Clearly, P and P' are equivalent, and one can perform the clustering for P' instead of P. Furthermore, at time t = 0 all the points of P' are located at the origin.

Lemma 3.6. Let P be a set of moving points in R^d with algebraic degree of motion µ, such that all of them are at the origin at time t = 0.
Then there exists a mapping f(·) of the points of P into moving points with polynomial motion of degree µ − 1, so that for any A ⊆ P we have r(f(A)[t]) = r(A[t])/|t| (for t ≠ 0). In particular, given a (c, m, k)-static clustering U_1, . . . , U_m of f(P), the partition f^{−1}(U_1), . . . , f^{−1}(U_m) is a (c, m, k)-static clustering of P.

Proof. Let p[t] = (p_1(t), p_2(t), . . . , p_d(t)) be any point of P[t]. We have p[0] = (0, 0, . . . , 0) by assumption; in particular, p_1(0) = · · · = p_d(0) = 0. Thus, all the polynomials p_1(t), . . . , p_d(t) have a zero constant term, and p_i(t)/t is a polynomial of degree at most µ − 1. Let f(p) = (p_1(t)/t, . . . , p_d(t)/t), and let Q = f(P). By the above, Q is a set of points moving with motion of degree at most µ − 1. For any p, q ∈ P, we have ‖p[t]q[t]‖ = ‖f(p)[t] f(q)[t]‖ · |t|. In particular, for any A ⊆ P, we have r(A[t]) = r(f(A)[t]) · |t|, and ropt(P[t], k) = ropt(Q[t], k) · |t|. Thus, given a (c, m, k)-static clustering U_1, . . . , U_m of Q, we have

    r(f^{−1}(U_i)[t]) = r(U_i[t]) · |t| ≤ c · |t| · ropt(Q[t], k) = c · ropt(P[t], k),

for all t ∈ R.

Lemma 3.7. Let P be a set of moving points in R^d, where the degree of motion is µ. There exists a partition of P[t] into m = k^{µ+1} sets P_1, . . . , P_m, so that r(P_i[t]) ≤ (2^{µ+1} − 1)·ropt(P[t], k). Namely, there exists a (2^{µ+1} − 1, k^{µ+1}, k)-static clustering of P.

Proof. The proof is by induction on µ. For µ = 0 the points are stationary, and the claim trivially holds, as the clustering is independent of time; in particular, for µ = 0 we have an (α(0), ϕ(0), k)-static clustering, where α(0) = 1 and ϕ(0) = k.

For µ > 0, by Example 3.4 there exists a (1, k, k)-stabbing of P, and by Lemma 3.5 one can map the points of P (by translating each of them horizontally by a vector of length ≤ rmin(P, k)) into a set P' so that the curves of P' are stabbed by a set of k points (namely, the centers of the horizontal disks of the (1, k, k)-stabbing). Let A_1, . . . , A_k be the partition of P' into sets, so that each set passes through a single common point. By Lemma 3.6 there is a mapping of each A_i into a set of moving points B_i, so that the degree of motion of B_i is µ − 1. By induction each B_i has an (α(µ − 1), ϕ(µ − 1), k)-static clustering, and by Lemma 3.6 the corresponding partition is also an (α(µ − 1), ϕ(µ − 1), k)-static clustering of A_i. In particular, since ropt(A_i[t], k) ≤ ropt(P'[t], k), the clustering resulting from applying this process to each A_i separately is an (α(µ − 1), kϕ(µ − 1), k)-static clustering of P'. However, by Lemma 3.5 (applied with ρ = 1), such a clustering corresponds to a (2α(µ − 1) + 1, k·ϕ(µ − 1), k)-static clustering of P. It now follows that α(µ) = 2α(µ − 1) + 1 = 2·(2^µ − 1) + 1 = 2^{µ+1} − 1, and ϕ(µ) = k^{µ+1}.

In essence, Lemma 3.7 reduces a rather obnoxious clustering problem (clustering of moving points) to a somewhat strange but conventional clustering problem.

Somewhat surprisingly, the k^{µ+1} term in Lemma 3.7 is tight if one is interested in a constant factor approximation. Indeed, let γ_{i,j} be the line passing through the points (i, 0) and (j, 1), for 1 ≤ i, j ≤ k, and let L denote the set of all such lines. This corresponds to a set of linearly moving points in R (i.e., the y-axis is the time axis). Note that ropt(L[0], k) = ropt(L[1], k) = 0. Let L_1, . . . , L_m be any static clustering of L with m < k^2.
One of those static clusters, L_j, must contain at least two lines of L. In particular, max(r(L_j[0]), r(L_j[1])) ≥ 1/2, while ropt(L[0], k) and ropt(L[1], k) are both zero, so the competitive ratio of this clustering is unbounded. This example can easily be extended to a point set with polynomial motion of degree µ, showing that k^{µ+1} static clusters are necessary.

No static clustering can be competitive if insertions and deletions are allowed, even for non-moving points. Indeed, assume that we are given points in an online fashion. Once a point has been received, we have to assign it immediately to one of the static clusters c_1, . . . , c_m, so that at any time this clustering is competitive with the optimal k-clustering. To see why such an algorithm is not possible, let the sequence of points (on the real line) be x_i = (3 + i)^i, for i = 1, . . . , m + 1. Note that the optimal k-clustering of P_i = {x_1, . . . , x_i} has x_i as a singleton (since the distance between x_i and x_{i−1} is larger than the distance between x_{i−1} and x_1). In particular, x_1, . . . , x_{m+1} must all be placed in disjoint clusters, but we are restricted to m clusters, a contradiction. Using the same construction in the other direction shows that no static clustering can support deletions. Interestingly enough, one can combine the static clustering with the kinetic data structure approach so that insertions and deletions are supported; see Section 5. The case where only insertions (and mergings) are allowed, for the stationary case, was investigated by Charikar et al. [CCFM].

One can extend Lemma 3.7 by using tighter stabbings.

Theorem 3.8. Let A be an algorithm that computes a (c, m, k)-stabbing of a moving point set. Then, for a moving point set P with degree of motion µ, one can compute a ((c + 1)^{µ+1} − 1, m^{µ+1}, k)-static clustering of P.

Proof. The proof follows the same inductive argument as Lemma 3.7. For µ = 0 the points are stationary, and A provides a (c, m, k)-static clustering of P. Let α(0) = c and ϕ(0) = m.

For µ > 0, we compute, using A, a (c, m, k)-stabbing of P, and we move each curve of P to the center of its stabbing disk. This partitions the resulting point set P' into m sets A_1, . . . , A_m. By Lemma 3.6 we can treat each A_i as having degree of motion µ − 1. By induction, for each A_i one can compute an (α(µ − 1), ϕ(µ − 1), k)-static clustering. Combining those clusterings results in an (α(µ − 1), ϕ(µ − 1)·m, k)-static clustering of P'. By Lemma 3.2, this is a ((1 + α(µ − 1))c + α(µ − 1), ϕ(µ − 1)·m, k)-static clustering of P. Thus, ϕ(µ) = m^{µ+1}, and

    α(µ) = (1 + α(µ − 1))c + α(µ − 1) = c + (c + 1)·α(µ − 1)
         = c·Σ_{i=1}^{µ} (c + 1)^{i−1} + (c + 1)^µ·α(0)
         = c·Σ_{i=0}^{µ} (c + 1)^i = (c + 1)^{µ+1} − 1.

Overall, this results in a ((c + 1)^{µ+1} − 1, m^{µ+1}, k)-static clustering of P.

Lemma 3.9. Given a (c, m, k)-stabbing D of a moving point set P in R^d, one can compute an (ε, O(m(c/ε)^d), k)-stabbing of P.

Proof. Cover each disk of D by O((c/ε)^d) equal-size disks of radius ≤ ε·rmin(P, k) (for example, by covering the disk by a grid of side-length ≤ (ε/√d)·rmin(P, k)). The resulting set of disks is clearly an (ε, O(m(c/ε)^d), k)-stabbing of P.

Thus, to compute a static clustering of a moving point set, we need to be able to compute a (c, m, k)-stabbing quickly. It is unclear how one can compute quickly (even approximately) a (c, m, k)-stabbing of a moving set of points.
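The refinement step of Lemma 3.9 is a purely local gridding of each stabbing disk. Here is a minimal sketch in the plane (d = 2); the (cx, cy, R, t) tuple representation of a horizontal disk and the explicit value of rmin are assumptions of this sketch, not the paper's notation.

```python
import numpy as np

def refine_stabbing(disks, r_min, eps):
    """Lemma 3.9 in the plane: cover every horizontal disk (cx, cy, R, t) by
    O((R / (eps*r_min))^2) disks of radius eps*r_min centered on a grid."""
    small_r = eps * r_min
    side = small_r / np.sqrt(2.0)            # grid side-length (eps/sqrt(d)) * r_min, d = 2
    refined = []
    for cx, cy, R, t in disks:
        span = int(np.ceil((R + small_r) / side))
        for i in range(-span, span + 1):
            for j in range(-span, span + 1):
                x, y = cx + i * side, cy + j * side
                # keep a grid center only if its small disk can meet the big disk
                if (x - cx) ** 2 + (y - cy) ** 2 <= (R + small_r) ** 2:
                    refined.append((x, y, small_r, t))
    return refined
```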
Nevertheless, it is quite easy to come up with a "slow" algorithm for computing such a stabbing (intuitively, one just enumerates all possible stabbings and returns the best one). Using this together with the speedup technique of Section 2 results in a reasonably fast algorithm for computing the static clustering, as described in the following section.

4. A Motion-Clustering Algorithm

In this section we describe how to compute the static clustering of Section 3 using the techniques of Section 2.

Lemma 4.1. Given a set of n moving points P[t] with algebraic degree of motion µ, and a center ξ[t] ∈ P[t], one can compute the time t when r(P[t], ξ[t]) is minimized, where r(P[t], ξ[t]) = max_{p∈P} ‖p[t]ξ[t]‖. This takes O(λ_{2µ+2}(n) log n) time, where λ_s(n) is the maximum length of a Davenport-Schinzel sequence of order s.

Proof. Let f_i(t) = ‖p_i[t]ξ[t]‖, for i = 1, . . . , n. Clearly, finding the time when the disk centered at ξ[t] that covers all the points of P[t] has minimum radius is equivalent to finding the lowest point on the upper envelope max_i f_i(t). This can be done in O(λ_{2µ+2}(n) log n) time [SA].

Remark 4.2. Lemma 4.1 can be extended to find the optimal radius for a static clustering. Namely, we are given k sets P_1[t], . . . , P_k[t] and respective centers ξ_1[t] ∈ P_1[t], . . . , ξ_k[t] ∈ P_k[t], and we wish to compute the time when the k disks centered at ξ_1[t], . . . , ξ_k[t] cover their respective sets and the maximum radius is minimized. Clearly, this boils down to finding the lowest point on an upper envelope of n functions (assuming |P_1| + · · · + |P_k| = n), and thus can be computed in O(λ_{2µ+2}(n) log n) time.

Definition 4.3. Given a set of points P[t] in R^d, we associate with each pair of points (p, q) ∈ P[t] × P[t] the distance between the two points. The signature of P[t] at time t is the ordering of all these pairs by their length.

Lemma 4.4. Given a point set P[t] whose signature is constant in the interval t ∈ [a, b], the partition induced by the greedy clustering [G] generated for any t ∈ [a, b] is the same, and provides a (2, k, k)-static clustering in this interval. In particular, one can compute a 2-approximation to the minimum of ropt(P[t], k) over the interval [a, b] in O(nλ_{2µ+2}(n) log n) time.

Proof. Since the algorithm of Gonzalez [G] bases its decisions on comparing distances between pairs of points of P[t], it is clear that if the signature does not change, then the resulting partition remains the same for all t ∈ [a, b]. This is a (2, k, k)-static clustering, because at any time t ∈ [a, b] the algorithm of Gonzalez [G] generates a 2-approximation to ropt(P[t], k). Computing the minimum radius of this partition over [a, b] can be done in the time specified, using the algorithm of Remark 4.2.

Lemma 4.5. One can compute O(n^4) sorted intervals that cover the real line, in O(n^4 log n) time, so that inside each such interval P[t] has the same signature.

Proof. Consider a quadruple of points p_1, p_2, q_1, q_2 ∈ P. The equation ‖p_1[t]p_2[t]‖ = ‖q_1[t]q_2[t]‖ has at most 2µ solutions, and those are the times when the pair p_1p_2 can exchange places with q_1q_2 in the signature. Repeating this for all possible quadruples results in all the times when the signature can change. Overall, there are O(n^4) such events, and one can sort them in O(n^4 log n) time.
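The events of Lemma 4.5 are just roots of polynomial equations, one equation per quadruple. Below is a brute-force numerical sketch for linear motion (µ = 1), assuming each moving point is a pair (p0, v) of numpy arrays with p(t) = p0 + t·v; for higher-degree motion one would subtract higher-degree squared-distance polynomials in the same way. The names are illustrative.

```python
import itertools
import numpy as np

def signature_events(points):
    """All times where ||p1(t)p2(t)|| = ||q1(t)q2(t)|| for two pairs of points,
    i.e., the O(n^4) candidate signature changes of Lemma 4.5."""
    def sq_dist_poly(a, b):
        # squared distance of two linearly moving points, as a quadratic in t
        dp, dv = a[0] - b[0], a[1] - b[1]
        return np.array([dv @ dv, 2.0 * (dp @ dv), dp @ dp])   # highest degree first
    events = []
    pairs = list(itertools.combinations(range(len(points)), 2))
    for (i, j), (a, b) in itertools.combinations(pairs, 2):
        diff = sq_dist_poly(points[i], points[j]) - sq_dist_poly(points[a], points[b])
        if np.any(np.abs(diff) > 1e-12):                        # skip identical polynomials
            for root in np.roots(diff):
                if abs(root.imag) < 1e-9:
                    events.append(float(root.real))
    return sorted(events)
```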
Lemma 4.6. Given a set of n moving points P in R^d with algebraic degree of motion µ, one can compute, in O(n^4 λ_{2µ+2}(n) log n) time, a (2, k, k)-stabbing of P.

Proof. Use the algorithm of Lemma 4.5 to compute all the intervals where the signature remains the same, apply to each interval the algorithm of Lemma 4.4, and return the set of disks computed that has minimum radius.

Lemma 4.7. Given a set P of n moving points in R^d, with algebraic degree of motion µ, one can compute a (2, 15k, k)-stabbing of P, for all k = O(n^{1/14}), in O(nk) time.

Proof. We use Theorem 2.9, but for the sake of simplicity we use the naive point-location data structure (i.e., for each curve we scan the disks and decide whether it stabs each disk or not). We set η = 14 and ε = (k/n)^{1/(η+1)}. The samples used by the algorithm are of size m = O((k log n)/ε) = O(n^{1/(1+η)} k^{η/(1+η)} log n). Using the algorithm of Lemma 4.6, such a sample can be 2-approximately k-clustered in

    TC(m) = O(m^4 λ_{2µ+2}(m) log m) = O(m^{5.5}) = O((n^6 k^{6η})^{1/(1+η)}) = O((n^{6+6η/14})^{1/(1+η)}) = O(n^{(84+6η)/(14(1+η))}) = O(n^{168/210}) = O(n)

time, as k = O(n^{1/14}) and η = 14. Thus, the overall running time of the algorithm of Theorem 2.9 is O(nk + (η + 1)·TC(m)) = O(nk), since we spend O(k) time on each point-location query and we perform O(ηn) point-location queries.

Lemma 4.7 can easily be improved by using more advanced data structures. However, since such data structures are rather complicated, we opted for a slower algorithm which is still reasonably fast. Putting Lemma 4.7 together with Theorem 3.8, we have:

Theorem 4.8. Let P be a set of moving points in R^d, where the algebraic degree of motion is µ. One can compute a (3^{µ+1} − 1, (15k)^{µ+1}, k)-static clustering of P in O(nk) time, for k = O(n^{1/14}).

5. Using Coresets to Perform Insertions and Deletions

In this section we show how to maintain the static clustering under insertions and deletions.

Theorem 5.1. Let P be a set of moving points in R^d with algebraic degree of motion µ, and let 0 < ε < 1/2 be a parameter. One can compute an (ε, O((k/ε^d)^{µ+1}), k)-static clustering of P in O(nk/ε^d) time, for all k = O(n^{1/14} ε^d).

Proof. We convert the (2, 15k, k)-stabbing of Lemma 4.7 into an (ε/c, O(k/ε^d), k)-stabbing of P, using Lemma 3.9, where c is a constant to be specified shortly. Using such a stabbing in the algorithm of Theorem 3.8 results in an (a, b, k)-static clustering U of P, where a = (1 + ε/c)^{µ+1} − 1 and b = O((k/ε^d)^{µ+1}). For c = 2(µ + 1) we have

    a = (1 + ε/c)^{µ+1} − 1 ≤ exp(ε(µ + 1)/(2(µ + 1))) − 1 ≤ 1 + 2·(ε/2) − 1 = ε,

since 1 + x ≤ e^x ≤ 1 + 2x for 0 ≤ x ≤ 1/2. Thus, U is an (ε, O((k/ε^d)^{µ+1}), k)-static clustering of P.

Theorem 5.1 implies that one can extract a small subset of P that represents the point set, up to a factor of ε, as far as k-clustering at any time is concerned.

Definition 5.2. Let P be a point set in R^d, and let 0 < ε < 1/2 be a parameter. A set Q ⊆ P is an ε-coreset of P if for any set of k balls C that covers Q, we have that P is covered by C(ε·ropt(P, k)), where C(δ) = {ball(p, r + δ) | ball(p, r) ∈ C} and ball(p, r) denotes the ball of radius r centered at p. If P is a set of moving points, then Q ⊆ P is an ε-coreset of P if Q[t] is an ε-coreset of P[t] for all t. For recent work on coresets, see [AH], [HV2], and [BHI].

Lemma 5.3. For P, k as above, and 0 < ε < 1/2, the following hold:
(i) If Q is an ε-coreset of P, then (1 − ε)·ropt(P, k) ≤ ropt(Q, k) ≤ ropt(P, k) ≤ (1 + 2ε)·ropt(Q, k).
(ii) If Q is an ε-coreset of P and Q' is an ε-coreset of P', then Q ∪ Q' is an ε-coreset of P ∪ P'.
(iii) If Q_1 ⊆ Q_2 ⊆ · · · ⊆ Q_m, where Q_i is an ε-coreset of Q_{i+1}, for i = 1, . . . , m − 1, then Q_1 is a δ-coreset of Q_m, where δ = εm·e^{2εm}.

Proof. (i) Clearly, ropt(Q, k) ≤ ropt(P, k), as Q ⊆ P.
Furthermore, ropt(Q, k) + ε·ropt(P, k) ≥ ropt(P, k), as we can convert the optimal k-clustering of Q into a clustering of P by expanding each ball by ε·ropt(P, k). Thus, ropt(Q, k) ≥ (1 − ε)·ropt(P, k). Also, (1 + 2ε)·ropt(Q, k) ≥ (1 + 2ε)(1 − ε)·ropt(P, k) ≥ ropt(P, k), as ε < 1/2.

(ii) Let C be a k-clustering of Q ∪ Q'. By definition, P ⊆ C(ε·ropt(P, k)) and P' ⊆ C(ε·ropt(P', k)), as C is a k-clustering of Q and of Q' separately. Finally, ropt(P, k), ropt(P', k) ≤ ropt(P ∪ P', k). Thus, P ∪ P' ⊆ C(ε·ropt(P ∪ P', k)).

(iii) Let C_1 be a clustering of Q_1 by k balls, and let C_i = C_{i−1}(ε·ropt(Q_i, k)), for i = 2, . . . , m. By induction Q_m ⊆ C_m. Furthermore, C_m = C_1(ρ), where

    ρ = Σ_{i=2}^{m} ε·ropt(Q_i, k) ≤ ε·Σ_{i=2}^{m} (1 + 2ε)^{m−i}·ropt(Q_m, k) ≤ εm·e^{2εm}·ropt(Q_m, k) = δ·ropt(Q_m, k),

as 1 + x ≤ e^x for 0 ≤ x ≤ 1, and δ = εm·e^{2εm}. We conclude that Q_m ⊆ C_1(δ·ropt(Q_m, k)), and thus Q_1 is a δ-coreset of Q_m.

Note that Lemma 5.3 holds verbatim if the point sets are moving.

Theorem 5.4. Let P be a set of moving points in R^d with algebraic degree of motion µ. One can compute an ε-coreset of P of size O((k/ε^d)^{µ+1}) in O(nk/ε^d) time.

Proof. Compute an (ε/2, O((k/ε^d)^{µ+1}), k)-static clustering of P using the algorithm of Theorem 5.1. Next, pick arbitrarily a representative point from each static cluster, and let Q be the resulting point set. It is now easy to verify that Q is an ε-coreset of P.

Note that we can now maintain the k-clustering of a moving point set by computing its ε-coreset, and clustering the coreset whenever we want the exact clustering.

Theorem 5.5. One can maintain an ε-coreset for k-clustering of a point set moving with bounded polynomial motion in R^d. The operations insertion, deletion, and motion update can all be handled in poly(k, 1/ε, log n) time, where n is the maximum number of points in the data structure at any time. Furthermore, one can extract a (2 + ε)-approximate k-clustering, at any time, within the same time bound.

Proof. Clearly, once we have an ε-coreset of the current point set in the data structure, we can report the (2 + ε)-approximate clustering by computing it directly from the coreset using the algorithm of Gonzalez [G]. This takes O(kM) time, where M is the size of the coreset. So, we need to show how to maintain the ε-coreset of a moving point set under insertions and deletions (motion updates can be performed by a deletion followed by an insertion).

The basic idea was introduced in [AH]: Let T be a balanced binary tree that stores the points in its leaves, and store inside each interior node v of T a coreset of the points stored in the subtree of v; let this set be denoted by coreset(v). If u and w are the two children of v, we can compute coreset(v) by computing a δ-coreset of coreset(u) ∪ coreset(w), using the algorithm of Theorem 5.4, where δ is to be specified shortly. Insertions and deletions can be performed by doing rotations in T, and we can always fix the coresets of the changed nodes by recomputing them from the coresets of their children. Clearly, all this can be done while keeping the tree depth smaller than 5 log n, and the running time is O(poly(1/δ, k, log n)).

We now apply Lemma 5.3(iii), with m = 5 log n and δ the approximation factor used in the tree. Clearly, coreset(root(T)) is a ρ-coreset of the points stored in T, where ρ = δm·e^{2δm} = 5δ log n·e^{10δ log n}. In particular, setting δ = ε/(20 log n), we have ρ = (ε/4)·e^{ε/2} ≤ (ε/4)·e^{1/2} ≤ ε. Namely, coreset(root(T)) is an ε-coreset of all the points stored in the tree. As for the running time, one can verify that all the operations can be performed in O(poly(1/ε, k, log n)) time. Finally, computing the k-clustering from the root coreset takes O(kM) = O(poly(k, 1/ε, log n)) time, as M = O(poly(k, 1/ε, log n)).
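Below is a minimal sketch of the coreset tree used in the proof above. It assumes a routine coreset(pts, delta, k) that returns a δ-coreset of a small point set (for instance, one representative per cluster of the static clustering of Theorem 5.4); rebalancing, deletions, and the bookkeeping for moving points are omitted, and all names are illustrative.

```python
import math

class CoresetTree:
    """Leaves store points; every internal node stores a delta-coreset of its
    subtree, recomputed from its children's coresets (Theorem 5.5, sketch)."""
    def __init__(self, k, eps, n_max, coreset):
        self.k, self.coreset = k, coreset
        # delta = eps / (20 log n), so the root is an eps-coreset by Lemma 5.3(iii)
        self.delta = eps / (20.0 * max(1.0, math.log(n_max)))
        self.root = None

    def insert(self, p):
        self.root = self._insert(self.root, p)

    def eps_coreset(self):
        return [] if self.root is None else self.root["pts"]

    def _insert(self, node, p):
        if node is None:                          # empty subtree: create a leaf
            return {"pts": [p], "left": None, "right": None, "size": 1}
        if node["left"] is None:                  # leaf: split into two leaves
            node = {"left": self._insert(None, node["pts"][0]),
                    "right": self._insert(None, p), "size": 2}
        else:                                     # recurse into the smaller child
            side = "left" if node["left"]["size"] <= node["right"]["size"] else "right"
            node[side] = self._insert(node[side], p)
            node["size"] = node["left"]["size"] + node["right"]["size"]
        # recompute this node's coreset from the children's coresets
        node["pts"] = self.coreset(node["left"]["pts"] + node["right"]["pts"],
                                   self.delta, self.k)
        return node
```

A deletion would locate the leaf of the deleted point and recompute the coresets along its root path, exactly as the proof describes.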
The solution of Theorem 5.5 might be of limited use in practice, as sometimes we would like every point to be marked with the cluster currently containing it. One option is to take the ε-coreset maintained by the data structure of Theorem 5.5 and explicitly maintain its clustering, using a kinetic data structure (KDS), as done in the following (naive) result. For more information about KDSs and the relevant definitions, see [BGH].

Lemma 5.6. Let P be a set of moving points in R^d. One can maintain a 2-approximate k-center clustering of P, such that at any time there are O(nk) active certificates. Furthermore, when one of the certificates is violated, the clustering and the associated set of certificates can be recomputed in O(nk) time. Moreover, if the points move with polynomial motion of degree µ, then there are at most O(n^4) events throughout the execution of the algorithm.

Proof. The algorithm follows the naive approach: we run Gonzalez's algorithm [G] on P at the current time t, and we compute for each comparison of the type ‖x[t]y[t]‖ ≤ ‖z[t]w[t]‖ a certificate (one can implement Gonzalez's algorithm so that it uses only this kind of comparison), where x, y, z, w ∈ P. Whenever one of these certificates is violated, we stop and recompute the clustering and the set of certificates. The bound on the number of events follows immediately from Lemma 4.6. Note that at any point in time we know, for every point, which cluster it belongs to, as well as the current set of clusters.

Note that this result is far from satisfactory from a KDS point of view, as the number of events and the time spent on each event are prohibitive. However, using coresets we can trade off between accuracy and the number of events. Namely, we can maintain the k-clustering of the coreset using Lemma 5.6. This results in a KDS handling O(n·poly(k, 1/ε, log n)) events (instead of O(n^4)) that maintains a (2 + ε)-approximate k-clustering. The details are straightforward, and we omit them here.

6. Approximate k-Center Clustering

In this section we show how one can compute a 2-approximate clustering in expected linear time. We achieve this by using the speedup technique of Section 2, together with some ideas from Feder and Greene [FG]. Observe that in some clustering problems one can replace the point-location data structure by an approximate point-location data structure. In this case we are allowed to inflate the clusters slightly, and the data structure has to decide whether the points are inside the inflated clusters (in fact, we allow uncertainty in this region).

Definition 6.1. A γ-approximate point-location data structure is a data structure that receives a set C of u clusters as input and preprocesses them in TPA(u) time, so that given a point p it can decide in O(TQ(u)) time whether p lies inside ∪C, or alternatively lies at distance larger than γ·r(C) from ∪C. If the point is in neither region, returning either answer (inside/outside) is allowed.

In low dimensions, constructing such an approximate point-location data structure is quite easy, as the following lemma demonstrates.
Definition 6.2. Given a set Q of balls in R^d, all of radius at most l, let G be the uniform grid with cells of side-length l, let Z(Q, l) denote the set of cubes of G that intersect the balls of Q, and let UG(Q, l) = ∪_{z∈Z(Q,l)} z denote the union of those cubes. Note that if a point is inside UG(Q, l), then it is at distance at most (1 + √d)·l from one of the centers of Q.

Lemma 6.3. Given a set Q of k balls in R^d of radius at most l, one can preprocess them in O(k) time, so that given a query point, one can decide in O(1) time whether the point is inside UG = UG(Q, l). In particular, given a set P of n points, one can compute all the points of P \ UG in O(n + k) time.

Proof. Compute Z(Q, l) in O(k) time. Since all the cubes of Z(Q, l) are cells of a grid, one can answer a point-location query that decides whether or not a point q is inside UG in O(1) time, by using a hash table to check whether the grid cell containing q belongs to Z(Q, l).

Lemma 6.4. Given a set P of n points in R^d, one can compute a (4 + 2√d, k, k)-clustering of P in O(n) expected time, for k ≤ n^{1/3}/log n.

Proof. We implement FastCluster using the algorithm of Gonzalez [G] as the clustering subroutine; Gonzalez's algorithm computes a (2, k, k)-clustering of m points in O(mk) time. We set ε = √(k/n) and use a sample of size m = O((k log n)/ε) = O(√(nk) log n). Thus, this stage takes O(mk) = O(√n·k^{3/2} log n) = O(n) time, and we compute a k-clustering C' of the sample. Next, for the point-location stage, we use the data structure of Lemma 6.3. This yields a ((√d + 1)·l, 2k, k)-clustering C of P, where l = r(C')/ropt(P, k) ≤ 2·ropt(P, k)/ropt(P, k) = 2. Thus, C is a (2√d + 2, 2k, k)-clustering of P. By clustering representative points from each cluster of C (each point of P is at distance ≤ (2√d + 2)·ropt(P, k) from its closest representative point) and applying Gonzalez's algorithm [G], we get a (2√d + 4, k, k)-clustering of P in O(n + k^2) = O(n) expected time overall.
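Below is a minimal sketch of the grid filter of Lemma 6.3 as it is used in the proof above. Balls are (center, radius) pairs with radius at most l, a Python set of cell coordinates plays the role of the hash table, and for simplicity every cell meeting a ball's bounding box is kept; this is a superset of Z(Q, l), which only enlarges UG and is harmless for approximate point-location. Function names are illustrative.

```python
import itertools
import numpy as np

def grid_cells_covering(balls, l):
    """Cells of the side-length-l grid kept for U_G: all cells meeting the
    axis-parallel bounding box of some ball (a superset of Z(Q, l))."""
    cells = set()
    for center, r in balls:
        lo = np.floor((center - r) / l).astype(int)
        hi = np.floor((center + r) / l).astype(int)
        for cell in itertools.product(*(range(a, b + 1) for a, b in zip(lo, hi))):
            cells.add(cell)
    return cells                      # O(k) cells and O(k) time in fixed dimension

def uncovered_points(points, balls, l):
    """Points of P outside the union of the kept cells, in O(n + k) time."""
    cells = grid_cells_covering(balls, l)
    return np.array([p for p in points
                     if tuple(np.floor(p / l).astype(int)) not in cells])
```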
We now show how to transform a (large) constant-factor approximation to the optimal k-center clustering into a 2-approximation. The following is a simple extension of the algorithm of Feder and Greene [FG]; it is presented here because the implementation of Feder and Greene [FG] relies on a structure constructed in an earlier stage of their algorithm, and our version is independent of this structure.

Lemma 6.5. Let P be a set of n points in R^d, and let k, r be parameters such that ropt ≤ r ≤ c·ropt, where c ≥ 1 is a constant and ropt = ropt(P, k). Then one can compute a 2-approximation to the optimal k-center clustering of P in O(n + k log k) time.

Proof. Let G be a partition of R^d into a uniform grid with cells of side-length ∆ = r/(2c√d) ≥ ropt/(2c√d). Note that ∆ ≤ c·ropt/(2c√d) = ropt/(2√d). Compute for each point of P the cell of G containing it, and compute for each grid cell the points assigned to it (using hashing and bucketing), in overall linear time (note that we need the floor function to implement this efficiently). Since each ball of the optimal clustering intersects at most (2·ropt(P, k)/∆ + 1)^d = O((4c√d + 1)^d) = O(1) cells of the grid, it follows that two non-empty grid cells at distance at least 2r from each other cannot participate in the same cluster in the 2-approximation to the optimal clustering. We compute for each non-empty grid cell c of G (there are at most O(k) such non-empty cells) the O(1) non-empty grid cells at distance at most 2r from c; for each cell c, let N(c) denote this list of neighbors.

We now implement the algorithm of Gonzalez [G] on P. We remind the reader that the algorithm of Gonzalez [G] repeatedly picks the point furthest away from the current set of centers as the next center to be added, and updates for each point its distance to its closest center. In our setting, each time we pick a new center p, which lies in a grid cell c, we visit all the neighbors N(c) of c and update the distance to the closest center for all the points of P stored in those cells. (Note that we do not have to update cells that are further away from c, as they lie in different clusters in any 2-approximate clustering.) Furthermore, a cell can contain only a single center point, as its diameter is at most √d·∆ ≤ ropt/2 < ropt, and thus it can initiate such a distance update only once. In particular, the points stored in a cell c are scanned at most O(|N(c)|) = O(1) times. Furthermore, the algorithm maintains for each cell c the point of P inside it which is furthest away from the current set of centers. This representative point is the only point of the cell that the algorithm might pick as its next center (since the algorithm always picks the point furthest away from the current set of centers, the representative is the only relevant point inside the cell). Since each point is scanned a constant number of times, the overall running time associated with these distance updates is O(n).

The algorithm also needs to maintain a heap of the points of P that might serve as the next center. The heap is sorted by the distance of the points to their nearest center, and the top of the heap is the point that maximizes this quantity. Instead of maintaining a heap over n points, we maintain a heap over the O(k) representative points. Since each cell has a single such representative, and the algorithm performs O(k) insertion, weight-update, and delete-max operations on this heap, each heap operation takes O(log k) time. Thus, the overall running time of the algorithm is O(n + k log k).

Combining Lemmas 6.4 and 6.5, we establish the following result:

Theorem 6.6. Given a set P of n points, one can compute a 2-approximate k-center clustering of P in O(n) expected time, for k = O(n^{1/3}/log n).

Note that the algorithm of Theorem 6.6 is very simple. It boils down to repeated use of random sampling, hashing, and the (rather simple) clustering algorithm of Gonzalez [G] as a subroutine.

7. Conclusions

In this paper we proved that there is a small static clustering of any set of moving points and presented an efficient algorithm for computing this clustering. We also showed how to maintain the static clustering efficiently under insertions and deletions. No previous results of this type were known for moving points, and this presents, to our knowledge, the first efficient dynamic data structure for maintaining an (approximate) k-center clustering of a moving point set. We also described a simple technique for speeding up clustering algorithms. In a certain sense, our speedup technique implies that clustering is easy if one is willing to compromise on the number of clusters used. We used it to derive a linear time algorithm for 2-approximate k-center clustering in Euclidean space. We believe that the ability to do 2-approximate clustering in (expected) linear time, even under our strong computation model, is interesting and surprising. Our result can be interpreted as extending the techniques of Rabin [S], [GRSS], [R] for closest-pair computation to clustering.
We note that our results about the small coreset for k-center clustering imply that one can maintain such an approximate clustering on a stream of points, using O(poly(k, 1/ε, log n)) overall space (each insertion can be handled in similar time). This follows because any measure that has a small coreset can be streamed; see [AHV] for details. One open question for further research is to develop an algorithm with a tradeoff between clustering quality and I/O efficiency.

Acknowledgments

The author thanks Pankaj Agarwal, Boris Aronov, Mark de Berg, Otfried Cheong, Alon Efrat, Jeff Erickson, Piotr Indyk, Mark Overmars, Lenny Pitt, Magda Procopiuc, and Micha Sharir for helpful discussions and suggestions. Neither last nor least, the author thanks Miranda R. Callahan for her comments on the manuscript. Finally, the author thanks again the anonymous referees for their insightful and useful comments on the writeup.

References

[AAE] P. K. Agarwal, L. Arge, and J. Erickson. Indexing moving points. In Proc. 19th ACM Sympos. Principles of Database Systems, pages 175-186, 2000.
[ADPR] N. Alon, S. Dar, M. Parnas, and D. Ron. Testing of clustering. In Proc. 41st Annu. IEEE Sympos. Found. Comput. Sci., pages 240-250, 2000.
[AH] P. K. Agarwal and S. Har-Peled. Maintaining the approximate extent measures of moving points. In Proc. 12th ACM-SIAM Sympos. Discrete Algorithms, pages 148-157, 2001.
[AHV] P. K. Agarwal, S. Har-Peled, and K. R. Varadarajan. Approximating extent measures of points. J. Assoc. Comput. Mach., 2003, to appear.
[AP] P. K. Agarwal and C. M. Procopiuc. Approximation algorithms for projective clustering. In Proc. 11th ACM-SIAM Sympos. Discrete Algorithms, pages 538-547, 2000.
[BDIZ] J. Basch, H. Devarajan, P. Indyk, and L. Zhang. Probabilistic analysis for combinatorial functions of moving points. In Proc. 13th Annu. ACM Sympos. Comput. Geom., pages 442-444, 1997.
[BE] M. Bern and D. Eppstein. Approximation algorithms for geometric problems. In D. S. Hochbaum, editor, Approximation Algorithms for NP-Hard Problems, pages 296-345. PWS, Boston, MA, 1997.
[BGH] J. Basch, L. J. Guibas, and J. Hershberger. Data structures for mobile data. J. Algorithms, 31(1):1-28, 1999.
[BGSZ] J. Basch, L. J. Guibas, C. Silverstein, and L. Zhang. A practical evaluation of kinetic data structures. In Proc. 13th Annu. ACM Sympos. Comput. Geom., pages 388-390, 1997.
[BHI] M. Bădoiu, S. Har-Peled, and P. Indyk. Approximate clustering via core-sets. In Proc. 34th Annu. ACM Sympos. Theory Comput., pages 250-257, 2002.
[CCFM] M. Charikar, C. Chekuri, T. Feder, and R. Motwani. Incremental clustering and dynamic information retrieval. In Proc. 29th Annu. ACM Sympos. Theory Comput., pages 626-635, 1997.
[dBvKOS] M. de Berg, M. van Kreveld, M. H. Overmars, and O. Schwarzkopf. Computational Geometry: Algorithms and Applications, 2nd edition. Springer-Verlag, New York, 2000.
[DG] O. Devillers and M. Golin. Dog bites postman: point location in the moving Voronoi diagram and related problems. Internat. J. Comput. Geom. Appl., 8:321-342, 1998.
[DHS] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification, 2nd edition. Wiley-Interscience, New York, 2001.
[FG] T. Feder and D. H. Greene. Optimal algorithms for approximate clustering. In Proc. 20th Annu. ACM Sympos. Theory Comput., pages 434-444, 1988.
[G] T. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoret. Comput. Sci., 38:293-306, 1985.
[GGH+] J. Gao, L. Guibas, J. Hershberger, L. Zhang, and A. Zhu. Discrete mobile centers. In Proc. 17th Annu. ACM Sympos. Comput. Geom., pages 188-196, 2001.
[GRSS] M. Golin, R. Raman, C. Schwarz, and M. Smid. Simple randomized algorithms for closest pair problems. Nordic J. Comput., 2:3-27, 1995.
[HV1] S. Har-Peled and K. R. Varadarajan. Approximate shape fitting via linearization. In Proc. 42nd Annu. IEEE Sympos. Found. Comput. Sci., pages 66-73, 2001.
[HV2] S. Har-Peled and K. R. Varadarajan. Projective clustering in high dimensions using core-sets. In Proc. 18th Annu. ACM Sympos. Comput. Geom., pages 312-318, 2002.
[HW] D. Haussler and E. Welzl. ε-Nets and simplex range queries. Discrete Comput. Geom., 2:127-151, 1987.
[I] P. Indyk. Sublinear time algorithms for metric space problems. In Proc. 31st Annu. ACM Sympos. Theory Comput., pages 154-159, 1999.
[MOP] N. Mishra, D. Oblinger, and L. Pitt. Sublinear time approximate clustering. In Proc. 12th ACM-SIAM Sympos. Discrete Algorithms, pages 439-447, 2001.
[MT] N. Megiddo and A. Tamir. On the complexity of locating linear facilities in the plane. Oper. Res. Lett., 1:194-197, 1982.
[R] M. O. Rabin. Probabilistic algorithms. In J. F. Traub, editor, Algorithms and Complexity: New Directions and Recent Results, pages 21-39. Academic Press, New York, 1976.
[S] M. Smid. Closest-point problems in computational geometry. In J.-R. Sack and J. Urrutia, editors, Handbook of Computational Geometry, pages 877-935. Elsevier North-Holland, Amsterdam, 2000.
[SA] M. Sharir and P. K. Agarwal. Davenport-Schinzel Sequences and Their Geometric Applications. Cambridge University Press, New York, 1995.
[VC] V. N. Vapnik and A. Y. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Theory Probab. Appl., 16:264-280, 1971.



Sariel Har-Peled. Clustering Motion, Discrete & Computational Geometry, 2004, 545-565, DOI: 10.1007/s00454-004-2822-7