A Simple Sampling Lemma: Analysis and Applications in Geometric Optimization

Discrete & Computational Geometry, Apr 2001

Random sampling is an efficient method to deal with constrained optimization problems in computational geometry. In a first step, one finds the optimal solution subject to a random subset of the constraints; in many cases, the expected number of constraints still violated by that solution is then significantly smaller than the overall number of constraints that remain. This phenomenon can be exploited in several ways, and typically results in simple and asymptotically fast algorithms. Very often the analysis of random sampling in this context boils down to a simple identity (the sampling lemma) which holds in a general framework, yet has not been stated explicitly in the literature. In the more restricted but still general setting of LP-type problems, we prove tail estimates for the sampling lemma, giving Chernoff-type bounds for the number of constraints violated by the solution of a random subset. As an application, we provide the first theoretical analysis of multiple pricing, a heuristic used in the simplex method for linear programming in order to reduce a large problem to few small ones. This follows from our analysis of a reduction scheme for general LP-type problems, which can be considered as a simplification of an algorithm due to Clarkson. The simplified version needs less random resources and allows a Chernoff-type tail estimate.



B. Gärtner and E. Welzl
Institut für Theoretische Informatik, ETH Zürich, ETH Zentrum, CH-8092 Zürich, Switzerland

(The first author acknowledges support from the Swiss Science Foundation (SNF), Project No. 2150647.97. A preliminary version of this paper appeared in the Proceedings of the 16th Annual ACM Symposium on Computational Geometry (SCG), 2000, pp. 91-99.)

1. Introduction

Random sampling and randomized incremental construction have become well-established, by now even classical, design paradigms in the field of computational geometry; see [27]. Many algorithms following these paradigms have been simplified to a point where they can easily be taught in introductory computer science courses, with almost no technical difficulties. This was not always the case; the pioneering papers, notably the ones by Clarkson and Shor [6], [9], by Mulmuley [26], and by Guibas et al. [18], still required more technical derivations. This changed when Seidel popularized the backwards analysis paradigm for randomized algorithms [30]. Together with the abstract framework of configuration spaces, this technique allows us to treat many different algorithms in a simple and unified way [11].

The goal of this paper is to popularize and prove results around a simple identity (the sampling lemma) which underlies the analysis of randomized algorithms for many geometric optimization problems. By that we mean problems defined in a low-dimensional space, which usually implies that they have few constraints or few variables when written as mathematical programs. As we show below, special cases of the identity, or inequalities implied by it, are used in many places, including the analysis of the general configuration space framework. To the knowledge of the authors, the identity itself, however, has not been noticed explicitly.
The Sampling Lemma

Let $S$ be a set of size $n$ and let $\varphi$ be a function that maps any set $R \subseteq S$ to some value $\varphi(R)$. (Here, the only purpose of $\varphi$ is to partition $2^S$ into equivalence classes; the function notation will become clear later.) Define

$$V(R) := \{s \in S \setminus R \mid \varphi(R \cup \{s\}) \neq \varphi(R)\},$$
$$X(R) := \{s \in R \mid \varphi(R \setminus \{s\}) \neq \varphi(R)\}.$$

$V(R)$ is the set of violators of $R$, while $X(R)$ is the set of extreme elements in $R$. Obviously,

$$s \text{ violates } R \iff s \text{ is extreme in } R \cup \{s\}. \tag{1}$$

For a random sample $R$ of size $r$, i.e., a set $R$ chosen uniformly at random from the set $\binom{S}{r}$ of all $r$-element subsets of $S$, we define random variables $V_r: R \mapsto |V(R)|$ and $X_r: R \mapsto |X(R)|$, and we consider the expected values $v_r := E(V_r)$ and $x_r := E(X_r)$.

Lemma 1.1 (Sampling Lemma). For $0 \le r < n$,

$$\frac{v_r}{n-r} = \frac{x_{r+1}}{r+1}.$$

Proof. Using the definitions of $v_r$ and $x_{r+1}$ as well as (1), we can argue as follows:

$$\binom{n}{r} v_r = \sum_{R \in \binom{S}{r}} \sum_{s \in S \setminus R} [s \text{ violates } R] = \sum_{R \in \binom{S}{r}} \sum_{s \in S \setminus R} [s \text{ is extreme in } R \cup \{s\}] = \sum_{Q \in \binom{S}{r+1}} \sum_{s \in Q} [s \text{ is extreme in } Q] = \binom{n}{r+1} x_{r+1}.$$

Here, $[\cdot]$ is the indicator variable for the event in brackets. Finally, $\binom{n}{r+1} / \binom{n}{r} = (n-r)/(r+1)$.

To appreciate the simplicity (if not triviality) of the lemma, one should consider it as a special case of the following observation: given a bipartite graph, the average vertex degree in one color class times the size of that class equals the average vertex degree in the other color class times its size. In our case, the two color classes are the subsets of $S$ of sizes $r$ and $r+1$, respectively, and two sets $R$ and $R \cup \{s\}$ share an edge if and only if $s$ violates $R$ (equivalently, if $s$ is extreme in $R \cup \{s\}$). This means that the sampling lemma still holds if "violation" is individually defined for every pair $(R, s)$. A situation of quite similar flavor, where a simple bipartite graph underlies a probabilistic scenario, has been studied by Dubhashi and Ranjan [12].

We can also establish a version of the sampling lemma in the model of Bernoulli sampling, where $R$ is chosen by picking each element of $S$ independently with some fixed probability $p \in [0, 1]$ (we say $R$ is a random $p$-sample). Let $V(p)$ and $X(p)$ denote the random variables for the number of violators and extreme elements, respectively, in a $p$-sample, and let $v(p)$ and $x(p)$ be the corresponding expectations.

Lemma 1.2 ($p$-Sampling Lemma). For $0 \le p \le 1$,

$$p\, v(p) = (1-p)\, x(p).$$

Proof. Each $r$-element set $R$ occurs as a $p$-sample with probability $p^r (1-p)^{n-r}$. Using the Sampling Lemma 1.1 (in the form $\binom{n}{r} v_r = \binom{n}{r+1} x_{r+1}$) it follows that

$$p\, v(p) = p \sum_{r=0}^{n-1} \binom{n}{r+1} p^r (1-p)^{n-r} x_{r+1} = \sum_{r=1}^{n} \binom{n}{r} p^{r} (1-p)^{n-r+1} x_r = (1-p) \sum_{r=0}^{n} \binom{n}{r} p^r (1-p)^{n-r} x_r = (1-p)\, x(p).$$

In the next section we discuss some well-known results obtained by random sampling and show that all of them easily follow from the sampling (respectively, $p$-sampling) lemma. Concentrating on the Sampling Lemma 1.1, we elaborate on its connection to configuration spaces and backwards analysis. Section 3 deals with LP-type problems, which can be considered as functions $\varphi$ with specific properties. Section 4 establishes Chernoff-type tail estimates for the random variable $V_r$, i.e., for the number of violators of a random sample. The sampling lemma and the tail estimates are finally used in Section 5 to analyze an algorithm for general LP-type problems, which can be considered as the "practical" version of Clarkson's reduction scheme [16]. Its specialization to linear programming is a variant of multiple pricing [5].
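The following minimal sketch (ours, not from the paper) verifies the identity of Lemma 1.1 exhaustively for the choice $\varphi(R) = \min(R)$, where the violators of $R$ are exactly the elements smaller than $\min(R)$:

    from itertools import combinations
    from math import comb

    def check_sampling_lemma(S):
        n = len(S)
        phi = lambda R: min(R) if R else None   # phi of the empty set is a separate value
        for r in range(n):
            # v_r: average number of violators over all r-element subsets
            v = sum(sum(1 for s in S if s not in R and phi(set(R) | {s}) != phi(set(R)))
                    for R in combinations(S, r)) / comb(n, r)
            # x_{r+1}: average number of extreme elements over all (r+1)-element subsets
            x = sum(sum(1 for s in Q if phi(set(Q) - {s}) != phi(set(Q)))
                    for Q in combinations(S, r + 1)) / comb(n, r + 1)
            assert abs(v / (n - r) - x / (r + 1)) < 1e-9   # Lemma 1.1, up to rounding

    check_sampling_lemma([3, 1, 4, 15, 9, 2, 6, 8])

The same harness works for any choice of phi, since the lemma places no restriction on the function beyond partitioning the subsets into classes.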
2. Incarnations of the Sampling Lemmata

Searching in a Sorted Compact List

A sorted compact list represents a set $S$ of $n$ ordered keys in an array, where the order among the keys is established by additional pointers linking each element to its predecessor in the order; see Fig. 1. It is well known that the smallest key in a sorted compact list can be found in $O(\sqrt{n})$ expected time [10, Problem 11-3]. For this, one draws a random sample $R$ of $r = \Theta(\sqrt{n})$ keys, finds the smallest key $s_0$ in the sample, and finally follows the links from $s_0$ to the overall smallest key. The efficiency comes from the fact that an expected number of only $\Theta(\sqrt{n})$ keys are still smaller than $s_0$. In general, setting $\varphi(R) = \min(R)$ and observing that $X_{r+1} \equiv 1$, the sampling lemma yields

$$E(\#\{s \in S \setminus R \mid s < \min(R)\}) = \frac{n-r}{r+1}. \tag{2}$$

Note that $s < \min(R)$ is equivalent to $\min(R \cup \{s\}) \neq \min(R)$.

[Fig. 1: a sorted compact list storing the keys 8, 5, 4, 2, 1, 6, 3, 7 in array order, with pointers establishing the sorted order.]

Property (2) was exploited by Seidel in the following observation: given a simple $d$-polytope $P$ with $n$ vertices, specified by its 1-skeleton (the graph of vertices and edges of $P$), one can find the vertex that minimizes some linear function $f$ in expected time $O(d\sqrt{n})$. The corresponding randomized subroutine serves as a building block of a simple algorithm for computing the intersection of halfspaces, or, dually, the convex hull of points in $d$-dimensional space. For $d \ge 4$, this algorithm achieves optimal expected worst-case performance [31].

Smallest Enclosing Ball

Consider the problem of computing the smallest enclosing ball of a set $S$ of $n$ points in $d$-dimensional space, for some fixed $d$. Randomized incremental algorithms do this in expected $O(n)$ time [33], based on the following fact: if the points are added in random order, the probability that the $n$th point lies outside the smallest enclosing ball of the first $n-1$ points is bounded by $(d+1)/n$. In general, it holds that if $R \subseteq S$ is a random sample of $r$ points, and $\mathrm{ball}(R)$ denotes the smallest enclosing ball of $R$, then

$$E(\#\{p \in S \setminus R \mid p \notin \mathrm{ball}(R)\}) \le (d+1)\,\frac{n-r}{r+1}. \tag{3}$$

Again, this follows from the sampling lemma, with $\varphi(R) = \mathrm{ball}(R)$, together with the observation that any set $R$ has at most $d+1$ extreme elements [33], and the fact that $s \notin \mathrm{ball}(R) \iff \mathrm{ball}(R \cup \{s\}) \neq \mathrm{ball}(R)$. Similar results hold for the smallest enclosing ellipsoid problem. The randomized incremental algorithm based on them was the first one to achieve an expected runtime of $O(n)$ for that problem; see [33]. The pioneering applications of randomized incremental construction along these lines were Clarkson's and Seidel's linear-time algorithms for linear programming with a fixed number $d$ of variables [8], [29].
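As an illustration of the $O(\sqrt{n})$ search, here is a minimal sketch (ours; the names arr and pred are not from the paper, and we represent the predecessor pointers as an index array with $-1$ marking the minimum):

    import math, random

    def find_min(arr, pred):
        """Expected O(sqrt(n)) minimum search in a sorted compact list."""
        n = len(arr)
        r = max(1, math.isqrt(n))                  # sample size Theta(sqrt(n))
        i = min(random.sample(range(n), r), key=lambda j: arr[j])
        while pred[i] != -1:                       # expected (n-r)/(r+1) link steps, by (2)
            i = pred[i]
        return arr[i]

    keys = random.sample(range(10 ** 6), 1000)
    order = sorted(range(len(keys)), key=lambda j: keys[j])
    pred = [-1] * len(keys)
    for a, b in zip(order, order[1:]):             # the predecessor of b in sorted order is a
        pred[b] = a
    assert find_min(keys, pred) == min(keys)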
Planar Convex Hull

For a planar point set $S$, $|S| = n$, the randomized incremental construction adds the points in random order, always maintaining the convex hull of the points added so far. When a point $p$ is added, it has to "locate" itself, i.e., it has to know whether it lies outside the current convex hull, and in this case identify some hull edge $e$ visible from $p$. As it turns out, the amortized expected cost of doing this in the $r$th step (after which the points added so far form a random sample $R$ of size $r$) is proportional to $a_r / r$, where

$$a_r := E(\#\{p \in S \setminus R \mid p \notin \mathrm{conv}(R)\}).$$

The "trick" now is to express this in terms of another quantity:

$$b_r := E(\#\{p \in R \mid p \text{ is a vertex of } \mathrm{conv}(R)\}).$$

The sampling lemma with $\varphi(R) = \mathrm{conv}(R)$ then shows that

$$a_r = b_{r+1}\,\frac{n-r}{r+1}. \tag{4}$$

For this, we need the observation that $p \notin \mathrm{conv}(R)$ is equivalent to $\mathrm{conv}(R \cup \{p\}) \neq \mathrm{conv}(R)$, which in turn means that $p$ is a vertex of $\mathrm{conv}(R \cup \{p\})$. The expected overall location cost (which dominates the runtime) is then proportional to

$$\sum_{r=1}^{n} \frac{a_r}{r} \le n \sum_{r=1}^{n} \frac{b_{r+1}}{r(r+1)}.$$

Because $b_{r+1} \le r+1$, this gives an $O(n \log n)$ algorithm. However, the bound is much better in some cases. For example, if the input points are chosen randomly from the unit square (unit disk, respectively), we get $b_r = O(\log r)$ ($b_r = O(\sqrt[3]{r})$, respectively) [28], [20]. In both cases the algorithm actually runs in linear time. In higher dimensions, an analysis along these lines is available, but requires substantial refinements [9], [30].

Minimum Spanning Forests

Let $G = (V, E)$ be an edge-weighted graph, $|V| = n$. For $D \subseteq E$, let $\mathrm{msf}(D)$ denote the minimum spanning forest of the graph $(V, D)$ (which we assume to be unique for all $D$). An edge $e \in E$ is called $D$-light if it either connects two components of $\mathrm{msf}(D)$ or has smaller weight than some edge on the unique path in $\mathrm{msf}(D)$ between its two vertices. The expected linear-time algorithm for computing $\mathrm{msf}(E)$ due to Karger et al. [21], [25] relies (among other insights) on the following fact: if $D$ is a random $p$-sample, the expected number of $D$-light edges is bounded by $n/p$. Using the $p$-Sampling Lemma 1.2, this fact is easily derived. Namely, it is a simple observation that $e$ is $D$-light if and only if $\mathrm{msf}(D) \neq \mathrm{msf}(D \cup \{e\})$. With $\varphi(D) = \mathrm{msf}(D)$, this means that the set of $D$-light edges is exactly the set of violators of $D$. By the $p$-sampling lemma, if $D$ is a random $p$-sample, their expected number is given by

$$v(p) = \frac{1-p}{p}\, x(p) \le \frac{x(p)}{p}.$$

It remains to observe that $x(p) \le n-1$, because $X(D)$ contains exactly the edges in $\mathrm{msf}(D)$, for all $D$. Along these lines, Chan has proved a bound for the expected number of $D$-light edges in the case where $D$ is a random sample of size $r$ [4]. His argument uses backwards analysis and boils down to a proof of the Sampling Lemma 1.1 in this specific scenario.

Backwards Analysis and Configuration Spaces

The Sampling Lemma 1.1 in its full generality can easily be proved using backwards analysis, and as indicated in the previous subsection, this is usually the way its specializations are derived in the applications. For this, one considers the randomized incremental "construction" of $\varphi(S)$, via adding the elements of $S$ in random order, and analyzes the situation in step $r+1$ [30]. There is also a connection to configuration spaces. In general, such a space consists of an abstract set of configurations over some set $S$, where each configuration $\Delta$ has a defining set $D(\Delta) \subseteq S$ and a conflict set $K(\Delta) \subseteq S$. $\Delta$ is active with respect to $R \subseteq S$ if and only if $D(\Delta) \subseteq R$ and $K(\Delta) \subseteq S \setminus R$. The goal is to compute the configurations active with respect to $S$, by adding the elements in random order, always maintaining the active configurations of the current subset.
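A quick Monte Carlo experiment (ours, not from the paper) makes identity (4) tangible: for random planar points it estimates $a_r$ and $b_{r+1}$ independently and compares them. The test $p \notin \mathrm{conv}(R)$ uses the equivalence with $p$ being a vertex of $\mathrm{conv}(R \cup \{p\})$:

    import random

    def hull(pts):
        """Andrew's monotone chain; returns the hull vertices of a planar point set."""
        pts = sorted(set(pts))
        if len(pts) <= 2:
            return pts
        def chain(points):                     # one hull chain, strictly convex turns only
            h = []
            for p in points:
                while len(h) >= 2 and ((h[-1][0] - h[-2][0]) * (p[1] - h[-2][1])
                                       - (h[-1][1] - h[-2][1]) * (p[0] - h[-2][0])) <= 0:
                    h.pop()
                h.append(p)
            return h[:-1]
        return chain(pts) + chain(pts[::-1])   # lower hull plus upper hull

    random.seed(0)
    n, r, trials = 100, 10, 2000
    S = [(random.random(), random.random()) for _ in range(n)]
    a = sum(sum(1 for p in S if p not in R and p in hull(R + [p]))
            for R in (random.sample(S, r) for _ in range(trials))) / trials
    b = sum(len(hull(random.sample(S, r + 1))) for _ in range(trials)) / trials
    print(a, b * (n - r) / (r + 1))            # the two estimates agree up to sampling error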
The abstract framework provides bounds for the expected overall structural change (the number of configurations ever becoming active) during that construction [9], [27], [11]. In our case, every subset $R$ has exactly one active configuration $\Delta = \varphi(R)$ associated with it, where $D(\Delta) = X(R)$ and $K(\Delta) = V(R)$. (Some care is in order here; in degenerate situations, $R$ can define several configurations $\Delta$ with different sets $D(\Delta)$, in which case $X(R)$ is the intersection of all those sets.) In this case the sampling lemma provides a bound for the expected structural change $v_r/(n-r)$ that occurs in step $r+1$. For example, it specializes to Theorem 9.14 of [11] if $x_{r+1}$ is bounded by a constant $d$.

In the following we are interested not only in the expectation but also in the distribution of the random variable $V_r$, something the configuration space framework does not handle. For this, we concentrate on the case in which $(S, \varphi)$ has the structure of an LP-type problem. This situation covers many important optimization problems, including linear programming and all motivating examples discussed above.

3. LP-Type Problems

If $\varphi$ maps subsets to some ordered set $O$, we can consider functions $\varphi$ that are monotone, i.e., $\varphi(F) \le \varphi(G)$ for $F \subseteq G$. In this situation, we can regard a pair $(S, \varphi)$ as an optimization problem over $O$, as follows: $S$ is an abstract set of constraints, and for any $R \subseteq S$, $\varphi(R)$ represents the minimum value in $O$ subject to the constraints in $R$. The examples above are all of this type, if we define appropriate orderings on the $\varphi$-values. For $\varphi(R) = \min(R)$ in the case of keys, we simply take the decreasing order on the keys. For $S$ a point set and $\varphi(R) = \mathrm{ball}(R)$, we can order the balls according to their radii, while for $\varphi(R) = \mathrm{conv}(R)$, we may use the area of $\mathrm{conv}(R)$.

Moreover, in all these examples, $\varphi$ has another special property, which we refer to as locality. We say that $\varphi$ is local if $R \subseteq Q$ and $\varphi(R) = \varphi(Q)$ implies $V(R) = V(Q)$, for all $R, Q \subseteq S$. An example of a nonlocal problem is the diameter: for a set $S$ of points and $R \subseteq S$, we define $\varphi(R)$ to be the Euclidean diameter of $R$. In Fig. 2 we have $\varphi(R) = \varphi(Q)$ for $R = \{q, s\}$ and $Q = \{p, q, s\}$, but $\emptyset = V(R) \neq V(Q) = \{r\}$.

[Fig. 2: four points $p, q, r, s$ illustrating the nonlocality of the diameter.]

Still, locality is present in many problems of practical relevance, the most prominent one being linear programming (LP). In a geometric formulation of linear programming, $S$ is a set of halfspaces in $d$-dimensional space, and $\varphi(R)$ is the lexicographically smallest point among all the ones that minimize some fixed linear function over the intersection of all halfspaces in $R$. If that intersection is empty, we set $\varphi(R) = \infty$, with the understanding that this value dominates all other values. If the function is unbounded over the intersection, we set $\varphi(R) = \bot$, standing for "undefined." Linear programming is also the motivating example for the following definition [32].

Definition 3.1. Let $S$ be a finite set, $O$ some ordered set, and $\varphi: 2^S \to O \cup \{\bot\}$ a function, where $\bot$ is assumed to be the minimum value in $O \cup \{\bot\}$. The pair $(S, \varphi)$ is called an LP-type problem if $\varphi$ is monotone and local, i.e., if for all $R \subseteq Q \subseteq S$ with $\varphi(R) \neq \bot$,
(i) $\varphi(R) \le \varphi(Q)$, and
(ii) $\varphi(R) = \varphi(Q)$ implies $V(R) = V(Q)$.

The concept of LP-type problems has proved useful in the understanding of geometric optimization; see, for example, [2].
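The nonlocality of the diameter is easy to reproduce; the following sketch uses coordinates of our own choosing (the paper's Fig. 2 is not reproduced here), placed so that $\{q, s\}$ realizes the diameter of $\{p, q, s\}$ while $p$ and $r$ together realize a larger one:

    from itertools import combinations
    from math import dist

    p, q, r, s = (-1.5, 0.0), (0.0, 1.0), (1.5, 0.0), (0.0, -1.0)
    S = {p, q, r, s}

    def diam(R):                                   # phi(R): Euclidean diameter of R
        return max((dist(a, b) for a, b in combinations(R, 2)), default=0.0)

    def violators(R):
        return {x for x in S - R if diam(R | {x}) != diam(R)}

    R, Q = {q, s}, {p, q, s}
    print(diam(R) == diam(Q))                      # True: both diameters equal 2
    print(violators(R), violators(Q))              # set() versus {(1.5, 0.0)}: not local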
For many problems (including linear programming and smallest enclosing ball), the currently best theoretical runtime bounds in the unit-cost model can be obtained by an algorithm that works for general LP-type problems [16], [23]. We recall the following further notions only briefly and refer to the above literature for details.

Definition 3.2. Let $L = (S, \varphi)$ be an LP-type problem.
(i) A basis of $R \subseteq S$ is an inclusion-minimal subset $B \subseteq R$ with $\varphi(B) = \varphi(R)$. A basis in $L$ is a basis of some set $R \subseteq S$. A basis in $R$ is a basis in $L$ contained in $R$.
(ii) The combinatorial dimension of $L$, denoted by $\delta = \delta(L)$, is the size of a largest basis in $L$.
(iii) $L$ is regular if all bases of sets $R$, $|R| \ge \delta$ (the regular bases), have size exactly $\delta$.
(iv) $L$ is nondegenerate if every set $R$, $|R| \ge \delta$, has a unique basis $B(R)$.

The following implications can easily be derived.

Fact 3.3. Let $L = (S, \varphi)$ be an LP-type problem and $R \subseteq S$ with $\varphi(R) \neq \bot$. Then
(i) $\varphi(R) = \varphi(S \setminus V(R))$, and
(ii) the set $X(R)$ of extreme elements of $R$ is the intersection of all bases of $R$.

If $L$ has combinatorial dimension $\delta$, it follows that $|X(R)| \le \delta$ for all $R$, so that the sampling lemma yields

$$v_r \le \delta\, \frac{n-r}{r+1}.$$

In particular, a random sample of size $r \approx \sqrt{\delta n}$ has no more than $r$ violators on average, and this is the "balancing" that will prove useful below. In the next section we derive bounds for regular, nondegenerate LP-type problems that apply to the general case only in a weaker form. While regularity can be enforced in the nondegenerate case (we describe a well-behaved "regularizing" construction below), nondegeneracy is a more subtle issue. It is not known how to make a general LP-type problem nondegenerate without substantially changing its structure [22]. For most geometric LP-type problems, however, a slight perturbation of the input will entail a nondegenerate problem, essentially equivalent to the original one. Most notably, this is the case for linear programming.

Enforcing Regularity

Given a nondegenerate LP-type problem $(S, \varphi)$ of combinatorial dimension $\delta$, the idea is to make it regular by "pumping up" bases which are too small. For this, we define an arbitrary linear order on $S$, and consider the function

$$\varphi'(R) := (\varphi(R), E(R)),$$

where $E(R)$ is the vector of the $m$ largest elements in $R \setminus B(R)$, for $m = \min(\delta, |R|) - |B(R)|$. $\varphi'$-values are compared lexicographically, i.e., by the $\varphi$-component first. If the $\varphi$-values are equal, the lexicographic order of the $E$-components (well defined with respect to the chosen order on $S$) decides the comparison. $\varphi'$ can be considered as a "refinement" of $\varphi$.

Lemma 3.4 [22]. If $L = (S, \varphi)$ is nondegenerate, then $(S, \varphi')$ is a regular, nondegenerate LP-type problem of combinatorial dimension $\delta(L)$.

Moreover, if $V(R)$ and $V'(R)$ denote the violating sets of $R \subseteq S$ with respect to $\varphi$ and $\varphi'$, we have the following simple but important fact:

$$V(R) \subseteq V'(R). \tag{5}$$

This holds because $\varphi(R \cup \{s\}) > \varphi(R)$ implies $\varphi'(R \cup \{s\}) > \varphi'(R)$. It follows that when we develop tail estimates for the size of $V'(R)$ (more generally, for any regular and nondegenerate LP-type problem), those estimates then also apply to nonregular problems.

4. Tail Estimates

In the following we consider regular and nondegenerate LP-type problems $(S, \varphi)$ with $|S| = n$ and $\delta(S, \varphi) = d$, where we assume $n$ and $d$ to be fixed for the rest of this section.
For given parameters $r \ge d$ and $k$, we want to bound

$$\mathrm{prob}(V_r \ge k).$$

The most important observation is that this quantity does not depend on the LP-type problem, but is merely a function of the parameters $n, d, r$, and $k$. This follows from a result first proved by Clarkson [7] in the context of linear programming, and later generalized to LP-type problems by Matoušek [22]. We rederive the statement here.

Theorem 4.1. Let $(S, \varphi)$ be a regular, nondegenerate LP-type problem with $|S| = n$ and $\delta(S, \varphi) = d$. Then

$$\mathrm{prob}(V_r = k) = \frac{\binom{k+d-1}{d-1}\binom{n-d-k}{r-d}}{\binom{n}{r}}.$$

Proof. A basis $B$ is the basis of a set $R$ if and only if $B \subseteq R \subseteq S \setminus V(B)$. This means that for any regular basis $B$ with $k$ violators, there are $\binom{n-d-k}{r-d}$ sets $R$ of size $r$ which have $B$ as their (unique) basis. It follows that

$$\mathrm{prob}(V_r = k) = b_k\, \frac{\binom{n-d-k}{r-d}}{\binom{n}{r}}, \qquad r = d, \ldots, n,$$

where $b_k$ is the number of regular bases with $k$ violators in $(S, \varphi)$. By summing over all $k$, we get

$$\binom{n}{r} = \sum_{k=0}^{n-d} b_k \binom{n-d-k}{r-d}, \qquad r = d, \ldots, n. \tag{6}$$

This system of linear equations can be written in the form

$$\left(\binom{n}{d}, \binom{n}{d+1}, \ldots, \binom{n}{n}\right) = (b_{n-d}, b_{n-d-1}, \ldots, b_0)\, T,$$

where $T$ is an upper-triangular matrix with all diagonal entries equal to 1, therefore invertible. This means the $b_k$ are uniquely determined by the system (6), from which

$$b_k = \binom{k+d-1}{d-1}$$

follows via a standard binomial coefficient identity [17, equation (5.26)]. This proves the statement of the theorem.

This result leads to an explicit formula for $\mathrm{prob}(V_r \ge k)$, but useful tail estimates do not yet follow from it. By severe grinding it might be possible to extract good bounds directly from the formula (we did not succeed), but there is another approach: as we know that the quantity in question does not depend on the particular LP-type problem, we might as well use our favorite LP-type problem in the analysis. In fact, for any given parameters $n$ and $d$, there is a "canonical" LP-type problem from which statements about the distribution of $V_r$ can be extracted without pain.

The $d$-Smallest Number Problem

Let $N$ be the set $\{1, \ldots, n\}$. For $R \subseteq N$, define $\mathrm{min}_d(R)$ as the $d$-smallest number in $R$ (equivalently, the element of rank $d$ in $R$). If $|R| < d$, this is undefined, and $\mathrm{min}_d(R) := \bot$. We have the following easy facts (proofs omitted).

Lemma 4.2.
(i) $(N, \varphi)$ with $\varphi(R) := \mathrm{min}_d(R)$ is a regular, nondegenerate LP-type problem of combinatorial dimension $d$, if $\varphi$-values are compared according to decreasing order in $N$.
(ii) The basis of any set $R$, $|R| \ge d$, consists of the $d$ smallest numbers in $R$.
(iii) $s \in N \setminus R$ violates $R$ if and only if $s$ is smaller than the $d$-smallest number in $R$.

For $d = 1$ we have $\mathrm{min}_d(R) = \min(R)$, and thus we recover the LP-type problem underlying the efficient minimum search in a sorted compact list described above. As a warm-up exercise, we rederive the formula for the number of bases with exactly $k$ violators in a regular and nondegenerate LP-type problem, by using the fact that this number does not depend on the actual LP-type problem; see Theorem 4.1.

Observation 4.3. The $d$-smallest number problem has

$$b_k = \binom{k+d-1}{d-1}$$

regular bases with exactly $k$ violators.

Proof. Any set $B$ with $d$ elements is a regular basis. $B$ has $k$ violators if and only if the $d$-smallest number $x$ in $B$ is the $(k+d)$-smallest number in $N$. The elements in $B \setminus \{x\}$ can be any $d-1$ among the $k+d-1$ smaller numbers in $N$.

The proof of this observation might be somewhat simpler than the one we had in the general case, but it does not lead to new insights.
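On the $d$-smallest number problem the distribution of $V_r$ is easy to enumerate exactly, since a sample $R$ has $\mathrm{min}_d(R) - d$ violators; the following sketch (ours) confirms the formula of Theorem 4.1 for small parameters:

    from itertools import combinations
    from math import comb
    from collections import Counter

    n, d, r = 12, 3, 6
    counts = Counter(sorted(R)[d - 1] - d                  # V_r = min_d(R) - d
                     for R in combinations(range(1, n + 1), r))
    for k in range(n - r + 1):
        formula = comb(k + d - 1, d - 1) * comb(n - d - k, r - d) / comb(n, r)
        assert abs(counts[k] / comb(n, r) - formula) < 1e-12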
However, the next theorem, about higher moments of $V_r$, is an example of a statement which we think is not immediate to prove (let alone discover) without making use of the $d$-smallest number problem.

Theorem 4.4. Let $(S, \varphi)$ be a regular, nondegenerate LP-type problem, and let $R$ be a random sample of size $r$. For $j \in \{0, \ldots, n-r\}$, we have

$$E\!\left(\binom{V_r}{j}\right) = \frac{\binom{n}{r+j}\binom{j+d-1}{j}}{\binom{n}{r}}.$$

Proof. We evaluate the expectation for the $d$-smallest number problem and then use Theorem 4.1. For this, we need to count the expected number of sets $J$, $|J| = j$, with $J \subseteq V(R)$. Observe that this inclusion holds if and only if all elements of $J$ are smaller than the $d$-smallest number in $R$, equivalently, if $J$ is among the $j+d-1$ smallest numbers in $R \cup J$. For any set $L$ of size $r+j$, there are $\binom{j+d-1}{j}$ pairs $(R, J)$ with $R \cup J = L$ and this property. Thus we get

$$\binom{n}{r}\, E\!\left(\binom{V_r}{j}\right) = \sum_{|R|=r}\; \sum_{\substack{J \subseteq S \setminus R \\ |J|=j}} [J \subseteq V(R)] = \sum_{|L|=r+j} \binom{j+d-1}{j} = \binom{n}{r+j}\binom{j+d-1}{j}.$$

When applied to $j = 2$, the theorem can be used to compute the variance of $V_r$, leading to a Chebyshev-type tail estimate. The higher moments give still better bounds. We are going for Chernoff-type bounds, by exploiting the special structure of the $d$-smallest number problem.

A Chernoff-Type Tail Estimate

To choose a random subset $R \subseteq N$ of size $r$, one can proceed in $r$ rounds, where round $i$ selects an element $s_i$ uniformly at random among the ones not chosen so far. Equivalently, one may choose a "rank" $\ell_i$ uniformly at random in $\{1, \ldots, n+1-i\}$ and let $s_i$ be the element of rank $\ell_i$ among the ones not chosen so far. Fix some positive integer $k$ and let $U_k$ be the random variable for the number of indices $i$ with $\ell_i \le k$. We have the following relation to the random variable $V_r$.

Lemma 4.5. Let $R = R(\ell)$ denote the set determined by $\ell = (\ell_1, \ldots, \ell_r)$. Then

$$U_k(\ell) \ge d \implies V_r(R) \le k-1.$$

Proof. We claim that $U_k \ge d$ implies $\mathrm{min}_d(R) \le k+d-1$. Because the latter is equivalent to $V_r \le k-1$, the lemma follows. To prove the claim, we first note that

$$s_i = \ell_i + \#\{j < i \mid s_j < s_i\}. \tag{7}$$

Consider some set $I$ of $d$ indices $i$ such that $\ell_i \le k$ for $i \in I$. Such a set exists if $U_k \ge d$. If $s_i \le k+d-1$ for all $i \in I$, we get $\mathrm{min}_d(R) \le k+d-1$, as required. Otherwise, there is some $i \in I$ such that $s_i = k+e$, $e \ge d$. Then we get

$$\#\{j < i \mid s_j < k+e\} = k+e-\ell_i \ge e,$$

which implies $\#\{j < i \mid s_j < k+d\} \ge d$. As before, this means that $\mathrm{min}_d(R) \le k+d-1$.

Corollary 4.6. $\mathrm{prob}(V_r \ge k) \le \mathrm{prob}(U_k \le d-1)$.

Chernoff-type bounds for $U_k$ are easy to obtain now. $U_k$ can be expressed as the sum of independent random variables $U_{k,i}$, $i = 1, \ldots, r$, where

$$U_{k,i} := [\ell_i \le k],$$

and it holds that

$$\mathrm{prob}(U_{k,i} = 1) = \frac{k}{n+1-i} =: p_i.$$

The following is one of the basic Chernoff bounds [19].

Lemma 4.7. With $E(U_k) = p_1 + \cdots + p_r$ and $t \ge 0$,

$$\mathrm{prob}(U_k \le E(U_k) - t) \le \exp\!\left(-\frac{t^2}{2\,E(U_k)}\right).$$

Using $t = E(U_k) - d + 1$ (which is nonnegative for the values of $k$ we will be interested in below), we obtain

$$\mathrm{prob}(U_k \le d-1) \le \exp\!\left(-\frac{(E(U_k)-d+1)^2}{2\,E(U_k)}\right).$$

Fix some value $\lambda \ge 0$ and choose $k$ in such a way that $E(U_k) = (1+\lambda)d$. Then we get

$$\mathrm{prob}(U_k \le d-1) \le \exp\!\left(-\frac{(\lambda d+1)^2}{2(1+\lambda)d}\right) \le \exp\!\left(-\frac{\lambda^2}{2(1+\lambda)}\, d\right).$$

The value of $k$ that entails $E(U_k) = (1+\lambda)d$ satisfies

$$k = \frac{(1+\lambda)d}{\sum_{i=0}^{r-1} 1/(n-i)} \le (1+\lambda)d\,\frac{n}{r},$$

and we obtain our result.

Theorem 4.8. Let $L = (S, \varphi)$ be a nondegenerate LP-type problem with $|S| = n$ and $\delta(S, \varphi) = d$. For $r \ge d$ and any $\lambda \ge 0$,

$$\mathrm{prob}\!\left(V_r \ge (1+\lambda)d\,\frac{n}{r}\right) \le \exp\!\left(-\frac{\lambda^2}{2(1+\lambda)}\, d\right).$$
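A small simulation (ours, not from the paper) on the $d$-smallest number problem illustrates Theorem 4.8; the empirical tail probability is, as expected, well below the bound:

    import heapq, math, random

    n, d, r, lam, trials = 2000, 10, 200, 1.0, 20000
    threshold = (1 + lam) * d * n / r
    hits = 0
    for _ in range(trials):
        R = random.sample(range(1, n + 1), r)
        v = heapq.nsmallest(d, R)[-1] - d          # V_r = min_d(R) - d
        hits += (v >= threshold)
    print(hits / trials, "<=", math.exp(-lam ** 2 / (2 * (1 + lam)) * d))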
We have derived this bound only for regular problems, but as we have shown before, any problem can be regularized, and, by (5), the estimate then also holds for nonregular problems. Because $E(V_r) \le d(n-r)/(r+1) \approx dn/r$, this bound establishes estimates for the tail "to the right" of the expectation. It might seem that the bound is rather weak, in particular because it does not depend on $n$ and $r$. However, it is essentially best possible, as the following lower bound shows (the actual formulation has been chosen in order to minimize computational effort).

Theorem 4.9. Let $L = (S, \varphi)$ be a nondegenerate LP-type problem with $|S| = n$ and $\delta(S, \varphi) = d$. For $r \ge d$ and any $\lambda \ge 0$ such that $(1+\lambda)d \le r/2$,

$$\mathrm{prob}\!\left(V_r > (1+\lambda)d\,\frac{n+1-r}{r} - d\right) \ge \exp\!\left(-(1+\lambda)d - \frac{((1+\lambda)d)^2}{r}\right).$$

Proof. With $U_k$ as defined above and $R = R(\ell)$, relation (7) immediately entails

$$V_r(R) \le k-d \iff \mathrm{min}_d(R) \le k \implies U_k(\ell) \ge d,$$

so that we get $\mathrm{prob}(V_r > k-d) \ge \mathrm{prob}(U_k \le d-1)$. Furthermore,

$$\mathrm{prob}(U_k \le d-1) \ge \mathrm{prob}(U_k = 0) = \prod_{i=1}^{r}\left(1 - \frac{k}{n+1-i}\right).$$

With $k = (1+\lambda)d(n+1-r)/r$, it follows that

$$\prod_{i=1}^{r}\left(1 - \frac{k}{n+1-i}\right) \ge \left(1 - \frac{(1+\lambda)d}{r}\right)^r \ge \exp\!\left(-(1+\lambda)d - \frac{((1+\lambda)d)^2}{r}\right),$$

using $k/(n+1-i) \le k/(n+1-r) = (1+\lambda)d/r \le 1/2$ and the inequality $1-x \ge e^{-x-x^2}$ for $0 \le x \le 1/2$.

An open question is whether the statement of Theorem 4.8 also holds in the degenerate case. It is tempting to conjecture that $\mathrm{prob}(V_r \ge k)$ is maximized for nondegenerate problems; this would yield Theorem 4.8 for the general case. Moreover, while the bound is tight in the regular case, one might be able to improve it for a given nonregular problem. We conclude this section by proving a weaker tail estimate which applies to the general case. Using this, we can show that the number of violators exceeds the expected value by no more than a logarithmic factor, with high probability.

Theorem 4.10. Let $L = (S, \varphi)$ be an LP-type problem with $|S| = n$ and $\delta(S, \varphi) = d$. For $r \ge d$ and any $\lambda \ge 0$,

$$\mathrm{prob}\!\left(V_r \ge \left(\ln\frac{ne}{d} + \lambda\right) d\,\frac{n}{r}\right) \le \exp(-\lambda d).$$

Proof. Let $\mathcal{B}_k$ denote the set of regular bases with exactly $k$ violators (recall that a regular basis is a basis of some set $R$ with $|R| \ge d$). Any fixed $B \in \mathcal{B}_k$ is a basis of all the sets $R$ satisfying $B \subseteq R \subseteq S \setminus V(B)$. It follows that $B$ is a basis of a random sample $R$ of size $r$ with probability

$$\frac{\binom{n-|B|-k}{r-|B|}}{\binom{n}{r}} \le \frac{\binom{n-k}{r}}{\binom{n}{r}}.$$

We have $|V(R)| = k$ if and only if $R$ has some basis (equivalently, all its bases) in $\mathcal{B}_k$, which gives

$$\mathrm{prob}(V_r = k) \le b_k\,\frac{\binom{n-k}{r}}{\binom{n}{r}}, \qquad b_k = |\mathcal{B}_k|.$$

Consequently,

$$\mathrm{prob}(V_r \ge k) \le \sum_{\ell=k}^{n-r} b_\ell\,\frac{\binom{n-\ell}{r}}{\binom{n}{r}},$$

where we know that

$$\sum_{\ell=k}^{n-r} b_\ell \le \binom{n}{\le d} := \sum_{i=0}^{d} \binom{n}{i},$$

because all bases have size at most $d$. Since (see [24])

$$\binom{n}{\le d} \le \left(\frac{ne}{d}\right)^d$$

and

$$\frac{\binom{n-\ell}{r}}{\binom{n}{r}} \le \left(1 - \frac{\ell}{n}\right)^r \le \exp\!\left(-\frac{\ell r}{n}\right),$$

we finally get, by substituting $k = (\ln(ne/d) + \lambda)\, d\, n/r$,

$$\mathrm{prob}(V_r \ge k) \le \left(\frac{ne}{d}\right)^d \exp\!\left(-\frac{kr}{n}\right) = \left(\frac{ne}{d}\right)^d \exp\!\left(-\left(\ln\frac{ne}{d} + \lambda\right) d\right) = \exp(-\lambda d).$$

5. Multiple Pricing and Clarkson's Reduction Scheme

The simplex method [5] is usually the most efficient algorithm to solve linear programming problems in practice. Even in the theoretical setting, all known algorithms to solve general LP-type problems boil down to variants of the (dual) simplex method when they are applied to linear programming [13]. In this section we introduce and analyze an algorithm in the general framework which, although new in its precise formulation, follows a well-known design paradigm, whose simplex counterpart is known as multiple pricing [5]. The idea of multiple pricing is to reduce a large problem to a (hopefully) small number of small problems.
This can be useful in case the whole problem does not fit into main memory, but it also helps in general to reduce the cost of a single simplex iteration. Taking a slightly different approach, partial pricing [5] is a related technique following the same paradigm. Applications have been found in the context of very large-scale linear programming [3], but also in geometric optimization [14], [15]. We do not elaborate on those simplex techniques here; the reader may verify that the algorithm we are going to present is actually a variant of multiple pricing, when translated into simplex terminology.

Consider an LP-type problem $(S, \varphi)$ (not necessarily nondegenerate) of combinatorial dimension $d$, and assume we are given an algorithm lp_type(G, B) to compute, for any subset $G$ of $S$, some basis $B_G$ of $G$, given a candidate basis $B \subseteq G$. Of course, one can directly solve the problem of finding $B_S$ by calling lp_type with the large set $S$ and some basis $B \subseteq S$. As we will see, an efficient alternative is provided by the following method, parameterized with a sample size $r$. We assume the initial basis $B$ to be fixed for the rest of this section.

Algorithm 5.1.

    lp_type_sampling_r(S, B):    (* returns some basis B_S of S *)
        choose R ⊆ S\B with |R| = r at random
        G := R ∪ B
        REPEAT
            B := lp_type(G, B)
            G := G ∪ V(B)
        UNTIL V(B) = ∅
        RETURN B

lp_type_sampling reduces the problem to several calls of lp_type, and Fact 3.3(i) shows that if the procedure terminates, $V(B) = \emptyset$ implies that $B$ is a basis of $S$. Moreover, it must eventually terminate, because every round adds at least one element to $G$. The algorithm captures the spirit of Clarkson's linear programming algorithm [8] (and its generalizations [1], [16]), but is simpler and more practical. To guarantee its theoretical complexity, Clarkson's algorithm draws a new random sample in every round, and it restarts a round whenever $|V(B)|$ turns out to be too large. Thus, Algorithm 5.1 can be interpreted as the canonical simplification of Clarkson's algorithm for practical use, where one observes that resampling and restarting are not necessary (and even decrease the efficiency).

The general phenomenon behind this is that often the theoretically best algorithms are not competitive in practice, while the algorithms one actually chooses in an implementation cannot be analyzed. On the one hand, this is due to the fact that the worst-case complexity is an inappropriate measure in many practical situations; on the other hand, algorithms used in practice are sometimes simply not understood, although they might allow a worst-case analysis. In the case of Algorithm 5.1 we have the fortunate situation that it combines efficiency in practice with provable time bounds (developed below). With the procedure lp_type replaced by a call to a standard simplex implementation, the method has been successfully used in a linear programming code for geometric optimization [14], [15], without any further changes. In its original version, due to Clarkson, Algorithm 5.1 is a building block of an ingenious linear-time algorithm for linear programming in constant dimension $d$ [8], [16].

The theoretical analysis starts with a bound on the number of rounds.

Observation 5.2 [8]. Fix some basis $B_S$ of $S$. Then in every round except the last one, $V(B)$ contains an element of $B_S$. In particular, there are at most $d+1$ rounds.

Proof. Assume that $B_S$ is disjoint from $V(B)$. From Fact 3.3 and monotonicity we then get $\varphi(B) = \varphi(S \setminus V(B)) \ge \varphi(B_S) = \varphi(S)$, from which $\varphi(B) = \varphi(S)$ follows. Locality then implies $V(B) = V(S) = \emptyset$, which means that we are already in the last round.
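For concreteness, here is a minimal sketch of Algorithm 5.1 (ours) instantiated for the $d$-smallest number problem, where lp_type(G, B) may simply return the $d$ smallest elements of $G$, and the violators of a basis $B$ are the numbers smaller than $\max(B)$; the initial basis is an arbitrary $d$-element set of our choosing:

    import math, random

    def lp_type(G, B, d):
        return sorted(G)[:d]                       # a basis of G: its d smallest numbers

    def violators(B, S):
        return {s for s in S if s not in B and s < max(B)}

    def lp_type_sampling(S, d, r):
        B = sorted(S)[-d:]                         # arbitrary initial basis (our choice)
        G = set(random.sample(sorted(set(S) - set(B)), r)) | set(B)
        rounds = 0
        while True:
            rounds += 1
            B = lp_type(G, B, d)
            V = violators(B, S)
            if not V:                              # V(B) empty: B is a basis of S
                return B, rounds, len(G)
            G |= V

    random.seed(1)
    S = random.sample(range(10 ** 6), 10000)
    n, d = len(S), 5
    B, rounds, size = lp_type_sampling(S, d, round(d * math.sqrt(n / 2)))
    assert B == sorted(S)[:d]
    print(rounds, size)                            # few rounds; final |G| near d*sqrt(n)

On this particular problem the loop always stops after at most two rounds (a point taken up again in the Conclusion); the interesting quantity is the final size of G.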
The critical parameter we are interested in is the size of $G$ in the last round. If this is small, then all calls to lp_type(G, B) are cheap. We fix some notation for that. We define $S' := S \setminus B$, $B$ being the initial candidate basis plugged into lp_type_sampling. By $B^{(i)}_R$, $V^{(i)}_R$, and $G^{(i)}_R$ we denote the sets $B$, $V(B)$, and $G$ computed in round $i$. Furthermore, we set $G^{(0)}_R = R \cup B$, while $B^{(0)}_R$ and $V^{(0)}_R$ are undefined. This means we have

$$B^{(i)}_R \text{ is a basis of } G^{(i-1)}_R, \qquad V^{(i)}_R = V(G^{(i-1)}_R).$$

If the algorithm performs exactly $\ell$ rounds, sets with indices $i > \ell$ are defined to be the corresponding sets in round $\ell$. We will need a generalization of Observation 5.2.

Lemma 5.3. For $j < i \le \ell$, $B^{(i)}_R \cap V^{(j)}_R \neq \emptyset$.

Proof. Assume on the contrary that $B^{(i)}_R \cap V^{(j)}_R = \emptyset$. As in the proof of Observation 5.2, Fact 3.3 and monotonicity then imply

$$\varphi(G^{(j-1)}_R) = \varphi(S \setminus V^{(j)}_R) \ge \varphi(B^{(i)}_R) = \varphi(G^{(i-1)}_R),$$

a contradiction to the fact that $\varphi(G)$ strictly increases in every round but the last.

The following lemma is the crucial result. It interprets Algorithm 5.1 as an LP-type problem itself! Under this interpretation, the set $G$ in the last round is essentially the set of violators of the initial sample $R$. Then the techniques of the previous sections (the sampling lemma and the tail estimates) can be applied to bound the expected size of $G$, and even to get Chernoff-type bounds for the distribution of $|G|$.

Lemma 5.4. For $R \subseteq S'$ define

$$\varphi'(R) := \left(\varphi(G^{(0)}_R), \varphi(G^{(1)}_R), \ldots, \varphi(G^{(d-1)}_R)\right),$$

with $\varphi'$-values compared lexicographically. Then the following holds:
(i) $(S', \varphi')$ is an LP-type problem of combinatorial dimension at most $\binom{d+1}{2}$;
(ii) the violators of $R$ in $(S', \varphi')$ satisfy $V'(R) = \bigcup_{i=1}^{d} V^{(i)}_R$; in particular, $G^{(d)}_R = R \cup B \cup V'(R)$;
(iii) if $(S, \varphi)$ is nondegenerate, then so is $(S', \varphi')$.

Before we go into the technical (although not difficult) proof, we derive the main result of this section, namely the analysis of Algorithm 5.1. This analysis is now merely a consequence of previous results.

Theorem 5.5. For $R \subseteq S'$, a random sample of size $r$,

$$E(|G^{(d)}_R|) \le r + d + \binom{d+1}{2}\,\frac{n-r}{r+1}.$$

Choosing $r = d\sqrt{n/2}$ yields

$$E(|G^{(d)}_R|) \le 2(d+1)\sqrt{\frac{n}{2}}.$$

Proof. The first inequality directly follows from the sampling lemma, applied to the LP-type problem $(S', \varphi')$, together with part (ii) of the previous lemma. The second inequality is routine.

The theorem shows that Algorithm lp_type_sampling reduces a problem of size $n$ to at most $d$ problems of expected size no more than $O(d\sqrt{n})$. This explains the practical efficiency of multiple pricing and similar reduction schemes if $d \ll n$.
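For the record, here is one way to carry out the routine second step, a sketch assuming $n \ge 2d^2$ (so that $d \le \sqrt{n/2}$); the original computation may proceed differently:

$$E(|G^{(d)}_R|) \le r + d + \binom{d+1}{2}\frac{n}{r} = d\sqrt{\frac{n}{2}} + d + \frac{d(d+1)}{2}\cdot\frac{n}{d\sqrt{n/2}} = d\sqrt{\frac{n}{2}} + d + (d+1)\sqrt{\frac{n}{2}} \le 2(d+1)\sqrt{\frac{n}{2}},$$

where we used $n/\sqrt{n/2} = 2\sqrt{n/2}$ in the middle step and $d \le \sqrt{n/2}$ in the last.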
If $(S, \varphi)$ is nondegenerate, we get the following tail estimate, using part (iii) of Lemma 5.4 and Theorem 4.8. Again, routine computations yield the explicit bounds.

Theorem 5.6. If $(S, \varphi)$ is a nondegenerate LP-type problem, then for $R \subseteq S'$, a random sample of size $r = d\sqrt{n/2}$, and $\lambda \ge 0$,

$$\mathrm{prob}\!\left(|G^{(d)}_R| \ge (2+\lambda)(d+1)\sqrt{\frac{n}{2}}\right) \le \exp\!\left(-\frac{\lambda^2}{2(1+\lambda)}\binom{d+1}{2}\right).$$

Theorem 5.7. If $(S, \varphi)$ is a general LP-type problem, then for $R \subseteq S'$, a random sample of size $r = d\sqrt{(n \ln n)/2}$, and $\lambda \ge 0$,

$$\mathrm{prob}\!\left(|G^{(d)}_R| \ge (3+\lambda)(d+1)\sqrt{\frac{n \ln n}{2}}\right) \le \exp\!\left(-\lambda\binom{d+1}{2}\right).$$

We conclude this section with the proof of Lemma 5.4. We start by establishing an auxiliary claim.

Claim. Let $Q = R \mathbin{\dot\cup} T \subseteq S'$ and $i < d$. If $T \cap V^{(j)}_R = \emptyset$ for all $j \le i+1$, then

$$\varphi(G^{(j)}_R) = \varphi(G^{(j)}_Q) \text{ and } V^{(j+1)}_R = V^{(j+1)}_Q \text{ for } j \le i, \qquad G^{(j)}_Q = G^{(j)}_R \mathbin{\dot\cup} T \text{ for } j \le i+1.$$

To prove the claim, we proceed by induction on $i$, noting that the statements hold for $i = 0$ by the locality of $\varphi$. Now assume the implications are true for $j \le i-1$. Then we get $G^{(i)}_Q = G^{(i)}_R \mathbin{\dot\cup} T$, and since $T$ is disjoint from $V^{(i+1)}_R = V(G^{(i)}_R)$, it follows that $\varphi(G^{(i)}_R) = \varphi(G^{(i)}_Q)$. The locality of $\varphi$ then implies $V^{(i+1)}_Q = V^{(i+1)}_R$, and hence

$$G^{(i+1)}_Q = G^{(i)}_Q \mathbin{\dot\cup} V^{(i+1)}_Q = G^{(i)}_R \mathbin{\dot\cup} T \mathbin{\dot\cup} V^{(i+1)}_R = G^{(i+1)}_R \mathbin{\dot\cup} T.$$

Proof of part (ii). Let $s \in S' \setminus R$ and put $Q = R \cup \{s\}$. First assume $s \in \bigcup_{i=1}^{d} V^{(i)}_R$, and let $i+1$ be the smallest index with $s \in V^{(i+1)}_R$, so that the claim applies to $T = \{s\}$ (trivially for $i = 0$): $G^{(i)}_Q = G^{(i)}_R \mathbin{\dot\cup} \{s\}$ and $\varphi(G^{(j)}_Q) = \varphi(G^{(j)}_R)$ for $j < i$. Since $s$ violates $G^{(i)}_R$, the monotonicity of $\varphi$ implies

$$\varphi(G^{(i)}_Q) = \varphi(G^{(i)}_R \cup \{s\}) > \varphi(G^{(i)}_R), \tag{8}$$

so the first $i$ components of $\varphi'(Q)$ and $\varphi'(R)$ agree while component $i$ has strictly increased; hence $s \in V'(R)$. On the other hand, if $s \notin \bigcup_{i=1}^{d} V^{(i)}_R$, then the precondition of the claim holds for $i = d-1$, implying

$$V^{(j+1)}_R = V^{(j+1)}_Q \not\ni s \quad \text{and} \quad \varphi(G^{(j)}_R) = \varphi(G^{(j)}_Q), \qquad j \le d-1,$$

hence $\varphi'(Q) = \varphi'(R)$ and $s \notin V'(R)$. This proves $V'(R) = \bigcup_{i=1}^{d} V^{(i)}_R$, and since $G^{(d)}_R = G^{(0)}_R \cup \bigcup_{i=1}^{d} V^{(i)}_R$, we also get $G^{(d)}_R = R \cup B \cup V'(R)$.

To prove (i), we need to verify the monotonicity and locality of $\varphi'$ (see Definition 3.1). Inequality (8) shows that $\varphi'(R) \le \varphi'(R \cup \{s\})$ in the lexicographic order, for all $s \in S' \setminus R$, and this implies monotonicity. For locality, assume $R \subseteq Q$ with $\varphi'(R) = \varphi'(Q)$. From the claim and part (ii), we get

$$V'(R) = \bigcup_{i=1}^{d} V^{(i)}_R = \bigcup_{i=1}^{d} V^{(i)}_Q = V'(Q),$$

and this is the required property. It remains to bound the combinatorial dimension of $(S', \varphi')$. To this end we prove that $\varphi'(B_R) = \varphi'(R)$, for

$$B_R := R \cap \bigcup_{i=1}^{d} B^{(i)}_R.$$

We equivalently show that $\varphi(G^{(j)}_R) = \varphi(G^{(j)}_{B_R})$, for $j \le d-1$, using induction on $j$. For $j = 0$, we get

$$\varphi(G^{(0)}_R) = \varphi(G^{(0)}_{B_R} \cup (R \setminus B_R)) = \varphi(G^{(0)}_{B_R}), \tag{9}$$

because $R \setminus B_R$ is disjoint from $B^{(1)}_R$, the basis of $G^{(0)}_R$; hence $R \setminus B_R$ can be removed from $G^{(0)}_R$ without changing the $\varphi$-value. Now assume the statement holds up to $j \le d-2$ and consider the case $j+1$. By the claim, we get $G^{(j+1)}_R = G^{(j+1)}_{B_R} \cup (R \setminus B_R)$, so, as before, the analogue of (9) follows, because $R \setminus B_R$ is disjoint from the basis $B^{(j+2)}_R$ of $G^{(j+1)}_R$.

To bound the size of $B_R$, we observe that

$$|R \cap B^{(i)}_R| \le d+1-i,$$

for all $i \le \ell$ (the number of rounds in which $V(B) \neq \emptyset$). This follows from Lemma 5.3: $B^{(i)}_R$ has at least one element in each of the $i-1$ sets $V^{(1)}_R, \ldots, V^{(i-1)}_R$, which are in turn disjoint from $R$. Hence we get

$$|B_R| \le \sum_{i=1}^{\ell} |R \cap B^{(i)}_R| \le \binom{d+1}{2}.$$

Proof of part (iii). Nondegeneracy of $(S', \varphi')$ follows if we can show that every set $R \subseteq S'$ has the set $B_R$ as its unique basis. To this end we prove that whenever we have $L \subseteq R$ with $\varphi'(L) = \varphi'(R)$, then $B_R \subseteq L$. Fix $L \subseteq R$ with $\varphi'(L) = \varphi'(R)$, i.e.,

$$\varphi(G^{(i)}_R) = \varphi(G^{(i)}_L), \qquad i \le d-1.$$

By the claim, this implies

$$G^{(i)}_R = G^{(i)}_L \mathbin{\dot\cup} (R \setminus L), \qquad i \le d-1,$$

and the nondegeneracy of $\varphi$ yields that $G^{(i)}_R$ and $G^{(i)}_L$ have the same unique basis $B^{(i+1)}_R$, for all $i$. It follows that $G^{(d-1)}_L$ contains $\bigcup_{i=1}^{d} B^{(i)}_R$, so $L$ contains

$$L \cap \bigcup_{i=1}^{d} B^{(i)}_R = R \cap \bigcup_{i=1}^{d} B^{(i)}_R = B_R.$$

The latter equality holds because $R \setminus L$ is disjoint from $G^{(d-1)}_L$, thus in particular from the union of the $B^{(i)}_R$.

6. Conclusion

The curious fact that, in the regular and nondegenerate case, the distribution of $V_r$ does not depend on the actual LP-type problem deserves a word of warning: namely, this property does not mean that all nondegenerate LP-type problems with given parameters $n$ and $d$ are equally difficult (or easy) to solve. On the contrary, because the random variable $V_r$ does not depend on the actual problem, it does not carry any information about the difficulty of a particular problem. There are very easy problems (like $d$-smallest number), and very difficult ones (like linear programming). For example, Algorithm 5.1 never needs more than two rounds in the case of the $d$-smallest number problem, and the same holds for other easy LP-type problems characterized by the following property: for any sets $B \subseteq R$ such that $\varphi(B) = \varphi(R)$, and for any set $T$, $\varphi(B \cup T) = \varphi(R \cup T)$ holds. This means that elements in $R \setminus B$ can be "forgotten," as they will not contribute to the final solution. The absence of this property is what makes linear programming and other problems difficult. In general, it seems that the combinatorial dimension of the LP-type problem $(S', \varphi')$ derived from $(S, \varphi)$ according to the definition in Lemma 5.4 is a more meaningful indicator of the difficulty of $(S, \varphi)$ than $\delta(S, \varphi)$ itself. For example, in the case of the $d$-smallest number problem, we get $\delta(S', \varphi') = d$, much less than the $O(d^2)$ upper bound.
This alternative notion of dimension needs to be investigated further. An open problem that remains is to improve the tail estimates in the case of degenerate LP-type problems. Here, the distribution of $V_r$ typically depends on the concrete instance, and so does $b_k$, the number of bases with $k$ violators. Using only trivial bounds for the numbers $b_k$, we have obtained the weaker estimate given by Theorem 4.10, indicating that this estimate might not be the final answer.

Acknowledgment

We thank the referee for carefully pointing out simplifications and suggesting improvements in the presentation. In particular, we are grateful for the question concerning the sharpness of our main Chernoff-type bound.

References

[1] I. Adler and R. Shamir. A randomized scheme for speeding up algorithms for linear and convex programming with high constraints-to-variable ratio. Math. Programming, 61:39-52, 1993.
[2] N. Amenta. Helly-type theorems and generalized linear programming. Discrete Comput. Geom., 12:241-261, 1994.
[3] R. E. Bixby, J. W. Gregory, I. J. Lustig, R. E. Marsten, and D. F. Shanno. Very large-scale linear programming: a case study in combining interior point and simplex methods. Oper. Res., 40(5):885-897, 1992.
[4] T. Chan. Backwards analysis of the Karger-Klein-Tarjan algorithm for minimum spanning trees. Inform. Process. Lett., 67:303-304, 1998.
[5] V. Chvátal. Linear Programming. Freeman, New York, 1983.
[6] K. L. Clarkson. New applications of random sampling in computational geometry. Discrete Comput. Geom., 2:195-222, 1987.
[7] K. L. Clarkson. A bound on local minima of arrangements that implies the upper bound theorem. Discrete Comput. Geom., 10:427-433, 1993.
[8] K. L. Clarkson. Las Vegas algorithms for linear and integer programming. J. Assoc. Comput. Mach., 42:488-499, 1995.
[9] K. L. Clarkson and P. W. Shor. Applications of random sampling in computational geometry, II. Discrete Comput. Geom., 4:387-421, 1989.
[10] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. The MIT Press, Cambridge, MA, 1990.
[11] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf. Computational Geometry: Algorithms and Applications. Springer-Verlag, Berlin, 1997.
[12] D. Dubhashi and D. Ranjan. Great(er) expectations. BRICS Newsletter, 5:11-13, 1996.
[13] B. Gärtner. Randomized Optimization by Simplex-Type Methods. Ph.D. thesis, Freie Universität Berlin, 1995.
[14] B. Gärtner. Exact arithmetic at low cost: a case study in linear programming. Comput. Geom. Theory Appl., 13:121-139, 1999.
[15] B. Gärtner and S. Schönherr. An efficient, exact and generic quadratic programming solver for geometric optimization. In Proc. 16th ACM Symp. Comput. Geom., pages 110-118, 2000.
[16] B. Gärtner and E. Welzl. Linear programming: randomization and abstract frameworks. In Proc. 13th Symp. Theoret. Aspects Comput. Sci., volume 1046 of Lecture Notes in Computer Science, pages 669-687. Springer-Verlag, Berlin, 1996.
[17] R. L. Graham, D. E. Knuth, and O. Patashnik. Concrete Mathematics. Addison-Wesley, Reading, MA, 1989.
[18] L. J. Guibas, D. E. Knuth, and M. Sharir. Randomized incremental construction of Delaunay and Voronoi diagrams. Algorithmica, 7:381-413, 1992.
[19] T. Hagerup and C. Rüb. A guided tour of Chernoff bounds. Inform. Process. Lett., 33:305-308, 1990.
[20] S. Har-Peled. On the Expected Complexity of Random Convex Hulls.
Technical Report 330, School of Mathematical Sciences, Tel-Aviv University, 1998.
[21] D. Karger, P. N. Klein, and R. E. Tarjan. A randomized linear-time algorithm to find minimum spanning trees. J. Assoc. Comput. Mach., 42:321-328, 1995.
[22] J. Matoušek. On geometric optimization with few violated constraints. Discrete Comput. Geom., 14:365-384, 1995.
[23] J. Matoušek, M. Sharir, and E. Welzl. A subexponential bound for linear programming. Algorithmica, 16:498-516, 1996.
[24] J. Matoušek and J. Nešetřil. Invitation to Discrete Mathematics. Oxford University Press, Oxford, 1998.
[25] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, New York, 1995.
[26] K. Mulmuley. A fast planar partition algorithm, I. J. Symbolic Comput., 10(3-4):253-280, 1990.
[27] K. Mulmuley. Computational Geometry: An Introduction Through Randomized Algorithms. Prentice-Hall, Englewood Cliffs, NJ, 1994.
[28] A. Rényi and R. Sulanke. Über die konvexe Hülle von n zufällig gewählten Punkten. Z. Wahrsch., 2:75-84, 1963.
[29] R. Seidel. Small-dimensional linear programming and convex hulls made easy. Discrete Comput. Geom., 6:423-434, 1991.
[30] R. Seidel. Backwards analysis of randomized geometric algorithms. In J. Pach, editor, New Trends in Discrete and Computational Geometry, volume 10 of Algorithms and Combinatorics, pages 37-68. Springer-Verlag, New York, 1993.
[31] R. Seidel. Personal communication, 1996.
[32] M. Sharir and E. Welzl. A combinatorial bound for linear programming and related problems. In Proc. 9th Symp. Theoret. Aspects Comput. Sci., volume 577 of Lecture Notes in Computer Science, pages 569-579. Springer-Verlag, Berlin, 1992.
[33] E. Welzl. Smallest enclosing disks (balls and ellipsoids). In H. Maurer, editor, New Results and New Trends in Computer Science, volume 555 of Lecture Notes in Computer Science, pages 359-370. Springer-Verlag, Berlin, 1991.



B. Gärtner and E. Welzl. A Simple Sampling Lemma: Analysis and Applications in Geometric Optimization. Discrete & Computational Geometry, 2001, pp. 569-590. DOI: 10.1007/s00454-001-0006-2.