Abstract
In this paper, we show the existence of small coresets for the problems of computing k-median and k-means clustering for points in low dimension. In other words, we show that given a point set P in Rd, one can compute a weighted set S ⊆ P, of size O(k ε-d log n), such that one can compute the k-median/means clustering on S instead of on P, and get an (1+ε)-approximation. As a result, we improve the fastest known algorithms for (1+ε)-approximate k-means and k-median. Our algorithms have linear running time for a fixed k and ε. In addition, we can maintain the (1+ε)-approximate k-median or k-means clustering of a stream when points are being only inserted, using polylogarithmic space and update time.
Keywords
Affiliated Institutions
Related Publications
How fast is the k-means method?
We present polynomial upper and lower bounds on the number of iterations performed by the k-means method (a.k.a. Lloyd's method) for k-means clustering. Our upper bounds are pol...
A local search approximation algorithm for k-means clustering
In k-means clustering we are given a set of n data points in d-dimensional space ℜd and an integer k, and the problem is to determine a set of k points in ℜd, called centers, to...
Better streaming algorithms for clustering problems
We study clustering problems in the streaming model, where the goal is to cluster a set of points by making one pass (or a few passes) over the data using a small amount of stor...
Approximation schemes for clustering problems
Let k be a fixed integer. We consider the problem of partitioning an input set of points endowed with a distance function into k clusters. We give polynomial time approximation ...
A Simple Linear Time (1+ ∊) -Approximation Algorithm for k-Means Clustering in Any Dimensions
We present the first linear time (1+ε)-approximation algorithm for the k-means problem for fixed k and ε. Our algorithm runs in O(nd) time, which is linear in the size of the in...
Publication Info
- Year
- 2004
- Type
- article
- Pages
- 291-300
- Citations
- 486
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1145/1007352.1007400