A Simple Linear Time (1+ε)-Approximation Algorithm for k-Means Clustering in Any Dimensions

Abstract

We present the first linear time (1+ε)-approximation algorithm for the k-means problem for fixed k and ε. Our algorithm runs in O(nd) time, which is linear in the size of the input. Another feature of our algorithm is its simplicity: the only technique involved is random sampling.
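The abstract does not spell out the procedure, so the following is not the paper's algorithm. It is a minimal sketch of the k-means objective and of one standard fact about random sampling in the simplest case, k = 1: the centroid of a small uniform sample is, in expectation, nearly as good a center as the true centroid (the expected cost ratio is 1 + 1/m for a sample of size m drawn with replacement). The function names, the synthetic Gaussian data, and the sample size of 40 are assumptions chosen for illustration.

```python
import random


def kmeans_cost(points, centers):
    """Sum over all points of the squared Euclidean distance to the nearest center."""
    return sum(
        min(sum((p_j - c_j) ** 2 for p_j, c_j in zip(p, c)) for c in centers)
        for p in points
    )


def centroid(points):
    """Coordinate-wise mean; this is the optimal single center (k = 1)."""
    n = len(points)
    d = len(points[0])
    return tuple(sum(p[j] for p in points) / n for j in range(d))


def sampled_centroid(points, sample_size, rng):
    """Centroid of a uniform random sample drawn with replacement."""
    sample = [rng.choice(points) for _ in range(sample_size)]
    return centroid(sample)


# Hypothetical data: 2000 points in 5 dimensions.
rng = random.Random(0)
points = [tuple(rng.gauss(0.0, 1.0) for _ in range(5)) for _ in range(2000)]

optimal_cost = kmeans_cost(points, [centroid(points)])
approx_cost = kmeans_cost(points, [sampled_centroid(points, 40, rng)])
ratio = approx_cost / optimal_cost  # at least 1, and typically close to 1
print(f"cost ratio of sampled centroid to true centroid: {ratio:.4f}")
```

Computing the sampled centroid touches only 40 points, while evaluating the cost of any candidate center takes O(nd) time, which matches the input-size bound quoted in the abstract.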

Keywords

Simplicity, Algorithm, Simple (philosophy), Cluster analysis, Approximation algorithm, Time complexity, Computer science, Simple random sample, Sampling (signal processing), SIMPLE algorithm, Mathematics, Feature (linguistics), Artificial intelligence, Physics

Related Publications

A local search approximation algorithm for k-means clustering

Tapas Kanungo , David M. Mount , Nathan S. Netanyahu +3 more

In k-means clustering we are given a set of n data points in d-dimensional space ℜ^d and an integer k, and the problem is to determine a set of k points in ℜ^d, called centers, to...

2002 497 citations

Unsupervised K-Means Clustering Algorithm

Kristina P. Sinaga , Miin‐Shen Yang

The k-means algorithm is generally the most known and used clustering method. There are various extensions of k-means to be proposed in the literature. Although it is an unsuper...

2020 IEEE Access 1917 citations

SLIC Superpixels Compared to State-of-the-Art Superpixel Methods

Radhakrishna Achanta , Anil Shaji , Kevin Smith +3 more

Computer vision applications have come to rely increasingly on superpixels in recent years, but it is not always clear what constitutes a good superpixel algorithm. In an effort...

2012 IEEE Transactions on Pattern Analysis... 8801 citations

Scaling clustering algorithms to large databases

Patricia Bradley , Usama M. Fayyad , Cory Reina

Practical clustering algorithms require multiple data scans to achieve convergence. For large databases, these scans become prohibitively expensive. We present a scalable cluste...

1998 707 citations

Weighted Graph Cuts without Eigenvectors A Multilevel Approach

Inderjit S. Dhillon , Yuqiang Guan , Brian Kulis

A variety of clustering algorithms have recently been proposed to handle data that is not linearly separable; spectral clustering and kernel k-means are two of the main methods....

2007 IEEE Transactions on Pattern Analysis... 1016 citations

Publication Info

Year: 2004
Type: article
Pages: 454-462
Citations: 241
Access: Closed

Cite This

APA Style

                            
Amit Kumar, Yogish Sabharwal, Subhankar Sen (2004). A Simple Linear Time (1+ε)-Approximation Algorithm for k-Means Clustering in Any Dimensions, 454-462. https://doi.org/10.1109/focs.2004.7

Identifiers

DOI: 10.1109/focs.2004.7