Kernel k-means | RDL Research Database

Abstract

Kernel k-means and spectral clustering have both been used to identify clusters that are non-linearly separable in input space. Despite significant research, these methods have remained only loosely related. In this paper, we give an explicit theoretical connection between them. We show the generality of the weighted kernel k-means objective function, and derive the spectral clustering objective of normalized cut as a special case. Given a positive definite similarity matrix, our results lead to a novel weighted kernel k-means algorithm that monotonically decreases the normalized cut. This has important implications: a) eigenvector-based algorithms, which can be computationally prohibitive, are not essential for minimizing normalized cuts, b) various techniques, such as local search and acceleration schemes, may be used to improve the quality as well as speed of kernel k-means. Finally, we present results on several interesting data sets, including diametrical clustering of large gene-expression matrices and a handwriting recognition data set.
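The weighted kernel k-means iteration mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `weighted_kernel_kmeans`, its parameters, the RBF kernel, the uniform weights, and the toy data are all assumptions chosen for illustration. It uses the standard kernel-trick expansion of the squared distance from a point to a weighted cluster mean, so only the kernel matrix is needed.

```python
import numpy as np

def weighted_kernel_kmeans(K, w, labels, n_clusters, max_iter=100):
    """Batch weighted kernel k-means on a precomputed kernel matrix K.

    K: (n, n) kernel matrix, w: (n,) nonnegative point weights,
    labels: (n,) initial cluster assignment in {0, ..., n_clusters - 1}.
    All names here are hypothetical, chosen for this sketch.
    """
    K = np.asarray(K, dtype=float)
    w = np.asarray(w, dtype=float)
    labels = np.asarray(labels).copy()
    n = K.shape[0]
    diag = np.diag(K)
    for _ in range(max_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            wc = w[mask]
            s = wc.sum()
            if s <= 0:
                continue  # empty cluster: its distances stay at +inf
            # sum_j w_j K_ij / s for every point i
            cross = K[:, mask] @ wc / s
            # sum_{j,l} w_j w_l K_jl / s^2, a scalar per cluster
            within = wc @ K[np.ix_(mask, mask)] @ wc / s**2
            # squared feature-space distance to the weighted cluster mean
            dist[:, c] = diag - 2.0 * cross + within
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels

# Toy data (assumed): two tight, well-separated blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists / 2.0)          # RBF kernel with unit bandwidth (assumed)
w = np.ones(len(X))                  # uniform weights give plain kernel k-means
init = np.array([0] * 14 + [1] * 6 + [0] * 6 + [1] * 14)  # deliberately mixed start
labels = weighted_kernel_kmeans(K, w, init, n_clusters=2)
```

With uniform weights this reduces to ordinary kernel k-means. The normalized-cut special case described in the abstract corresponds to a particular choice of weights and kernel derived from the similarity matrix; that mapping is not reproduced in this sketch.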

Keywords

Kernel (algebra), Cluster analysis, Spectral clustering, String kernel, Kernel method, Mathematics, Separable space, Eigenvalues and eigenvectors, Algorithm, Kernel embedding of distributions, Monotonic function, Pattern recognition (psychology), Matrix (chemical analysis), Computer science, Artificial intelligence, Support vector machine, Discrete mathematics

Affiliated Institutions

The University of Texas at Austin US

Publication Info

Year: 2004
Type: article
Pages: 551-556
Citations: 1184
Access: Closed

Cite This

APA Style

Inderjit S. Dhillon, Yuqiang Guan, Brian Kulis (2004). Kernel k-means. 551-556. https://doi.org/10.1145/1014052.1014118

Identifiers

DOI: 10.1145/1014052.1014118