Abstract

Kernel k-means and spectral clustering have both been used to identify clusters that are non-linearly separable in input space. Despite significant research, these methods have remained only loosely related. In this paper, we give an explicit theoretical connection between them. We show the generality of the weighted kernel k-means objective function, and derive the spectral clustering objective of normalized cut as a special case. Given a positive definite similarity matrix, our results lead to a novel weighted kernel k-means algorithm that monotonically decreases the normalized cut. This has important implications: a) eigenvector-based algorithms, which can be computationally prohibitive, are not essential for minimizing normalized cuts, b) various techniques, such as local search and acceleration schemes, may be used to improve the quality as well as speed of kernel k-means. Finally, we present results on several interesting data sets, including diametrical clustering of large gene-expression matrices and a handwriting recognition data set.
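The abstract's key computational point is that kernel k-means can assign points to clusters using only kernel evaluations, with per-point weights folded into the centroid distances. The sketch below illustrates that idea; it is a minimal illustrative implementation, not the authors' code, and the function name, signature, and empty-cluster handling are assumptions.

```python
import numpy as np

def weighted_kernel_kmeans(K, w, k, n_iter=100, seed=0):
    """Illustrative sketch of weighted kernel k-means.

    K : (n, n) kernel (similarity) matrix, assumed positive definite
    w : (n,) nonnegative point weights
    k : number of clusters
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    labels = rng.integers(k, size=n)  # random initial assignment
    for _ in range(n_iter):
        # Squared distance of each point to each cluster mean in feature
        # space, written purely in terms of kernel entries:
        # ||phi(x_i) - m_c||^2 = K_ii - 2 * sum_{j in c} w_j K_ij / s_c
        #                        + sum_{j,l in c} w_j w_l K_jl / s_c^2
        # where s_c is the total weight of cluster c.
        dist = np.empty((n, k))
        for c in range(k):
            mask = labels == c
            if not mask.any():          # empty cluster: make it unattractive
                dist[:, c] = np.inf
                continue
            s = w[mask].sum()
            cross = K[:, mask] @ w[mask] / s
            within = w[mask] @ K[np.ix_(mask, mask)] @ w[mask] / s**2
            dist[:, c] = np.diag(K) - 2.0 * cross + within
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                        # assignments stable: converged
        labels = new_labels
    return labels
```

With uniform weights and a linear kernel `K = X @ X.T` this reduces to ordinary k-means; the paper's result is that a particular choice of kernel and weights makes each such iteration monotonically decrease the normalized cut.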

Keywords

Kernel (algebra), Cluster analysis, Spectral clustering, String kernel, Kernel method, Mathematics, Separable space, Eigenvalues and eigenvectors, Algorithm, Kernel embedding of distributions, Monotonic function, Pattern recognition (psychology), Matrix (chemical analysis), Computer science, Artificial intelligence, Support vector machine, Discrete mathematics

Publication Info

Year
2004
Type
article
Pages
551-556
Citations
1184
Access
Closed

Cite This

Inderjit S. Dhillon, Yuqiang Guan, Brian Kulis (2004). Kernel k-means: spectral clustering and normalized cuts. Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '04), 551-556. https://doi.org/10.1145/1014052.1014118

Identifiers

DOI
10.1145/1014052.1014118