Abstract
This article concerns the derivation and use of a measure of similarity between two hierarchical clusterings. The measure, Bk, is derived from the matching matrix, [mij], formed by cutting the two hierarchical trees and counting the number of matching entries in the k clusters in each tree. The mean and variance of Bk are determined under the assumption that the margins of [mij] are fixed. Thus, Bk represents a collection of measures for k = 2, …, n – 1. (k, Bk) plots are found to be useful in portraying the similarity of two clusterings. Bk is compared to other measures of similarity proposed respectively by Baker (1974) and Rand (1971). The use of (k, Bk) plots for studying clustering methods is explored by a series of Monte Carlo sampling experiments. An example of the use of (k, Bk) on real data is given.
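The abstract does not state the formula for Bk. As a minimal sketch, the Python below assumes the commonly cited form Bk = Tk / sqrt(Pk * Qk), where Tk, Pk, and Qk are the sums of squared entries, squared row totals, and squared column totals of the matching matrix, each minus n. The function name `fowlkes_mallows_bk` and the return value of 0.0 for a degenerate clustering are illustrative assumptions, and the inputs are taken to be the flat cluster labels obtained after each tree has already been cut into k clusters.

```python
from collections import Counter
from math import sqrt


def fowlkes_mallows_bk(labels_a, labels_b):
    """Similarity B_k between two flat clusterings of the same n objects.

    labels_a[i] and labels_b[i] are the cluster labels of object i in the
    two clusterings (assumed to come from cutting each tree at k clusters).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)

    # Matching matrix m[i, j]: objects in cluster i of A and cluster j of B.
    m = Counter(zip(labels_a, labels_b))
    row_totals = Counter(labels_a)  # margins m[i, .]
    col_totals = Counter(labels_b)  # margins m[., j]

    t_k = sum(v * v for v in m.values()) - n
    p_k = sum(v * v for v in row_totals.values()) - n
    q_k = sum(v * v for v in col_totals.values()) - n

    # Assumed convention: if either clustering is all singletons, return 0.0.
    if p_k == 0 or q_k == 0:
        return 0.0
    return t_k / sqrt(p_k * q_k)
```

Calling this function once for each k, with the two trees cut at k clusters, would give the points of a (k, Bk) plot. Relabeling the clusters of either input does not change the result, since only the counts in the matching matrix enter the computation.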
Related Publications
Objective Criteria for the Evaluation of Clustering Methods
Many intuitively appealing methods have been suggested for clustering data, however, interpretation of their results has been hindered by the lack of objective criteria...
Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data.
We present here a framework for the study of molecular variation within a single species. Information on DNA haplotype divergence is incorporated into an analysis of va...
Rank-Normalization, Folding, and Localization: An Improved R̂ for Assessing Convergence of MCMC (with Discussion)
Markov chain Monte Carlo is a key computational tool in Bayesian statistics, but it can be challenging to monitor the convergence of an iterative stochastic algorithm. In this...
A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code
Two types of sampling plans are examined as alternatives to simple random sampling in Monte Carlo studies. These plans are shown to be improvements over simple random sampling w...
Hierarchical Clustering Schemes
Techniques for partitioning objects into optimally homogeneous groups on the basis of empirical measures of similarity among those objects have received increasing attention in ...
Publication Info
- Year: 1983
- Type: article
- Volume: 78
- Issue: 383
- Pages: 553-569
- Citations: 1433
- Access: Closed
Identifiers
- DOI: 10.1080/01621459.1983.10478008