Abstract

This article proposes a split-merge Markov chain algorithm to address the problem of inefficient sampling for conjugate Dirichlet process mixture models. Traditional Markov chain Monte Carlo methods for Bayesian mixture models, such as Gibbs sampling, can become trapped in isolated modes corresponding to an inappropriate clustering of data points. This article describes a Metropolis-Hastings procedure that can escape such local modes by splitting or merging mixture components. Our algorithm employs a new technique in which an appropriate proposal for splitting or merging components is obtained by using a restricted Gibbs sampling scan. We demonstrate empirically that our method outperforms the Gibbs sampler in situations where two or more components are similar in structure.
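The procedure described above can be sketched in code. What follows is a minimal illustration, not the authors' implementation. The abstract names no particular likelihood, so the sketch assumes a one-dimensional Gaussian model with known variance and a conjugate Normal prior on each component mean. The constants `ALPHA`, `SIGMA2`, and `TAU2`, the helper names, the number of intermediate scans, and the synthetic data are all hypothetical choices. The acceptance ratio combines the standard Dirichlet process partition prior, the marginal likelihoods of the affected components, and the probability of the final restricted Gibbs scan.

```python
import math
import random

ALPHA = 1.0    # DP concentration parameter (assumed value)
SIGMA2 = 1.0   # known observation variance (assumed model)
TAU2 = 100.0   # prior variance of each component mean, prior mean 0 (assumed)


def log_pred(y, n, s):
    """Log predictive density of y under a component holding n points with sum s."""
    prec = 1.0 / TAU2 + n / SIGMA2
    mean = (s / SIGMA2) / prec
    var = SIGMA2 + 1.0 / prec
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)


def log_marginal(ys):
    """Log marginal likelihood of one component (chain rule over predictives)."""
    total, s = 0.0, 0.0
    for n, y in enumerate(ys):
        total += log_pred(y, n, s)
        s += y
    return total


def restricted_scan(data, S, side, rng, target=None):
    """One Gibbs scan over S, restricted to the two anchored components.

    side[k] is True when k sits with anchor i, False when it sits with anchor j.
    If target is given, each assignment is forced to target[k], which scores
    the reverse move of a merge. Returns the log probability of the scan.
    """
    logq = 0.0
    for k in S:
        w = []
        for flag in (True, False):
            pts = [data[m] for m in side if m != k and side[m] == flag]
            # Each side always contains its anchor, so len(pts) >= 1.
            w.append(math.log(len(pts)) + log_pred(data[k], len(pts), sum(pts)))
        top = max(w)
        p_true = math.exp(w[0] - top) / (math.exp(w[0] - top) + math.exp(w[1] - top))
        choice = target[k] if target is not None else (rng.random() < p_true)
        side[k] = choice
        logq += math.log(max(p_true if choice else 1.0 - p_true, 1e-300))
    return logq


def split_merge_step(data, labels, rng, scans=5):
    """One Metropolis-Hastings split-merge update of labels, in place."""
    i, j = rng.sample(range(len(data)), 2)
    S = [k for k in range(len(data))
         if k not in (i, j) and labels[k] in (labels[i], labels[j])]
    # Launch state: random two-way assignment, refined by intermediate scans.
    side = {k: rng.random() < 0.5 for k in S}
    side[i], side[j] = True, False
    for _ in range(scans):
        restricted_scan(data, S, side, rng)
    is_split = labels[i] == labels[j]
    target = None if is_split else {k: labels[k] == labels[i] for k in S}
    # Final scan: samples the split, or scores the path back to the current split.
    logq = restricted_scan(data, S, side, rng, target)
    ys_i = [data[k] for k in side if side[k]]
    ys_j = [data[k] for k in side if not side[k]]
    # Log ratio (split state over merged state) of DP prior times likelihood.
    log_split = (math.log(ALPHA) + math.lgamma(len(ys_i)) + math.lgamma(len(ys_j))
                 - math.lgamma(len(ys_i) + len(ys_j))
                 + log_marginal(ys_i) + log_marginal(ys_j) - log_marginal(ys_i + ys_j))
    log_acc = log_split - logq if is_split else logq - log_split
    if rng.random() < math.exp(min(0.0, log_acc)):
        new = max(labels) + 1 if is_split else labels[j]
        for k in side:
            # Split: only i's side gets the new label. Merge: everything joins j.
            if side[k] or not is_split:
                labels[k] = new


def run(data, iters=300, seed=0):
    rng = random.Random(seed)
    labels = [0] * len(data)   # start with every point in one component
    for _ in range(iters):
        split_merge_step(data, labels, rng)
    return labels


# Two well-separated groups of 11 points each (synthetic data for illustration).
data = [c + 0.1 * k for c in (-5.0, 5.0) for k in range(-5, 6)]
labels = run(data)
print("number of components:", len(set(labels)))
```

Starting from a single component, a chain of this kind can reach a multi-component clustering in one accepted move, which is the kind of jump that one-point-at-a-time Gibbs updates make only slowly. A split-merge update like this is typically alternated with ordinary Gibbs scans and does not replace them.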

Keywords

Gibbs sampling; Markov chain Monte Carlo; Metropolis–Hastings algorithm; Hierarchical Dirichlet process; Dirichlet distribution; Rejection sampling; Dirichlet process; Markov chain; Slice sampling; Computer science; Algorithm; Monte Carlo method; Merge (version control); Hybrid Monte Carlo; Mathematics; Bayesian probability; Artificial intelligence; Latent Dirichlet allocation; Machine learning; Statistics; Topic model

Publication Info

Year: 2004
Type: article
Volume: 13
Issue: 1
Pages: 158-182
Citations: 464
Access: Closed

Cite This

Sonia Jain, Radford M. Neal (2004). A Split-Merge Markov chain Monte Carlo Procedure for the Dirichlet Process Mixture Model. Journal of Computational and Graphical Statistics, 13(1), 158-182. https://doi.org/10.1198/1061860043001

Identifiers

DOI: 10.1198/1061860043001