A simple example of Dirichlet process mixture inconsistency for the number of components
Miller, Jeffrey W., Harrison, Matthew T.
Neural Information Processing Systems
For data assumed to come from a finite mixture with an unknown number of components, it has become common to use Dirichlet process mixtures (DPMs) not only for density estimation, but also for inferences about the number of components. The typical approach is to use the posterior distribution on the number of clusters -- that is, the posterior on the number of components represented in the observed data. However, it turns out that this posterior is not consistent -- it does not concentrate at the true number of components. In this note, we give an elementary proof of this inconsistency in what is perhaps the simplest possible setting: a DPM with normal components of unit variance, applied to data from a "mixture" with one standard normal component. Further, we show that this example exhibits severe inconsistency: instead of going to 1, the posterior probability that there is one cluster converges (in probability) to 0.
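To make "the posterior on the number of clusters" concrete, the following is a minimal sketch, not the authors' code, that computes this posterior exactly for a very small sample by enumerating all set partitions. It assumes a concentration parameter `alpha = 1` and a N(0, `prior_var`) prior on the component means; both are illustrative choices not stated in the abstract, and the function names are hypothetical. Because enumeration is only feasible for a handful of points, this illustrates the quantity under study and cannot by itself demonstrate the asymptotic result.

```python
import math
import random


def set_partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # `first` forms its own block ...
        yield [[first]] + part
        # ... or joins one of the existing blocks.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def log_marginal(block, prior_var=1.0):
    """Log marginal likelihood of one block: x_i | theta ~ N(theta, 1),
    theta ~ N(0, prior_var) (the prior is an assumption of this sketch)."""
    m = len(block)
    s = sum(block)
    ss = sum(v * v for v in block)
    denom = 1.0 + m * prior_var
    quad = ss - prior_var * s * s / denom
    return -0.5 * m * math.log(2 * math.pi) - 0.5 * math.log(denom) - 0.5 * quad


def cluster_count_posterior(x, alpha=1.0, prior_var=1.0):
    """Return [P(T=1 | x), ..., P(T=n | x)] under the DPM partition posterior,
    p(C | x) proportional to alpha^t * prod_c (|c|-1)! * m(x_c)."""
    n = len(x)
    log_terms = [[] for _ in range(n)]
    for part in set_partitions(list(range(n))):
        t = len(part)
        lw = t * math.log(alpha)
        for c in part:
            # lgamma(|c|) = log((|c|-1)!)
            lw += math.lgamma(len(c)) + log_marginal([x[i] for i in c], prior_var)
        log_terms[t - 1].append(lw)
    # Normalize with a log-sum-exp shift for numerical stability.
    top = max(v for terms in log_terms for v in terms)
    weights = [sum(math.exp(v - top) for v in terms) for terms in log_terms]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    random.seed(0)
    data = [random.gauss(0.0, 1.0) for _ in range(7)]  # one standard normal component
    for t, p in enumerate(cluster_count_posterior(data), start=1):
        print(f"P(T={t} | x) = {p:.4f}")
```

The enumeration visits Bell(n) partitions, so it is practical only for roughly n ≤ 10; studying larger samples would require a sampler or the analytical argument given in the paper.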
Dec-31-2013