dpmg
The Infinite Mixture of Infinite Gaussian Mixtures
Halid Z. Yerebakan, Bartek Rajwa, Murat Dundar
Dirichlet process mixture of Gaussians (DPMG) has been used in the literature for clustering and density estimation problems. However, many real-world data exhibit cluster distributions that cannot be captured by a single Gaussian. Modeling such data sets by DPMG creates several extraneous clusters even when clusters are relatively well-defined. Herein, we present the infinite mixture of infinite Gaussian mixtures (I2GMM) for more flexible modeling of data sets with skewed and multi-modal cluster distributions. Instead of using a single Gaussian for each cluster as in the standard DPMG model, the generative model of I2GMM uses a single DPMG for each cluster. The individual DPMGs are linked together through centering of their base distributions at the atoms of a higher level DP prior. Inference is performed by a collapsed Gibbs sampler that also enables partial parallelization. Experimental results on several artificial and real-world data sets suggest the proposed I2GMM model can predict clusters more accurately than existing variational Bayes and Gibbs sampler versions of DPMG.
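The two-layer generative idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Chinese restaurant process representation of both DP layers, isotropic Gaussians with fixed spreads in place of learned covariances, and hypothetical parameter names (`gamma`, `alpha`, `cluster_spread`, `component_spread`, `noise`) chosen only for illustration. Each new cluster draws a center from the upper-level base distribution, and each new component within a cluster draws its mean from a Gaussian centered at that cluster's center.

```python
import numpy as np


def crp_assign(counts, concentration, rng):
    """Return an existing index with probability proportional to its count,
    or len(counts) (a new index) with probability proportional to concentration."""
    weights = np.array(counts + [concentration], dtype=float)
    return int(rng.choice(len(weights), p=weights / weights.sum()))


def sample_two_level_mixture(n, gamma=1.0, alpha=1.0, dim=2,
                             cluster_spread=10.0, component_spread=2.0,
                             noise=0.5, seed=0):
    """Draw n points from a simplified two-level DP mixture (illustrative only)."""
    rng = np.random.default_rng(seed)
    cluster_counts = []   # points per cluster (upper-level DP)
    cluster_centers = []  # one center per cluster (atoms of the upper-level DP)
    comp_counts = []      # per cluster: points per component (lower-level DP)
    comp_means = []       # per cluster: component means
    X = np.empty((n, dim))
    labels = np.empty((n, 2), dtype=int)  # columns: (cluster, component)

    for i in range(n):
        # Upper level: choose a cluster, possibly opening a new one.
        k = crp_assign(cluster_counts, gamma, rng)
        if k == len(cluster_counts):
            cluster_counts.append(0)
            cluster_centers.append(rng.normal(0.0, cluster_spread, size=dim))
            comp_counts.append([])
            comp_means.append([])
        cluster_counts[k] += 1

        # Lower level: choose a component inside cluster k. New component
        # means are drawn from a Gaussian centered at the cluster center.
        l = crp_assign(comp_counts[k], alpha, rng)
        if l == len(comp_counts[k]):
            comp_counts[k].append(0)
            comp_means[k].append(rng.normal(cluster_centers[k], component_spread))
        comp_counts[k][l] += 1

        # Observation: Gaussian noise around the chosen component mean.
        X[i] = rng.normal(comp_means[k][l], noise)
        labels[i] = (k, l)
    return X, labels


X, labels = sample_two_level_mixture(200)
```

Points sharing the first label column belong to the same cluster even when they come from different Gaussian components, which is how a single cluster can be skewed or multi-modal in this kind of model.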
A Semiparametric Bayesian Extreme Value Model Using a Dirichlet Process Mixture of Gamma Densities
In recent years, extreme value mixture models have been proposed that combine a distribution for the "bulk part" below a threshold with a generalized Pareto distribution (GPD) in the tail. Different distributions have been proposed for modelling the "bulk part", with the threshold treated as a parameter to be estimated. The first approach that allows a transition between the bulk and tail parts is provided by Frigessi, Haug & Harvard (2003). Frigessi et al. (2003) use a Weibull distribution in the bulk part, a GPD for the tail, and the location-scale Cauchy cdf as the transition function, and estimate the model by maximum likelihood. However, in the Frigessi et al. (2003) approach, maximum likelihood estimation in the bulk part could produce multiple modes and hence some identifiability problems. Behrens, Lopes & Gamerman (2004) and Carreau & Bengio (2009) consider Gamma and Normal distributions, respectively, in the bulk part.
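As a sketch of the threshold-based construction described above (not of the Frigessi et al. transition-function variant), one common way to write such a density is given below. The symbols are assumptions introduced here for illustration: $h$ and $H$ are the bulk density and cdf, $u$ is the threshold, and $\sigma > 0$ and $\xi \neq 0$ are the GPD scale and shape.

```latex
f(x \mid u, \sigma, \xi) =
\begin{cases}
  h(x), & x \le u, \\[4pt]
  \bigl(1 - H(u)\bigr)\, g(x \mid u, \sigma, \xi), & x > u,
\end{cases}
\qquad
g(x \mid u, \sigma, \xi) = \frac{1}{\sigma}
  \left(1 + \xi\,\frac{x - u}{\sigma}\right)^{-\frac{1 + \xi}{\xi}} .
```

The factor $1 - H(u)$ rescales the tail so that the whole density integrates to one: the bulk contributes mass $H(u)$ below the threshold and the GPD contributes the remaining mass above it.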