Collaborating Authors: kingman



Binary to Bushy: Bayesian Hierarchical Clustering with the Beta Coalescent

Hu, Yuening, Boyd-Graber, Jordan, Daumé III, Hal, Ying, Z. Irene

Neural Information Processing Systems

Discovering hierarchical regularities in data is a key problem in interacting with large datasets, modeling cognition, and encoding knowledge. A previous Bayesian solution---Kingman's coalescent---provides a convenient probabilistic model for data represented as a binary tree. Unfortunately, this is inappropriate for data better described by bushier trees. We generalize an existing belief propagation framework of Kingman's coalescent to the beta coalescent, which models a wider range of tree structures. Because of the complex combinatorial search over possible structures, we develop new sampling schemes using sequential Monte Carlo and Dirichlet process mixture models, which render inference efficient and tractable. We present results on synthetic and real data showing that the beta coalescent outperforms Kingman's coalescent and is qualitatively better at capturing data in bushy hierarchies.
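The contrast between Kingman's strictly pairwise merges and the beta coalescent's multiple ("bushy") mergers can be sketched in a few lines. The following is an illustrative simulation, not the authors' inference code: it uses the standard Beta(2 − α, α) merger rates λ_{b,k} = B(k − α, b − k + α) / B(2 − α, α), with α assumed to lie in (1, 2).

```python
import math
import random

def beta_fn(x, y):
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def merger_rate(b, k, alpha):
    # Rate for one particular set of k out of b lineages to merge under the
    # Beta(2 - alpha, alpha) coalescent (alpha in (1, 2)); as alpha -> 2 the
    # k = 2 rates dominate and the process behaves like Kingman's coalescent.
    return beta_fn(k - alpha, b - k + alpha) / beta_fn(2 - alpha, alpha)

def simulate(n, alpha, seed=0):
    """Return the merge history [(time, merged_block), ...] for n leaves."""
    rng = random.Random(seed)
    blocks = [frozenset([i]) for i in range(n)]
    history, t = [], 0.0
    while len(blocks) > 1:
        b = len(blocks)
        # total rate of a k-merge = (number of k-subsets) * per-subset rate
        totals = {k: math.comb(b, k) * merger_rate(b, k, alpha)
                  for k in range(2, b + 1)}
        total = sum(totals.values())
        t += rng.expovariate(total)
        r = rng.random() * total            # pick merge size k by its weight
        for k, w in totals.items():
            r -= w
            if r <= 0:
                break
        idx = set(rng.sample(range(b), k))  # uniform k-subset of blocks
        merged = frozenset().union(*(blocks[i] for i in idx))
        blocks = [blk for i, blk in enumerate(blocks) if i not in idx]
        blocks.append(merged)
        history.append((t, merged))
    return history
```

With α near 2 almost every event merges exactly two blocks, giving a binary tree; smaller α produces larger mergers and hence bushier hierarchies.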


The Time-Marginalized Coalescent Prior for Hierarchical Clustering

Boyles, Levi, Welling, Max

Neural Information Processing Systems

We introduce a new prior for use in nonparametric Bayesian hierarchical clustering. The prior is constructed by marginalizing out the time information of Kingman's coalescent, providing a prior over tree structures which we call the Time-Marginalized Coalescent (TMC). This allows for models which factorize the tree structure and times, providing two benefits: more flexible priors may be constructed, and more efficient Gibbs-type inference can be used. We demonstrate this on an example model for density estimation and show that the TMC achieves competitive experimental results.
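A convenient property behind this marginalization: under Kingman's coalescent, the pair that merges at each step is uniform among the surviving lineages, independently of the merge times, so integrating out times leaves a simple pair-merging process over topologies. A minimal illustrative sampler (nested tuples as the tree; not the authors' code):

```python
import random

def sample_topology(n, seed=0):
    # Merge a uniformly chosen pair at each step; this is Kingman's
    # coalescent with the (independent) times marginalized out.
    rng = random.Random(seed)
    nodes = list(range(n))            # leaves labelled 0..n-1
    while len(nodes) > 1:
        i, j = sorted(rng.sample(range(len(nodes)), 2), reverse=True)
        a, b = nodes.pop(i), nodes.pop(j)   # pop larger index first
        nodes.append((a, b))
    return nodes[0]

def leaves(tree):
    """Collect the leaf labels of a nested-tuple tree."""
    if isinstance(tree, int):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])
```

Because times can then be attached to any such topology under a separate prior, structure and times factorize, which is what enables the more flexible priors and Gibbs-type moves described above.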


Inference in Kingman's Coalescent with Particle Markov Chain Monte Carlo Method

Chen, Yifei, Xie, Xiaohui

arXiv.org Machine Learning

March 22, 2018

Abstract: We propose a new algorithm for posterior sampling of Kingman's coalescent, based on the Particle Markov Chain Monte Carlo methodology. Specifically, the algorithm is an instantiation of the Particle Gibbs sampling method, which alternately samples coalescent times conditioned on coalescent tree structures, and tree structures conditioned on coalescent times via the conditional Sequential Monte Carlo procedure. We implement our algorithm as a C package and demonstrate its utility on a parameter estimation task in population genetics, using both single- and multiple-locus data. The experimental results show that the proposed algorithm performs comparably to or better than several well-developed methods.

1 Introduction: Data exhibit hierarchical structure in many domains. For example, computer vision problems often involve hierarchical representations of images [Lee et al., 2009]. In text mining, documents can be modeled as hierarchical generative processes [Blei et al., 2003, Teh et al., 2006]. Algorithms that can effectively handle hierarchical structure play an important role in uncovering the intrinsic structure of data.
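The conditional SMC step the abstract describes (resampling all particles while one fixed reference trajectory is kept alive) can be illustrated on a toy problem. The sketch below substitutes a hypothetical Gaussian random-walk state-space model for the coalescent, purely to show the Particle Gibbs mechanics: pin the reference at particle index 0, resample the rest, then trace one ancestry back.

```python
import math
import random

def normalize(logw):
    """Exponentiate-and-normalize log weights."""
    m = max(logw)
    w = [math.exp(l - m) for l in logw]
    s = sum(w)
    return [x / s for x in w]

def categorical(w, rng):
    """Draw an index with probabilities w."""
    r = rng.random()
    for i, wi in enumerate(w):
        r -= wi
        if r <= 0:
            return i
    return len(w) - 1

def conditional_smc(y, ref, N=50, seed=0):
    """One conditional SMC sweep for the toy model x_t ~ N(x_{t-1}, 1),
    y_t ~ N(x_t, 1).  The reference trajectory `ref` is pinned at particle
    index 0 and survives every resampling step, as in Particle Gibbs."""
    rng = random.Random(seed)
    T = len(y)
    X = [[0.0] * N for _ in range(T)]   # particle states
    A = [[0] * N for _ in range(T)]     # ancestor indices
    X[0] = [ref[0]] + [rng.gauss(0.0, 1.0) for _ in range(N - 1)]
    logw = [-0.5 * (y[0] - x) ** 2 for x in X[0]]
    for t in range(1, T):
        w = normalize(logw)
        X[t][0], A[t][0] = ref[t], 0    # reference keeps its own ancestry
        for i in range(1, N):
            a = categorical(w, rng)     # multinomial resampling
            A[t][i] = a
            X[t][i] = rng.gauss(X[t - 1][a], 1.0)
        logw = [-0.5 * (y[t] - x) ** 2 for x in X[t]]
    # return one trajectory drawn from the final weighted particle set
    k = categorical(normalize(logw), rng)
    traj = [0.0] * T
    for t in range(T - 1, -1, -1):
        traj[t] = X[t][k]
        k = A[t][k]
    return traj

# Particle Gibbs outer loop: each sweep conditions on the previous draw.
y = [0.4, 1.2, 0.8, -0.3, 0.1]
traj = [0.0] * len(y)
for sweep in range(5):
    traj = conditional_smc(y, traj, N=50, seed=sweep)
```

In the paper's setting the state sequence would be the coalescent tree structure, alternated with a Gibbs update of the coalescent times; here both the model and the variable names are stand-ins.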


The Infinite Hierarchical Factor Regression Model

Rai, Piyush, Daumé III, Hal

Neural Information Processing Systems

We propose a nonparametric Bayesian factor regression model that accounts for uncertainty in the number of factors and in the relationships between factors. To accomplish this, we propose a sparse variant of the Indian Buffet Process and couple this with a hierarchical model over factors, based on Kingman's coalescent. We apply this model to two problems (factor analysis and factor regression) in gene-expression data analysis.
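For context, the standard Indian Buffet Process that the sparse variant builds on can be sampled with a short loop: customer i takes each existing dish k with probability m_k / i (where m_k is the dish's current popularity), then tries Poisson(α / i) new dishes. This shows only the base construction; the paper's sparse variant and the coalescent prior over factors are not reproduced here.

```python
import math
import random

def poisson(lam, rng):
    # Knuth's inversion-by-products method; adequate for small lam
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def sample_ibp(n_customers, alpha, seed=0):
    """Sample a binary feature (customer x dish) matrix from the standard IBP."""
    rng = random.Random(seed)
    dish_counts = []    # how many customers have taken each dish so far
    rows = []
    for i in range(1, n_customers + 1):
        # existing dish k is taken with probability m_k / i
        row = [1 if rng.random() < m / i else 0 for m in dish_counts]
        for k, z in enumerate(row):
            dish_counts[k] += z
        n_new = poisson(alpha / i, rng)   # brand-new dishes for customer i
        row += [1] * n_new
        dish_counts += [1] * n_new
        rows.append(row)
    # pad earlier rows with zeros for dishes introduced later
    width = len(dish_counts)
    return [r + [0] * (width - len(r)) for r in rows]
```

In the factor-model reading, rows are samples, columns are factors, and the unbounded number of columns is what lets the model express uncertainty in the number of factors.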