coalescent
e5f6ad6ce374177eef023bf5d0c018b6-Reviews.html
This paper develops a model for multifurcating trees with edge lengths and observed data at the tree leaves; the model is based on the beta coalescent from the probability literature. The authors develop an MCMC inference scheme for their model, in which they draw on existing work that uses belief propagation to perform inference for the Kingman coalescent (an edge case of the beta coalescent in which all trees are binary). The particular challenge for inference here is that there are many more possible parent-child node relationships when parents can have multiple children (not just two). The authors seem to use a Dirichlet Process mixture model (DPMM) at each node to narrow down the space of possible children subsets to consider. As the authors note, even inference with the Kingman coalescent is a hard problem. In experiments, they compare to the Kingman coalescent and hierarchical agglomerative clustering. The Kingman coalescent is a popular modeling tool, so it is great to see a practical extension of the Kingman coalescent to the multifurcating case being explored for inference.
- Summary/Review (0.48)
- Research Report (0.46)
- Overview (0.35)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Maryland (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Health & Medicine (0.69)
- Food & Agriculture (0.46)
Explaining Indian Stock Market through Geometry of Scale free Networks
Yadav, Pawanesh, Sharma, Charu, Sahni, Niteesh
This paper presents an analysis of the Indian stock market using a method based on embedding the network in a hyperbolic space using machine learning techniques. We claim novelty on four counts. First, it is demonstrated that the hyperbolic clusters resemble the topological network communities more closely than the Euclidean clusters. Second, we are able to clearly distinguish between periods of market stability and volatility through a statistical analysis of hyperbolic distance and hyperbolic shortest path distance corresponding to the embedded network. Third, we demonstrate that, using the modularity of the embedded network, significant market changes can be spotted early. Lastly, the coalescent embedding is able to segregate certain market sectors, thereby underscoring its natural clustering ability.
- Asia > India > Uttar Pradesh (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
The Time-Marginalized Coalescent Prior for Hierarchical Clustering
We introduce a new prior for use in Nonparametric Bayesian Hierarchical Clustering. The prior is constructed by marginalizing out the time information of Kingman's coalescent, providing a prior over tree structures which we call the Time-Marginalized Coalescent (TMC). This allows for models which factorize the tree structure and times, providing two benefits: more flexible priors may be constructed and more efficient Gibbs type inference can be used. We demonstrate this on an example model for density estimation and show the TMC achieves competitive experimental results.
- North America > United States > California > Orange County > Irvine (0.14)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.86)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
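The factorization of tree structure and times mentioned in the abstract above can be seen in the generative process of Kingman's coalescent itself: with k lineages remaining, the waiting time to the next merge is exponential with rate k(k-1)/2, and the merging pair is chosen uniformly at random, independently of the times. The following is a minimal sketch of that standard process, not the paper's TMC inference code; the function name `sample_kingman` and the event representation are illustrative assumptions.

```python
import random


def sample_kingman(n, rng=None):
    """Sample the merge events of Kingman's coalescent on n leaves.

    Returns a list of (time, block_a, block_b) tuples, where time is
    measured backward from the leaves and blocks are frozensets of
    leaf indices. (Hypothetical helper for illustration only.)
    """
    rng = rng or random.Random()
    blocks = [frozenset([i]) for i in range(n)]
    t = 0.0
    events = []
    while len(blocks) > 1:
        k = len(blocks)
        # Each of the k*(k-1)/2 pairs merges at rate 1.
        rate = k * (k - 1) / 2.0
        t += rng.expovariate(rate)
        # The merging pair is uniform over pairs, independent of the time.
        i, j = rng.sample(range(k), 2)
        a, b = blocks[i], blocks[j]
        blocks = [blk for idx, blk in enumerate(blocks) if idx not in (i, j)]
        blocks.append(a | b)
        events.append((t, a, b))
    return events
```

Dropping the `time` field from each event leaves only the sequence of pair choices, which is the kind of structure-only object a time-marginalized prior is defined over.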
Binary to Bushy: Bayesian Hierarchical Clustering with the Beta Coalescent
Boyd-Graber, Jordan, Daumé III, Hal, Ying, Z. Irene
Discovering hierarchical regularities in data is a key problem in interacting with large datasets, modeling cognition, and encoding knowledge. A previous Bayesian solution, Kingman's coalescent, provides a probabilistic model for data represented as a binary tree. Unfortunately, this is inappropriate for data better described by bushier trees. We generalize an existing belief propagation framework of Kingman's coalescent to the beta coalescent, which models a wider range of tree structures. Because of the complex combinatorial search over possible structures, we develop new sampling schemes using sequential Monte Carlo and Dirichlet process mixture models, which render inference efficient and tractable. We present results on synthetic and real data that show the beta coalescent outperforms Kingman's coalescent and is qualitatively better at capturing data in bushy hierarchies.
- Asia > Middle East > Jordan (0.40)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Maryland (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
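To make the "binary versus bushy" contrast in the abstract above concrete, here is a small sketch of how merge sizes are distributed under a beta coalescent. It assumes the common Beta(2 - alpha, alpha) parametrization with 0 < alpha < 2, in which any particular group of k out of n lineages merges at rate B(k - alpha, n - k + alpha) / B(2 - alpha, alpha); the paper may use a different convention, and the function names are illustrative.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b) for a, b > 0."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def merge_rate(n, k, alpha):
    """Rate at which one particular group of k out of n lineages merges.

    Assumes the Beta(2 - alpha, alpha) coalescent with 0 < alpha < 2.
    """
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    if not 0.0 < alpha < 2.0:
        raise ValueError("need 0 < alpha < 2")
    return math.exp(log_beta(k - alpha, n - k + alpha)
                    - log_beta(2.0 - alpha, alpha))


def merge_size_distribution(n, alpha):
    """Probability that the next merge joins exactly k lineages, k = 2..n.

    Also returns the total merge rate, which sets the waiting time.
    """
    weights = {k: math.comb(n, k) * merge_rate(n, k, alpha)
               for k in range(2, n + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}, total
```

As alpha approaches 2, almost all of the probability mass falls on k = 2, which recovers the binary merges of Kingman's coalescent; smaller alpha puts more mass on larger k, giving bushier trees.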
Bayesian Agglomerative Clustering with Coalescents
We introduce a new Bayesian model for hierarchical clustering based on a prior over trees called Kingman's coalescent. We develop novel greedy and sequential Monte Carlo inferences which operate in a bottom-up agglomerative fashion. We show experimentally the superiority of our algorithms over the state-of-the-art, and demonstrate our approach in document clustering and phylolinguistics.
Binary to Bushy: Bayesian Hierarchical Clustering with the Beta Coalescent
Hu, Yuening, Boyd-Graber, Jordan L., Daumé III, Hal, Ying, Z. Irene
Discovering hierarchical regularities in data is a key problem in interacting with large datasets, modeling cognition, and encoding knowledge. A previous Bayesian solution, Kingman's coalescent, provides a convenient probabilistic model for data represented as a binary tree. Unfortunately, this is inappropriate for data better described by bushier trees. We generalize an existing belief propagation framework of Kingman's coalescent to the beta coalescent, which models a wider range of tree structures. Because of the complex combinatorial search over possible structures, we develop new sampling schemes using sequential Monte Carlo and Dirichlet process mixture models, which render inference efficient and tractable.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.40)
Angular separability of data clusters or network communities in geometrical space and its relevance to hyperbolic embedding
Muscoloni, Alessandro, Cannistraci, Carlo Vittorio
Analysis of 'big data' characterized by high-dimensionality, such as word vectors and complex networks, often requires their representation in a geometrical space by embedding. Recent developments in machine learning and network geometry have pointed to the hyperbolic space as a useful framework for the representation of such data derived from real complex physical systems. In the hyperbolic space, the radial coordinate of the nodes characterizes their hierarchy, whereas the angular distance between them represents their similarity. Several studies have highlighted the relationship between the angular coordinates of the nodes embedded in the hyperbolic space and the community metadata available. However, such analyses have often been limited to a visual or qualitative assessment. Here, we introduce the angular separation index (ASI) to quantitatively evaluate the separation of node network communities or data clusters over the angular coordinates of a geometrical space. ASI is particularly useful in the hyperbolic space, where it is extensively tested throughout this study, but can be used in general for any assessment of angular separation regardless of the adopted geometry. ASI is proposed together with an exact test statistic based on a uniformly random null model to assess the statistical significance of the separation. We show that ASI allows the discovery of two significant phenomena in network geometry. The first is that the increase of temperature in 2D hyperbolic network generative models not only reduces the network clustering but also induces a 'dimensionality jump' of the network to dimensions higher than two. The second is that ASI can be successfully applied to detect the intrinsic dimensionality of network structures that grow in a hidden geometrical space.
- North America > United States (0.28)
- Europe > Germany > Saxony > Dresden (0.04)
- Asia > China > Beijing > Beijing (0.04)
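The abstract above describes nodes by a radial coordinate (hierarchy) and an angular coordinate (similarity). One standard way to combine the two into a single distance is the hyperbolic law of cosines in the native polar representation of the hyperbolic plane. The sketch below assumes curvature -1 and is only an illustration of that formula; it is not the ASI computation, and the function name is an assumption.

```python
import math


def hyperbolic_distance(r1, theta1, r2, theta2):
    """Distance between two points given in native polar coordinates
    (r, theta) of the hyperbolic plane with curvature -1.

    Uses cosh(d) = cosh(r1)cosh(r2) - sinh(r1)sinh(r2)cos(dtheta).
    """
    # Angular separation folded into [0, pi].
    raw = abs(theta1 - theta2) % (2.0 * math.pi)
    dtheta = math.pi - abs(math.pi - raw)
    arg = (math.cosh(r1) * math.cosh(r2)
           - math.sinh(r1) * math.sinh(r2) * math.cos(dtheta))
    # Guard against rounding pushing the argument slightly below 1.
    return math.acosh(max(1.0, arg))
```

Two nodes at the same angle are separated only by their radial difference, while two nodes on opposite sides of the disk are separated by roughly the sum of their radii, which is why angular closeness dominates similarity for nodes far from the origin.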
Machine learning meets network science: dimensionality reduction for fast and efficient embedding of networks in the hyperbolic space
Thomas, Josephine Maria, Muscoloni, Alessandro, Ciucci, Sara, Bianconi, Ginestra, Cannistraci, Carlo Vittorio
Complex network topologies and hyperbolic geometry seem specularly connected, and one of the most fascinating and challenging problems of recent complex network theory is to map a given network to its hyperbolic space. The Popularity Similarity Optimization (PSO) model represents - at the moment - the climax of this theory. It suggests that the trade-off between node popularity and similarity is a mechanism to explain how complex network topologies emerge - as discrete samples - from the continuous world of hyperbolic geometry. The hyperbolic space seems appropriate to represent real complex networks. In fact, it preserves many of their fundamental topological properties, and can be exploited for real applications such as, among others, link prediction and community detection. Here, we observe for the first time that a topological-based machine learning class of algorithms - for nonlinear unsupervised dimensionality reduction - can directly approximate the network's node angular coordinates of the hyperbolic model into a two-dimensional space, according to a similar topological organization that we named angular coalescence. On the basis of this phenomenon, we propose a new class of algorithms that offers fast and accurate coalescent embedding of networks in the hyperbolic space even for graphs with thousands of nodes.
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Germany > Saxony > Dresden (0.04)