Expected Probabilistic Hierarchies

Neural Information Processing Systems 

Hierarchical clustering has usually been addressed either by discrete optimization using heuristics or by continuous optimization of relaxed scores for hierarchies. In this work, we propose Expected Probabilistic Hierarchies (EPH), which instead optimizes expected scores under a probabilistic model over hierarchies. EPH uses differentiable hierarchy sampling, which enables end-to-end gradient-descent-based optimization, and an unbiased subgraph sampling approach that lets it scale to large datasets. EPH outperforms the other approaches in quantitative evaluations and provides meaningful hierarchies in qualitative evaluations.