Reviews: Hierarchical Clustering Beyond the Worst-Case

Neural Information Processing Systems 

This paper studies the problem of hierarchical clustering in a beyond-the-worst-case setting, where the data is a random graph generated by a Hierarchical Stochastic Block Model (HSBM). It proposes an SVD linkage algorithm and proves that in certain regimes it returns a solution that is a constant approximation to the cost function of hierarchical clustering proposed by Dasgupta. It also considers an algorithm that combines the SDP relaxation approach and recursive sparsest cut approach in previous works to get a constant approximation in larger regimes. Roughly speaking, the HSBM model considered assumes some underlying tree T* over k leaves (each leaf corresponding to a bottom cluster containing a certain number of vertices). Each node in the tree has some weight in (0,1) and a node's weight should not be larger than its descendant's.