Hierarchical Clustering with Structural Constraints

Vaggos Chatziafratis, Rad Niazadeh, Moses Charikar

arXiv.org Artificial Intelligence 

Hierarchical clustering (HC) is a widely used data analysis tool, ubiquitous in information retrieval, data mining, and machine learning (see the survey by Berkhin [2006]). This clustering technique represents a given dataset as a binary tree: each leaf represents an individual data point and each internal node represents a cluster on the leaves of its descendants. HC has become the most popular method for gene expression data analysis Eisen et al. [1998], and has also been used in the analysis of social networks Leskovec et al. [2014], Mann et al. [2008], bioinformatics Diez et al. [2015], image and text classification Steinbach et al. [2000], and even the analysis of financial markets Tumminello et al. [2010]. It is attractive because, compared to more traditional flat clustering approaches like k-means or k-median, it provides richer information at all levels of granularity simultaneously. Recently, Dasgupta [2016] formulated HC as a combinatorial optimization problem, giving a principled way to compare the performance of different HC algorithms. This optimization viewpoint has since received a lot of attention Roy and Pokutta [2016], Charikar and Chatziafratis [2017], Cohen-Addad et al. [2017], Moseley and Wang [2017], Cohen-Addad et al. [2018], which has led not only to the development of new algorithms but also to theoretical justifications for the observed success of popular HC algorithms.
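As a rough illustration of Dasgupta's optimization framing (a sketch under our own naming, not code from the paper): a candidate tree T is scored by cost(T) = Σ_{i,j} w_ij · |leaves(T[i ∨ j])|, where T[i ∨ j] is the subtree rooted at the lowest common ancestor of leaves i and j. Trees that merge similar points deep in the tree (small LCA subtrees) incur lower cost.

```python
# Sketch of Dasgupta's cost for hierarchical clustering (illustrative only).
# A tree is a nested tuple: a leaf is an int, an internal node a pair (left, right).

def leaves(tree):
    """Return the set of leaf labels under a (sub)tree."""
    if isinstance(tree, int):
        return {tree}
    left, right = tree
    return leaves(left) | leaves(right)

def dasgupta_cost(tree, w):
    """cost(T) = sum over pairs (i, j) of w[i, j] * |leaves(T[i v j])|,
    where T[i v j] is the subtree at the lowest common ancestor of i and j."""
    if isinstance(tree, int):
        return 0.0
    left, right = tree
    ll, rl = leaves(left), leaves(right)
    n = len(ll) + len(rl)
    # Pairs separated at this node have their LCA here, a subtree with n leaves.
    split = sum(w.get((min(i, j), max(i, j)), 0.0) for i in ll for j in rl)
    return n * split + dasgupta_cost(left, w) + dasgupta_cost(right, w)

# Toy similarity graph on 4 points: 0-1 and 2-3 are highly similar.
w = {(0, 1): 1.0, (2, 3): 1.0, (0, 2): 0.1}
good = ((0, 1), (2, 3))  # merges the similar pairs low in the tree
bad = ((0, 2), (1, 3))   # splits the similar pairs at the root
# The "good" tree should achieve a strictly lower cost than the "bad" one.
print(dasgupta_cost(good, w), dasgupta_cost(bad, w))
```

Minimizing this objective over all binary trees is the combinatorial problem; the approximation algorithms cited above compete against its optimum.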
