Unsupervised Embedding of Hierarchical Structure in Euclidean Space

Oct-29-2020–arXiv.org Machine Learning

Hierarchical clustering aims to find an iterative grouping of increasingly large subsets of data, terminating when one cluster contains all of the data. This results in a tree structure, where each level determines a partition of the data. Organizing the data in such a way has many applications, including automated taxonomy construction and phylogeny reconstruction [6, 26, 54, 69, 87]. Motivated by these applications, we address the simultaneous goals of recovering an underlying clustering and building a hierarchical tree of the entire dataset. We imagine that we have access to many samples from each individual cluster (e.g., images of plants and animals), while we lack labels or any direct information about the hierarchical relationships. In this scenario, our objective is to correctly cluster the samples in each group and also build a dendrogram within and on top of the clusters that matches the natural supergroups. Despite significant effort, hierarchical clustering remains a challenging task, theoretically and empirically. Many objectives are NP-Hard [13, 14, 18, 19, 35, 59], and many theoretical algorithms may not be practical because they require solving a hard subproblem (e.g., sparsest cut) at each step [2, 4, 13, 14, 19, 51].

artificial intelligence, machine learning, ward, (17 more...)

arXiv.org Machine Learning

Oct-29-2020

arXiv.org PDF

Add feedback

Country:
- North America
  - United States
    - New York > New York County
      - New York City (0.04)
    - California > San Diego County
      - San Diego (0.04)
  - Canada > Ontario
    - Toronto (0.14)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia
  - Middle East > Israel (0.04)
  - Afghanistan > Parwan Province
    - Charikar (0.04)

Genre:
- Research Report (0.64)
- Workflow (0.54)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning > Clustering (1.00)
  - Neural Networks (0.94)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found