Data ultrametricity and clusterability

Aug-28-2019–arXiv.org Machine Learning

Clustering is the prototypical unsupervised learning activity: it consists of identifying cohesive and well-differentiated groups of records in data. A data set is clusterable if such groups exist; however, because data distributions vary widely and certain basic notions of clustering are inadequately formalized, determining whether data is clusterable before applying a specific clustering algorithm is a difficult task. Such an evaluation can be very helpful because clustering algorithms are expensive. However, many such evaluations are impractical because they are NP-hard, as shown in [4]. Other notions define data as clusterable when the minimum between-cluster separation is greater than the maximum intra-cluster distance [13], or when each element is closer to all elements in its cluster than to all other data [7].
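The two separation-based notions mentioned at the end of the abstract can be made concrete with a small sketch. This is only an illustration under stated assumptions: the function names are hypothetical, Euclidean distance is assumed, and a candidate partition is supplied by the caller. It is not the ultrametricity-based method of the paper, and it checks a given labeling rather than searching for one.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X (assumed metric)."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def separation_exceeds_diameter(X, labels):
    """Notion attributed to [13] in the text: the minimum between-cluster
    distance is greater than the maximum intra-cluster distance."""
    D = pairwise_distances(X)
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    off_diagonal = ~np.eye(len(labels), dtype=bool)
    within = D[same & off_diagonal]
    between = D[~same]
    if between.size == 0:
        return False  # a single cluster has no between-cluster separation
    max_within = within.max() if within.size else 0.0
    return bool(between.min() > max_within)


def closer_to_own_cluster(X, labels):
    """Notion attributed to [7] in the text: every element is closer to all
    elements of its own cluster than to any element outside it."""
    D = pairwise_distances(X)
    labels = np.asarray(labels)
    for i in range(len(labels)):
        own = labels == labels[i]
        own[i] = False  # exclude the point itself
        other = labels != labels[i]
        if own.any() and other.any() and D[i, own].max() >= D[i, other].min():
            return False
    return True


# Hypothetical toy data: two tight, well-separated groups in the plane.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
good_labels = [0, 0, 0, 1, 1, 1]
bad_labels = [0, 1, 0, 1, 0, 1]

print(separation_exceeds_diameter(X, good_labels))
print(closer_to_own_cluster(X, good_labels))
print(separation_exceeds_diameter(X, bad_labels))
```

Both checks cost quadratic time in the number of records once a partition is fixed; the hardness noted in the abstract concerns deciding clusterability without a partition in hand.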

clusterability, data mining, machine learning

Country:
- North America > Canada > British Columbia (0.29)

Genre:
- Research Report (0.82)

Technology:
- Information Technology
  - Data Science > Data Mining (1.00)
  - Artificial Intelligence > Machine Learning
    - Statistical Learning > Clustering (0.75)
