Parameter Free Clustering with Cluster Catch Digraphs (Technical Report)

Dec-26-2019–arXiv.org Machine Learning

Clustering is one of the most challenging tasks in machine learning and pattern recognition, and discovering the exact number of clusters in an unlabelled data set is perhaps its most difficult part. Many clustering methods find the clusters (or hidden classes) and the number of these clusters simultaneously (Frey and Dueck, 2007; Sajana et al., 2016). Although methods exist for validating and comparing the quality of a partitioning of a data set, algorithms that provide the (estimated) number of clusters without any input parameter remain appealing. However, such algorithms typically rely on other parameters that can be viewed as the intensity, i.e. the expected number of objects in a unit area. The intensity parameter works as a threshold: if the local intensity of the data set exceeds it, this may indicate the existence of a cluster. Choosing such parameters is often difficult, since different values may drastically change the result of the algorithm. We use unsupervised adaptations of a family of vertex random digraphs, namely class cover catch digraphs (CCCDs), which have shown relatively good performance in statistical pattern classification (Manukyan and Ceyhan, 2016; Priebe et al., 2003a). Unsupervised versions of CCCDs are called cluster catch digraphs (CCDs) (DeVinney, 2003; Marchette, 2004). Primarily, CCDs use statistics that require an intensity parameter to be specified or estimated.
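The sensitivity to an intensity threshold described above can be illustrated with a minimal sketch. This is not the CCD algorithm from the report; it is a toy density check under stated assumptions: local intensity is taken as the number of points in a disc divided by the disc area, and the function names (`local_intensity`, `dense_points`), the radius, the thresholds, and the data are all hypothetical choices made for illustration.

```python
import math

def local_intensity(points, center, radius):
    """Points within `radius` of `center` (the center itself included),
    divided by the disc area: an assumed, simple notion of local intensity."""
    inside = sum(1 for p in points if math.dist(p, center) <= radius)
    return inside / (math.pi * radius ** 2)

def dense_points(points, radius, threshold):
    """Points whose local intensity exceeds the user-chosen threshold."""
    return [p for p in points if local_intensity(points, p, radius) > threshold]

# Toy data (hypothetical): a tight group, a looser group, and one isolated point.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # tight group
    (5.0, 5.0), (5.4, 5.0), (5.0, 5.4),              # looser group
    (10.0, 10.0),                                    # isolated point
]

radius = 0.5
for threshold in (1.0, 2.0, 4.0):
    flagged = dense_points(points, radius, threshold)
    print(f"threshold={threshold}: {len(flagged)} of {len(points)} points flagged as dense")
```

With this toy data, raising the threshold from 1.0 to 4.0 shrinks the flagged set from every point to only the tight group, which is the kind of parameter dependence the abstract refers to.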
