Clustering by Nonparametric Smoothing

Mar-12-2025–arXiv.org Machine Learning

A novel formulation of the clustering problem is introduced in which the task is expressed as an estimation problem, where the object to be estimated is a function that maps a point to its distribution of cluster membership. Unlike existing approaches that implicitly estimate such a function, such as Gaussian Mixture Models (GMMs), the proposed approach bypasses any explicit modelling assumptions and exploits the flexible estimation potential of nonparametric smoothing. An intuitive approach for selecting the tuning parameters governing estimation is provided, which allows the proposed method to automatically determine both an appropriate level of flexibility and the number of clusters to extract from a given data set. Experiments on a large collection of publicly available data sets document the strong performance of the proposed approach in comparison with relevant benchmarks from the literature. R code to implement the proposed approach is available from https://github.com/DavidHofmeyr/

I. Introduction

Cluster analysis refers to the task of partitioning a set of data into groups (or clusters) in such a way that points within the same cluster tend to be more similar than points in different clusters.
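To make concrete the idea of a function that maps a point to its distribution of cluster membership, the following is a minimal sketch of a generic kernel smoother (Nadaraya-Watson form) applied to membership vectors. It is not the estimator proposed in the paper: the membership vectors are assumed to be given here, whereas the paper estimates them, and the function names, the Gaussian kernel, the fixed bandwidth, and the toy data are all illustrative assumptions.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel weight between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / bandwidth ** 2)


def smooth_membership(query, points, memberships, bandwidth=1.0):
    """Kernel-weighted average of membership vectors at a query point.

    Hypothetical helper: `memberships[i]` is an assumed probability vector
    over clusters for `points[i]`. The result is again a probability vector
    because it is a convex combination of the inputs.
    """
    weights = [gaussian_kernel(query, p, bandwidth) for p in points]
    total = sum(weights)
    n_clusters = len(memberships[0])
    return [
        sum(w * m[k] for w, m in zip(weights, memberships)) / total
        for k in range(n_clusters)
    ]


# Toy data (assumed): two well-separated groups with hard memberships.
points = [(0.0, 0.0), (0.2, 0.1), (3.0, 3.0), (3.1, 2.9)]
memberships = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]

# Membership distribution at a new point close to the first group.
probs = smooth_membership((0.1, 0.0), points, memberships, bandwidth=0.5)
print(probs)
```

In this sketch the bandwidth plays the role of a flexibility parameter: a small value makes the membership function follow nearby points closely, while a large value averages over the whole data set. For a query point far from all data the kernel weights can underflow to zero, so a practical version would need to guard the division.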



