Level Sets or Gradient Lines? A Unifying View of Modal Clustering
Arias-Castro, Ery; Qiao, Wanli
Up until the 1970s there were two main ways of clustering points in space. One of them, perhaps pioneered by Pearson [44], was to fit a (usually Gaussian) mixture to the data and then classify each data point, as well as any other point available at a later date, according to the most likely component in the mixture. The other was based on a direct partitioning of the space, most notably by minimization of the average minimum squared distance to a center: the K-means problem, whose computational difficulty led to a number of famous algorithms [22, 31, 36, 37, 39] and likely played a role in motivating the development of hierarchical clustering [21, 25, 54, 63]. In the 1970s, two decidedly nonparametric approaches to clustering were proposed, both based on the topography given by the population density. In practice, of course, the density is not known and must be estimated, often by some form of kernel density estimation.
Sep-17-2021
- Country:
- North America > United States
- California (0.14)
- Virginia (0.14)
- Genre:
- Research Report (0.40)
- Technology: