ABeyond-Worst-Case Analysis of Greedy k-means + +

Neural Information Processing Systems 

Greedy k-means++ is a generalization of k-means++ where, in each iteration, a new seed is greedily chosen among multiple ℓ 2points sampled, as opposed to a single seed being sampled in k-means++. While empirical studies consistently show the superior performance of greedy k-means++, making it a preferred method in practice, a discrepancy exists between theory and practice. No theoretical justification currently explains this improved performance. Indeed, the prevailing theory suggests that greedy k-means++ exhibits worse performance than k-means++ in worst-case scenarios. This paper presents an analysis demonstrating the outperformance of the greedy algorithm compared to k-means++ for a natural class of well-separated instances with exponentially decaying distributions, such as Gaussian, specifically when ℓ = lnk +Θ(1), a common parameter setting in practical applications.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found