Reviews: Data-Driven Clustering via Parameterized Lloyd's Families

Neural Information Processing Systems 

The paper proposes a generalization of the k-means algorithm by introducing non-negative parameters alpha and beta. The motivation is that different instances of clustering problems may cluster well under different clustering objectives. The best configuration of alpha and beta selects an optimal algorithm from the proposed family of clustering algorithms. The paper offers several theoretical contributions. It provides guarantees on the number of samples needed for the empirically best parameter setting to yield a clustering cost within epsilon of that of the optimal parameters. It also provides an algorithm that enumerates all possible sets of initial centers for any alpha-interval.
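To make the role of alpha concrete, here is a minimal sketch of a parameterized seeding step. It rests on an assumption not spelled out in the summary above: that alpha controls initialization by sampling each new center with probability proportional to its distance to the nearest already-chosen center, raised to the power alpha. The function name `d_alpha_seeding` and its arguments are hypothetical, and the beta-dependent local search phase is not shown.

```python
import numpy as np

def d_alpha_seeding(points, k, alpha, rng=None):
    """Return indices of k initial centers (illustrative sketch only).

    Assumption: each new center is drawn with probability proportional to
    (distance to the nearest chosen center) ** alpha.
    """
    rng = np.random.default_rng(rng)
    n = len(points)
    # First center: uniform at random.
    centers = [int(rng.integers(n))]
    # dist[i] = distance from point i to its nearest chosen center so far.
    dist = np.linalg.norm(points - points[centers[0]], axis=1)
    for _ in range(k - 1):
        weights = dist ** alpha
        total = weights.sum()
        if total == 0:
            # Every point coincides with a chosen center: fall back to uniform.
            probs = np.full(n, 1.0 / n)
        else:
            probs = weights / total
        nxt = int(rng.choice(n, p=probs))
        centers.append(nxt)
        # Update nearest-center distances with the newly added center.
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return centers
```

Under this assumption, alpha = 0 gives uniform random seeding (which may repeat a point), alpha = 2 corresponds to k-means++ style seeding, and large alpha approaches farthest-first traversal.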