Reviews: Query K-means Clustering and the Double Dixie Cup Problem
–Neural Information Processing Systems
This paper investigates the problem of active-semi-supervised clustering, by considering both noiseless (perfect oracle) and noisy (imperfect oracle) query responses. The authors provide probabilistic guarantees for low approximation errors to the true optimal k-means objective. The corresponding query complexities are substantially lower than in the existing literature. Importantly, as noted by the authors, their query complexity is independent of the size of the dataset. The main strength of the paper lies in the considerable technical rigour with which the subject has been handled.
Neural Information Processing Systems
Oct-7-2024, 04:15:07 GMT