Reviews: Correlation Clustering with Adaptive Similarity Queries

Neural Information Processing Systems 

This paper studies correlation clustering in an active learning setting where the learner does not query for all the {n choose 2} pairs of vertices. In the standard setting, the algorithm KwikCluster of Charikar et al. achieves a clustering cost of 3OPT where OPT is the cost of the best clustering. Here the cost of the clustering is defined as the number of pairs labeled 1 and put in different clusters plus the number of pairs labeled -1 and put in the same cluster. The main contribution of the paper is a variant ACC of KwikCluster that achieves a cost of 3OPT O(n 3/Q) where Q is an upper bound on the number of queries made by the algorithm. The authors also prove a matching lower bound on the error of any randomized algorithm that makes Q queries.