Goto

Collaborating Authors

 diverse element


Linear Relaxations for Finding Diverse Elements in Metric Spaces

Neural Information Processing Systems

Choosing a diverse subset of a large collection of points in a metric space is a fundamental problem, with applications in feature selection, recommender systems, web search, data summarization, etc. Various notions of diversity have been proposed, tailored to different applications. The general algorithmic goal is to find a subset of points that maximize diversity, while obeying a cardinality (or more generally, matroid) constraint. The goal of this paper is to develop a novel linear programming (LP) framework that allows us to design approximation algorithms for such problems. We study an objective known as {\em sum-min} diversity, which is known to be effective in many applications, and give the first constant factor approximation algorithm. Our LP framework allows us to easily incorporate additional constraints, as well as secondary objectives. We also prove a hardness result for two natural diversity objectives, under the so-called {\em planted clique} assumption. Finally, we study the empirical performance of our algorithm on several standard datasets.


Reviews: Linear Relaxations for Finding Diverse Elements in Metric Spaces

Neural Information Processing Systems

Although the provided novel algorithm looks impressive both from the theoretical prospective and in the experimental comparison, its substantiation has quite some room for improvement. The major point is the proof of Theorem 1: - it is unclear how the proof of the theorem follows from Lemmas 3 and 4, since none of these lemmas is related to the optimal solution of the considered diversity problem. I assume that the missing proposition is the one, which would establish connection between the considered linear program in lines 153-154 (by the way, it is very uncomfortable that the main formulation is not numbered and therefore can not be easily referenced) and the diversity problem. I believe that this connection may have the following format: if the linear program is equipped with integrality constraints (which is, all variables x_{ir}\in {0,1}), the resulting ILP is equivalent to the considered diversity problem. Indeed, the proof of such a proposition is not obvious for me as well.