Algorithms and Hardness
Algorithms and Hardness for Learning Linear Thresholds from Label Proportions
We study the learnability of linear threshold functions (LTFs) in the learning from label proportions (LLP) framework. In this framework, the feature-vector classifier is learnt from bags of feature-vectors and their corresponding observed label proportions, which are satisfied by (i.e., consistent with) some unknown LTF. This problem was investigated in recent work (Saket21), which gave an algorithm to produce an LTF satisfying at least a $(2/5)$-fraction of a satisfiable collection of bags, each of size $\leq 2$, by solving and rounding a natural SDP relaxation. However, this SDP relaxation is specific to bags of size at most $2$ and does not apply to bags of larger size. In this work we provide a fairly non-trivial SDP relaxation of a non-quadratic formulation for bags of size $3$.
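To make the consistency requirement concrete, here is a minimal Python sketch (illustrative only, with made-up data and a hypothetical ltf_satisfies_bag helper) of when an LTF satisfies a bag: the fraction of feature-vectors in the bag that the LTF labels positive must equal the bag's observed label proportion.

```python
import numpy as np

def ltf_label(w, b, x):
    """Label of feature-vector x under the LTF sign(w.x + b), mapped to {0, 1}."""
    return 1 if np.dot(w, x) + b > 0 else 0

def ltf_satisfies_bag(w, b, bag, observed_proportion):
    """A bag is satisfied (consistent) when the LTF's positive-label fraction
    on the bag equals the bag's observed label proportion."""
    labels = [ltf_label(w, b, x) for x in bag]
    return np.isclose(np.mean(labels), observed_proportion)

# Toy bag of size 3 with observed label proportion 2/3.
w, b = np.array([1.0, -1.0]), 0.0
bag = [np.array([2.0, 1.0]), np.array([0.5, -1.0]), np.array([-1.0, 3.0])]
print(ltf_satisfies_bag(w, b, bag, 2 / 3))  # True: exactly two of the three points are labeled positive
```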
Fairness in Streaming Submodular Maximization: Algorithms and Hardness
Submodular maximization has become established as the method of choice for the task of selecting representative and diverse summaries of data. However, if datapoints have sensitive attributes such as gender or age, such machine learning algorithms, left unchecked, are known to exhibit bias: under- or over-representation of particular groups. This has made the design of fair machine learning algorithms increasingly important. In this work we address the question: Is it possible to create fair summaries for massive datasets? To this end, we develop the first streaming approximation algorithms for submodular maximization under fairness constraints, for both monotone and non-monotone functions. We validate our findings empirically on exemplar-based clustering, movie recommendation, DPP-based summarization, and maximum coverage in social networks, showing that fairness constraints do not significantly impact utility.
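As a purely illustrative sketch (not the algorithms from the paper), the following Python snippet shows one simple way a single-pass, fairness-constrained selection could look for a monotone coverage-style objective: an arriving element is kept only if its marginal gain clears a threshold and neither the overall budget nor its group's cap (the hypothetical group_caps parameter) is exhausted.

```python
def coverage_gain(covered, element_cover):
    """Marginal gain of an element under a simple coverage objective."""
    return len(element_cover - covered)

def fair_streaming_select(stream, k, group_caps, threshold):
    """Single-pass selection: keep an element if its marginal gain is at least
    `threshold`, the overall budget k is not exhausted, and its group's cap is
    not exhausted. Illustrative only; the paper's algorithms are more involved."""
    selected, covered = [], set()
    used = {g: 0 for g in group_caps}
    for element, group, cover in stream:
        if len(selected) >= k or used[group] >= group_caps[group]:
            continue
        if coverage_gain(covered, cover) >= threshold:
            selected.append(element)
            covered |= cover
            used[group] += 1
    return selected

# Toy stream of (id, group, covered-items) triples.
stream = [("a", "g1", {1, 2}), ("b", "g1", {2, 3}), ("c", "g2", {4, 5, 6})]
print(fair_streaming_select(stream, k=2, group_caps={"g1": 1, "g2": 1}, threshold=1))  # ['a', 'c']
```

Here the per-group caps only limit over-representation; this is just one simple instantiation of a fairness constraint.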
Review for NeurIPS paper: Fairness in Streaming Submodular Maximization: Algorithms and Hardness
Additional Feedback: As I mentioned before, I would encourage the authors to change the initial framing of the paper, which promises to "giv[e] affirmative answers" to the question of whether "it is possible to create fair summaries for massive datasets". I think this part of the abstract might be better spent describing the fairness constraint, so that the reader better understands this aspect. There also seems to be some odd formatting with bold and numbered text in the first paragraph of Section 6. 2. On Line 22, the sentence "submodularity is a natural way to capture the diminishing returns property of set functions, which holds for a variety of machine learning problems" is slightly awkward and misleading as written: it reads as if all set functions have an inherent diminishing-returns property and submodularity merely captures it. Of course, fixing this is just a matter of rephrasing.
Algorithms and hardness results for parallel large margin learning
We study the fundamental problem of learning an unknown large-margin halfspace in the context of parallel computation. Our main positive result is a parallel algorithm for learning a large-margin halfspace that is based on interior point methods from convex optimization and fast parallel algorithms for matrix computations. We show that this algorithm learns an unknown $\gamma$-margin halfspace over $n$ dimensions using $\mathrm{poly}(n, 1/\gamma)$ processors and runs in time $\tilde{O}(1/\gamma) + O(\log n)$. In contrast, naive parallel algorithms that learn a $\gamma$-margin halfspace in time that depends polylogarithmically on $n$ have an $\Omega(1/\gamma^2)$ runtime dependence on $\gamma$. Our main negative result deals with boosting, which is a standard approach to learning large-margin halfspaces.
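For context on where a $1/\gamma^2$ dependence arises, the classical (sequential) perceptron is a standard illustration: by Novikoff's theorem, on unit-norm examples separable with margin $\gamma$ it makes at most $O(1/\gamma^2)$ mistakes. A minimal Python sketch on toy data (not the paper's algorithm):

```python
import numpy as np

def perceptron(X, y, max_epochs=10_000):
    """Classic perceptron on unit-norm examples with labels in {-1, +1}.
    If the data are separable with margin gamma, Novikoff's bound caps the
    number of mistakes at O(1/gamma^2) -- the kind of 1/gamma^2 dependence
    the abstract contrasts against."""
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in zip(X, y):
            if label * np.dot(w, x) <= 0:   # mistake: update the weight vector
                w += label * x
                mistakes += 1
        if mistakes == 0:                   # a full pass with no mistakes: done
            return w
    return w

# Toy separable data in 2 dimensions.
X = np.array([[1.0, 0.2], [0.9, -0.1], [-1.0, 0.3], [-0.8, -0.2]])
X = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm examples
y = np.array([1, 1, -1, -1])
print(perceptron(X, y))
```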