Computational Learning Theory
Multiclass versus Binary Differentially Private PAC Learning
Marco Gaboardi, Department of Computer Science, Boston University
We show a generic reduction from multiclass differentially private PAC learning to binary private PAC learning. We apply this transformation to a recently proposed binary private PAC learner to obtain a private multiclass learner with sample complexity that has a polynomial dependence on the multiclass Littlestone dimension and a poly-logarithmic dependence on the number of classes. This yields a doubly exponential improvement in the dependence on both parameters over learners from previous work. Our proof extends the notion of Ψ-dimension defined in the work of Ben-David et al. [5] to the online setting and explores its general properties.
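As a rough illustration of how a multiclass-to-binary reduction can incur only logarithmic dependence on the number of classes, the sketch below trains one binary learner per bit of the class index. This is a generic construction written for this summary, not the reduction from the paper; binary_private_learner is a hypothetical black box returning a binary classifier, and the privacy guarantee of the combined learner would have to be tracked by composition over the ceil(log2 k) calls.

    from math import ceil, log2

    def multiclass_from_binary(binary_private_learner, samples, num_classes):
        # Illustrative bitwise reduction: one binary problem per bit of the
        # class label, i.e. ceil(log2(k)) calls to the binary learner.
        num_bits = ceil(log2(num_classes))
        bit_hypotheses = []
        for b in range(num_bits):
            # Relabel every example by the b-th bit of its class index.
            relabeled = [(x, (y >> b) & 1) for (x, y) in samples]
            bit_hypotheses.append(binary_private_learner(relabeled))

        def predict(x):
            # Reassemble the predicted class from the predicted bits.
            return sum(h(x) << b for b, h in enumerate(bit_hypotheses))

        return predict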
A Theory of Optimistically Universal Online Learnability for General Concept Classes
We provide a full characterization of the concept classes that are optimistically universally online learnable with {0, 1} labels. The notion of optimistically universal online learning was defined in [Hanneke, 2021] in order to understand learnability under minimal assumptions. In this paper, following the philosophy behind that work, we investigate two questions for every concept class, namely: (1) What are the minimal assumptions on the data process that admit online learnability?
Active Classification with Few Queries under Misspecification
We study pool-based active learning, where a learner has a large pool S of unlabeled examples and can adaptively ask a labeler questions to learn these labels. The goal of the learner is to output a labeling for S that can compete with the best hypothesis from a given hypothesis class H. We focus on halfspace learning, one of the most important problems in active learning. It is well known that in the standard active learning model, learning the labels of an arbitrary pool of examples labeled by some halfspace up to error ε requires at least Ω(1/ε) queries. To overcome this difficulty, previous work designs simple but powerful query languages to achieve O(log(1/ε)) query complexity, but only focuses on the realizable setting where data are perfectly labeled by some halfspace. However, when labels are noisy, such queries are too fragile and lead to high query complexity even under the simple random classification noise model.
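To make the size of this gap concrete (a purely illustrative calculation, with all constants and problem-dependent factors suppressed), at a target error of ε = 10⁻³ the two bounds from the abstract compare roughly as:

    \[
      \Omega(1/\varepsilon) \;\approx\; 10^{3} \ \text{standard label queries}
      \qquad \text{vs.} \qquad
      O(\log(1/\varepsilon)) \;\approx\; 10 \ \text{queries in the enriched language.}
    \]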
A Comparison with Other General MLCO Frameworks
Since obtaining ground-truth labels is non-trivial for NP-hard combinatorial tasks, several efforts have developed general MLCO (machine learning for combinatorial optimization) methods that require no ground-truth labels, including [8, 29, 30], our single-level baseline PPO-Single, and our proposed PPO-BiHyb. Here we compare the model details of these methods and the problems each can handle, and we also discuss their limitations, including those of our own approach. For S2V-DQN [30] and NeuRewriter [8], training the RL model is challenging due to sparse rewards and large action spaces, especially on large-scale problems. Specifically, for a graph with m nodes, the action space of S2V-DQN and NeuRewriter has size m, and S2V-DQN requires O(m) actions before an episode terminates for most problems, since the number of decisions grows proportionally to m.
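The schematic episode below is meant only to illustrate that scaling argument; it is not code from either paper, and the objective and termination rule are placeholders chosen for this comparison.

    import random

    def schematic_construction_episode(num_nodes, objective, policy=None):
        # Schematic per-node construction episode (illustrative only): at each
        # step the agent picks one of the remaining nodes, so the action space
        # has size up to m = num_nodes, and no reward is observed until the
        # episode ends after O(m) decisions.
        remaining = list(range(num_nodes))
        solution = []
        while remaining:                                   # ~m decision steps
            action = (policy(solution, remaining) if policy
                      else random.choice(remaining))       # up to m candidate actions
            remaining.remove(action)
            solution.append(action)
        return solution, objective(solution)               # reward only at termination

    # E.g. m = 10,000 nodes gives a 10,000-way action space and ~10,000 steps
    # before the single terminal reward, which is what makes exploration hard.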
Non-Convex SGD Learns Halfspaces with Adversarial Label Noise
We study the problem of agnostically learning homogeneous halfspaces in the distribution-specific PAC model. For a broad family of structured distributions, including log-concave distributions, we show that non-convex SGD efficiently converges to a solution with misclassification error O(opt) + ε, where opt is the misclassification error of the best-fitting halfspace. In sharp contrast, we show that optimizing any convex surrogate inherently leads to misclassification error of ω(opt), even under Gaussian marginals.
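Spelled out in symbols (constants suppressed; err(·) denotes 0-1 misclassification error on the underlying distribution and ĥ the halfspace returned by SGD), the abstract's two statements read:

    \[
      \mathrm{err}(\hat h) \;\le\; O(\mathrm{opt}) + \varepsilon,
      \qquad \text{whereas} \qquad
      \mathrm{err}(h_{\mathrm{cvx}}) \;=\; \omega(\mathrm{opt})
    \]

for any minimizer h_cvx of a convex surrogate loss, even when the marginal distribution is Gaussian; here opt is the 0-1 error of the best homogeneous halfspace.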