hanneke
Agnostic Active Learning Is Always Better Than Passive Learning
We provide the first sharp characterization of the optimal first-order query complexity of agnostic active learning, and propose a new general active learning algorithm which achieves it. Remarkably, the optimal query complexity admits a leading term which is always strictly smaller than the sample complexity of passive supervised learning (by a factor proportional to the best-in-class error rate). This was not previously known to be possible. For comparison, in all previous general analyses, the leading term exhibits an additional factor, such as the disagreement coefficient or related complexity measures, and therefore only provides improvements over passive learning in restricted cases. The present work completely removes such factors from the leading term, implying that every concept class benefits from active learning in the non-realizable case. Whether such benefits are possible has been the driving question underlying the past two decades of research on the theory of agnostic active learning. This work finally settles this fundamental question.
Agnostic Learning under Targeted Poisoning: Optimal Rates and the Role of Randomness
We study the problem of learning in the presence of an adversary that can corrupt an ฮท fraction of the training examples with the goal of causing failure on a specific test point. In the realizable setting, prior work established that the optimal error under such instance-targeted poisoning attacks scales as ฮ(dฮท), where d is the VC dimension of the hypothesis class [Hanneke, Karbasi, Mahmoody, Mehalel, and Moran (NeurIPS 2022)]. In this work, we resolve the corresponding question in the agnostic setting. We show that the optimal excess error is eฮ( dฮท), answering one of the main open problems left by Hanneke et al. To achieve this rate, it is necessary to use randomized learners: Hanneke et al. showed that deterministic learners can be forced to suffer error close to 1 even under small amounts of poisoning.
On Agnostic PAC Learning in the Small Error Regime
Binary classification in the classic PAC model exhibits a curious phenomenon: Empirical Risk Minimization (ERM) learners are suboptimal in the realizable case yet optimal in the agnostic case. Roughly speaking, this owes itself to the fact that non-realizable distributions $\\mathcal{D}$ are more difficult to learn than realizable distributions -- even when one discounts a learner's error by $\\mathrm{err}(h^\\ast_\\mathcal{D})$, i.e., the error of the best hypothesis in $\\mathcal{H}$. Thus, optimal agnostic learners are permitted to incur excess error on (easier-to-learn) distributions $\\mathcal{D}$ for which $\\tau = \\mathrm{err}(h^\\ast_\\mathcal{D})$ is small.
Adversarial Resilience in Sequential Prediction via Abstention Anonymous Author(s) Affiliation Address email
We study the problem of sequential prediction in the stochastic setting with an1 adversary that is allowed to inject clean-label adversarial (or out-of-distribution)2 examples. Algorithms designed to handle purely stochastic data tend to fail in the3 presence of such adversarial examples, often leading to erroneous predictions. This4 is undesirable in many high-stakes applications such as medical recommendations,5 where abstaining from predictions on adversarial examples is preferable to mis-6 classification. On the other hand, assuming fully adversarial data leads to very7 pessimistic bounds that are often vacuous in practice.8 To capture this motivation, we propose a new model of sequential prediction that9 sits between the purely stochastic and fully adversarial settings by allowing the10 learner to abstain from making a prediction at no cost on adversarial examples.11
A Theory of Optimistically Universal Online Learnability for General Concept Classes
We provide a full characterization of the concept classes that are optimistically universally online learnable with {0, 1} labels. The notion of optimistically universal online learning was defined in [Hanneke, 2021] in order to understand learnability under minimal assumptions. In this paper, following the philosophy behind that work, we investigate two questions, namely, for every concept class: (1) What are the minimal assumptions on the data process admitting online learnability?
Universal Rates for Active Learning
In this work we study the problem of actively learning binary classifiers from a given concept class, i.e., learning by utilizing unlabeled data and submitting targeted queries about their labels to a domain expert. We evaluate the quality of our solutions by considering the learning curves they induce, i.e., the rate of