On Agnostic PAC Learning in the Small Error Regime

Jun-13-2026, 22:31:04 GMT–Neural Information Processing Systems

Binary classification in the classic PAC model exhibits a curious phenomenon: Empirical Risk Minimization (ERM) learners are suboptimal in the realizable case yet optimal in the agnostic case. Roughly speaking, this is because non-realizable distributions $\mathcal{D}$ are more difficult to learn than realizable distributions, even when one discounts a learner's error by $\mathrm{err}(h^\ast_\mathcal{D})$, i.e., the error of the best hypothesis in $\mathcal{H}$. Thus, optimal agnostic learners are permitted to incur excess error on (easier-to-learn) distributions $\mathcal{D}$ for which $\tau = \mathrm{err}(h^\ast_\mathcal{D})$ is small.
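
For orientation, the gap described above can be written out with the textbook sample-complexity rates. The notation here is an assumption and does not appear in the abstract: $d$ is the VC dimension of $\mathcal{H}$, $m$ the sample size, $\delta$ the failure probability, and $\hat{h}$ the learner's output. These are the standard worst-case bounds, not the results of this paper.

$$
\begin{aligned}
\text{Agnostic (ERM, worst-case optimal):}\quad & \mathrm{err}(\hat{h}) \le \tau + O\!\left(\sqrt{\frac{d + \log(1/\delta)}{m}}\right), \\
\text{Realizable, } \tau = 0 \text{ (ERM):}\quad & \mathrm{err}(\hat{h}) = O\!\left(\frac{d \log(m/d) + \log(1/\delta)}{m}\right), \\
\text{Realizable, } \tau = 0 \text{ (optimal learner):}\quad & \mathrm{err}(\hat{h}) = O\!\left(\frac{d + \log(1/\delta)}{m}\right).
\end{aligned}
$$

The agnostic rate decays as $1/\sqrt{m}$ regardless of $\tau$, whereas the realizable rates decay as roughly $1/m$. A learner that is only worst-case optimal in the agnostic sense can therefore be far from the best achievable excess error when $\tau$ is close to $0$.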

artificial intelligence, machine learning, proceedings
