Characterizing the Optimal 0-1 Loss for Multi-class Classification with a Test-time Attacker

Jan-19-2025, 16:45:53 GMT–Neural Information Processing Systems

Finding classifiers robust to adversarial examples is critical for their safedeployment. Determining the robustness of the best possible classifier under agiven threat model for a fixed data distribution and comparing it to thatachieved by state-of-the-art training methods is thus an important diagnostictool. In this paper, we find achievable information-theoretic lower bounds onrobust loss in the presence of a test-time attacker for *multi-classclassifiers on any discrete dataset*. We provide a general framework for findingthe optimal 0-1 loss that revolves around the construction of a conflicthypergraph from the data and adversarial constraints. The prohibitive cost ofthis formulation in practice leads us to formulate other variants of the attacker-classifiergame that more efficiently determine the range of the optimal loss.

characterizing, multi-class classification, test-time attacker, (1 more...)

Neural Information Processing Systems

Jan-19-2025, 16:45:53 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.55)