Certifiably Adversarially Robust Detection of Out-of-Distribution Data

Neural Information Processing Systems 

Deep neural networks are known to be overconfident on out-of-distribution (OOD) inputs that clearly do not belong to any class. This is a problem in safety-critical applications, where a reliable assessment of a classifier's uncertainty is a key property that allows one to trigger human intervention or to transfer the system into a safe state. In this paper, we aim at certifiable worst-case guarantees for OOD detection by enforcing low confidence not only at an OOD point but in an entire $l_\infty$-ball around it. For this purpose, we use interval bound propagation (IBP) to upper-bound the maximal confidence in the $l_\infty$-ball and minimize this upper bound at training time. We show that non-trivial bounds on the confidence for OOD data are possible, and that these bounds generalize beyond the OOD datasets seen at training time.
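
As a concrete illustration of the certification step, the following is a minimal sketch in PyTorch, assuming the classifier is an `nn.Sequential` of `Linear` and `ReLU` layers; the helper names `ibp_bounds` and `confidence_upper_bound` are illustrative and not from the paper. IBP propagates an input box through the network, and the resulting logit bounds yield an upper bound on the softmax confidence over the whole $l_\infty$-ball, which can then be minimized on OOD data during training.

```python
# Sketch of interval bound propagation (IBP) and the induced upper bound
# on the maximal softmax confidence over an l_inf-ball. Layer handling is
# limited to Linear/ReLU for brevity; names here are illustrative.
import torch
import torch.nn as nn

def ibp_linear(layer: nn.Linear, lb: torch.Tensor, ub: torch.Tensor):
    """Propagate elementwise lower/upper bounds through a linear layer."""
    mid = (ub + lb) / 2                      # center of the input box
    rad = (ub - lb) / 2                      # radius of the input box
    mid_out = layer(mid)                     # W @ mid + b
    rad_out = rad @ layer.weight.abs().t()   # |W| scales the box radius
    return mid_out - rad_out, mid_out + rad_out

def ibp_bounds(model: nn.Sequential, x: torch.Tensor, eps: float):
    """Bounds on the logits valid for all inputs in the l_inf-ball of radius eps."""
    lb, ub = x - eps, x + eps
    for layer in model:
        if isinstance(layer, nn.Linear):
            lb, ub = ibp_linear(layer, lb, ub)
        elif isinstance(layer, nn.ReLU):
            lb, ub = layer(lb), layer(ub)    # ReLU is monotone
        else:
            raise NotImplementedError(type(layer))
    return lb, ub

def confidence_upper_bound(lb: torch.Tensor, ub: torch.Tensor):
    """Upper bound on max_k softmax confidence over the ball: for each class k,
    combine the upper logit bound for k with the lower bounds of all others
    (softmax is increasing in logit k, decreasing in the rest)."""
    K = lb.shape[-1]
    per_class = []
    for k in range(K):
        others = torch.cat([lb[..., :k], lb[..., k + 1:]], dim=-1)
        denom = ub[..., k].exp() + others.exp().sum(-1)
        per_class.append(ub[..., k].exp() / denom)
    return torch.stack(per_class, dim=-1).max(-1).values
```

Since this upper bound is differentiable in the network weights, a training loss can penalize it (e.g. its log) on OOD samples, enforcing certifiably low confidence in the whole ball rather than only at the sampled point.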