Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation

Feb-14-2020, 09:56:10 GMT–Neural Information Processing Systems

Recent work has shown that state-of-the-art classifiers are quite brittle: a small adversarial change to an input that was originally classified correctly with high confidence leads to a wrong classification, again with high confidence. This raises concerns that such classifiers are vulnerable to attacks and calls into question their use in safety-critical systems. In this paper we give, for the first time, formal guarantees on the robustness of a classifier in the form of instance-specific lower bounds on the norm of the input manipulation required to change the classifier decision. Based on this analysis we propose the Cross-Lipschitz regularization functional. We show that using this form of regularization in kernel methods resp.
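To make the idea of an instance-specific lower bound concrete, the following is a minimal sketch for the simplest possible case, a linear multi-class classifier f(x) = Wx + b. It is an illustration only and not the general result of the paper: for a linear model, the gap f_c(x) - f_j(x) changes by at most ||w_c - w_j||_2 times the l2 norm of the perturbation, so no perturbation smaller than the margin divided by that norm can flip the decision from class c to class j. The function name `linear_robustness_radius` and the example weights are assumptions made for this sketch.

```python
import numpy as np


def linear_robustness_radius(W, b, x):
    """Illustrative helper (hypothetical name, linear models only).

    Returns the predicted class c and the largest radius r such that no
    perturbation delta with ||delta||_2 < r changes the prediction of
    the linear classifier f(x) = W @ x + b.
    """
    scores = W @ x + b
    c = int(np.argmax(scores))
    radius = np.inf
    for j in range(W.shape[0]):
        if j == c:
            continue
        margin = scores[c] - scores[j]           # how far class c leads class j
        slope = np.linalg.norm(W[c] - W[j])      # how fast that lead can shrink
        if slope > 0:                            # identical rows: lead never shrinks
            radius = min(radius, margin / slope)
    return c, float(radius)


# Toy two-class example with assumed weights.
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])
b = np.zeros(2)
x = np.array([2.0, 0.0])

c, r = linear_robustness_radius(W, b, x)
print(c, r)
```

For this toy input the radius is specific to the point `x`: an input closer to the decision boundary would get a smaller guaranteed radius, which is what "instance-specific" refers to.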

adversarial manipulation, classifier, formal guarantee

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)