Generalizing Randomized Smoothing for Pointwise-Certified Defenses to Data Poisoning Attacks
We propose a method for making black-box functions provably robust to input manipulations. By training an ensemble of classifiers on randomly flipped training labels, we can use results from randomized smoothing to certify our classifier against label-flipping attacks: the larger the margin, the larger the certified radius of robustness. Using other types of noise allows for certifying robustness to other data poisoning attacks.

Adversarial examples (targeted, human-imperceptible modifications to a test input that cause a deep network to fail catastrophically) have taken the machine learning community by storm, with a large body of literature dedicated to understanding and preventing this phenomenon. Understanding why deep networks consistently make these mistakes and how to fix them is one way researchers hope to make progress towards more robust artificial intelligence.
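The label-flipping ensemble described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the nearest-centroid base classifier, the binary labels, the flip probability, and the toy data are all hypothetical. The sketch only estimates the ensemble's vote margin for one test point. It does not compute a certified radius, because the mapping from margin to radius is not given in the text above.

```python
import numpy as np

def fit_centroids(X, y):
    """Hypothetical base learner: nearest-centroid for binary labels {0, 1}."""
    # If a class disappears after flipping, fall back to the global mean.
    return np.stack([
        X[y == c].mean(axis=0) if np.any(y == c) else X.mean(axis=0)
        for c in (0, 1)
    ])

def predict(centroids, x):
    """Return the index of the centroid closest to x."""
    return int(np.argmin(np.linalg.norm(centroids - x, axis=1)))

def smoothed_predict(X, y, x_test, flip_prob=0.2, n_models=200, seed=0):
    """Majority vote over models trained on independently flipped labels.

    Returns the winning label and its margin (vote fraction minus 0.5).
    """
    rng = np.random.default_rng(seed)
    votes = np.zeros(2, dtype=int)
    for _ in range(n_models):
        # Flip each training label independently with probability flip_prob.
        flips = rng.random(len(y)) < flip_prob
        y_noisy = np.where(flips, 1 - y, y)
        votes[predict(fit_centroids(X, y_noisy), x_test)] += 1
    label = int(np.argmax(votes))
    return label, votes[label] / n_models - 0.5

# Toy data (assumed): two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, size=(50, 2)),
               rng.normal(2.0, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

label, margin = smoothed_predict(X, y, x_test=np.array([2.0, 2.0]))
print(label, margin)
```

A margin near 0.5 means that almost every model in the ensemble agreed despite the label noise. By the abstract's statement, a larger margin corresponds to a larger certified radius, though the exact relationship would have to come from the paper itself.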
Oct-1-2020, 22:00:33 GMT