Certified Interpretability Robustness for Class Activation Mapping
Gu, Alex, Weng, Tsui-Wei, Chen, Pin-Yu, Liu, Sijia, Daniel, Luca
Interpreting machine learning models is challenging but crucial for ensuring the safety of deep networks in autonomous driving systems. Given the prevalence of deep-learning-based perception models in autonomous vehicles, accurately interpreting their predictions is essential. While a variety of interpretability methods have been proposed, most have been shown to lack robustness, yet little has been done to provide certificates for interpretability robustness. Taking a step in this direction, we present CORGI, short for Certifiably prOvable Robustness Guarantees for Interpretability mapping. CORGI is an algorithm that takes an input image and returns a certified lower bound on the robustness of the top-k pixels of its CAM interpretability map. We demonstrate the effectiveness of CORGI through a case study on traffic sign data, certifying lower bounds on the minimum adversarial perturbation that are within a factor of 4-5x of the upper bounds found by state-of-the-art attack methods.
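To make the certified object concrete: a CAM is the class-weighted sum of the final convolutional feature maps, and CORGI certifies the stability of the set of its top-k pixels. The sketch below (a minimal illustration, not CORGI itself; the function names and toy shapes are my own) computes a CAM and extracts that top-k pixel set.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """CAM = class-weighted sum of the last conv layer's feature maps.

    feature_maps: (K, H, W) activations from the final conv layer
    class_weights: (K,) classifier weights for the target class,
                   taken from the global-average-pooling head
    """
    # Contract over the channel axis K, leaving an (H, W) saliency map.
    return np.tensordot(class_weights, feature_maps, axes=([0], [0]))

def top_k_pixels(cam, k):
    """Return the set of (row, col) indices of the k largest CAM entries,
    i.e. the interpretability set whose stability CORGI certifies."""
    flat = cam.ravel()
    idx = np.argpartition(flat, -k)[-k:]
    rows, cols = np.unravel_index(idx, cam.shape)
    return set(zip(rows.tolist(), cols.tolist()))

# Toy example: 3 feature maps over a 4x4 spatial grid.
rng = np.random.default_rng(0)
fmaps = rng.standard_normal((3, 4, 4))
w = rng.standard_normal(3)
cam = class_activation_map(fmaps, w)
pixels = top_k_pixels(cam, k=3)
```

A certificate then guarantees that no perturbation within the certified radius can change which pixels appear in `pixels`.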
Robust AI: Protecting neural networks against adversarial attacks
In its latest annual report, filed with the Securities and Exchange Commission, tech giant Alphabet warned investors about the many challenges of artificial intelligence, following the lead of Microsoft, which issued similar warnings last August. Recent advances in deep learning and neural networks have created much hope about the possibilities AI presents in domains previously thought to be off-limits for computer software. But there is also concern about the new threats AI will pose, especially in fields where bad decisions can have destructive results. We have already seen some of these threats manifest in various ways, including biased algorithms, AI-based forgery, and the spread of fake news during important events such as elections. The past few years have seen a growing discussion around building trust in artificial intelligence and creating safeguards that prevent abuse and malicious use of AI models.
CNN-Cert: An Efficient Framework for Certifying Robustness of Convolutional Neural Networks
Boopathy, Akhilan, Weng, Tsui-Wei, Chen, Pin-Yu, Liu, Sijia, Daniel, Luca
Verifying the robustness of neural network classifiers has attracted great interest due to the success of deep neural networks and their unexpected vulnerability to adversarial perturbations. Although finding the minimum adversarial distortion of neural networks (with ReLU activations) has been shown to be an NP-complete problem, obtaining a non-trivial lower bound on the minimum distortion as a provable robustness guarantee is possible. However, most previous works focused only on simple fully-connected layers (multilayer perceptrons) and were limited to ReLU activations. This motivates us to propose a general and efficient framework, CNN-Cert, that is capable of certifying robustness of general convolutional neural networks. Our framework is general: we can handle various architectures including convolutional layers, max-pooling layers, batch normalization layers, and residual blocks, as well as general activation functions. Our approach is efficient: by exploiting the special structure of convolutional layers, we achieve up to 17x and 11x speedup compared to state-of-the-art certification algorithms (e.g., Fast-Lin, CROWN) and up to 366x speedup compared to the dual-LP approach, while obtaining similar or even better verification bounds. In addition, CNN-Cert generalizes state-of-the-art algorithms such as Fast-Lin and CROWN. We demonstrate through extensive experiments that our method outperforms state-of-the-art lower-bound-based certification algorithms in terms of both bound quality and speed.
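To give a flavor of layer-wise certification, the sketch below propagates elementwise interval bounds through a 1-D convolution followed by a ReLU. This is deliberately the simplest sound scheme, not CNN-Cert's tighter linear bounds; all names and the toy kernel are illustrative assumptions.

```python
import numpy as np

def conv1d_interval(lower, upper, kernel):
    """Propagate interval bounds [lower, upper] through a valid 1-D conv.

    Splitting the kernel into positive and negative parts keeps the
    bounds sound: positive weights pair with same-side bounds,
    negative weights with opposite-side bounds.
    """
    w_pos = np.maximum(kernel, 0.0)
    w_neg = np.minimum(kernel, 0.0)
    m = len(kernel)
    n = len(lower) - m + 1
    lo = np.array([w_pos @ lower[i:i + m] + w_neg @ upper[i:i + m]
                   for i in range(n)])
    hi = np.array([w_pos @ upper[i:i + m] + w_neg @ lower[i:i + m]
                   for i in range(n)])
    return lo, hi

def relu_interval(lower, upper):
    """ReLU is monotone, so bounds pass through elementwise."""
    return np.maximum(lower, 0.0), np.maximum(upper, 0.0)

# Bound the network output over an eps-ball around a toy input.
x = np.array([0.5, -0.2, 0.8, 0.1])
eps = 0.1
lo, hi = x - eps, x + eps
kernel = np.array([1.0, -2.0, 0.5])
lo, hi = conv1d_interval(lo, hi, kernel)
lo, hi = relu_interval(lo, hi)
```

If the resulting output intervals keep the true class's logit above all others, the eps-ball is certified. CNN-Cert improves on this by tracking linear upper and lower bounds per neuron, which is what enables its tighter certificates and its speedups on convolutional structure.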