crown-ibp




Fast Certified Robust Training with Short Warmup

Neural Information Processing Systems

Empirical defenses for DNNs, such as adversarial training (Madry et al., 2018), provide no provable robustness guarantees. Both IBP and CROWN-IBP with loss fusion (Xu et al., 2020) have a per-batch training time close to standard training, but they rely on long warmup schedules: for example, generalized CROWN-IBP in Xu et al. (2020) used 900 epochs for warmup and 2,000 epochs in total. Prior works for certified training generally use weight initialization methods originally designed for standard DNN training (e.g., He et al., 2015a), while certified training is essentially optimizing a different type of augmented network defined by robustness verification (Zhang et al., 2020). An imbalance in ReLU activation states can, however, hamper classification performance if too many neurons are dead.
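Both methods named in this excerpt build on interval bound propagation (IBP). As a minimal illustrative sketch (NumPy, not the paper's code) of how IBP pushes an l-infinity ball through one linear + ReLU layer:

```python
import numpy as np

def ibp_linear(l, u, W, b):
    """Propagate interval bounds [l, u] through y = W @ x + b.
    The center passes through W; the radius passes through |W|."""
    mid = (u + l) / 2.0
    rad = (u - l) / 2.0
    mid_out = W @ mid + b
    rad_out = np.abs(W) @ rad
    return mid_out - rad_out, mid_out + rad_out

def ibp_relu(l, u):
    """ReLU is monotone, so bounds pass through elementwise."""
    return np.maximum(l, 0.0), np.maximum(u, 0.0)

# Push an l-infinity ball of radius eps around x through one layer.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
eps = 0.1
W = rng.normal(size=(4, 3))
b = np.zeros(4)
l, u = ibp_relu(*ibp_linear(x - eps, x + eps, W, b))
assert np.all(l <= u)
```

Every point inside the input ball is guaranteed to map into [l, u]; the looseness of these intervals, especially at initialization, is exactly what the short-warmup techniques above target.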

Verified Training for Counterfactual Explanation Robustness under Data Shift

Meyer, Anna P., Zhang, Yuhao, Albarghouthi, Aws, D'Antoni, Loris

arXiv.org Artificial Intelligence

Counterfactual explanations (CEs) enhance the interpretability of machine learning models by describing what changes to an input are necessary to change its prediction to a desired class. These explanations are commonly used to guide users' actions, e.g., by describing how a user whose loan application was denied can be approved for a loan in the future. Existing approaches generate CEs by focusing on a single, fixed model, and do not provide any formal guarantees on the CEs' future validity. When models are updated periodically to account for data shift, if the generated CEs are not robust to the shifts, users' actions may no longer have the desired impacts on their predictions. This paper introduces VeriTraCER, an approach that jointly trains a classifier and an explainer to explicitly consider the robustness of the generated CEs to small model shifts. VeriTraCER optimizes over a carefully designed loss function that ensures the verifiable robustness of CEs to local model updates, thus providing deterministic guarantees to CE validity. Our empirical evaluation demonstrates that VeriTraCER generates CEs that (1) are verifiably robust to small model updates and (2) display competitive robustness to state-of-the-art approaches in handling empirical model updates including random initialization, leave-one-out, and distribution shifts.
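The notion of a CE that stays valid under small model shifts can be illustrated on a linear classifier, where the worst-case score under a norm-bounded weight update has a closed form via Cauchy-Schwarz. A toy sketch (illustrative only; VeriTraCER itself handles neural networks and trains the classifier and explainer jointly):

```python
import numpy as np

def ce_valid_under_shift(x_ce, w, b, delta):
    """Toy check that counterfactual x_ce keeps its positive-class
    prediction under ANY linear-model update dw with ||dw||_2 <= delta:
    min over dw of (w + dw) @ x_ce + b = w @ x_ce + b - delta * ||x_ce||_2
    (the minimum is attained at dw = -delta * x_ce / ||x_ce||_2)."""
    worst = w @ x_ce + b - delta * np.linalg.norm(x_ce)
    return worst > 0

x_ce = np.array([1.0, 1.0])          # candidate counterfactual
w, b = np.array([1.0, 1.0]), -1.0    # current model; score is 1.0
print(ce_valid_under_shift(x_ce, w, b, delta=0.5))  # True: 1 - 0.5*sqrt(2) > 0
print(ce_valid_under_shift(x_ce, w, b, delta=1.0))  # False: 1 - sqrt(2) < 0
```

The same worst-case reasoning, extended to nonlinear models via verification tools, is what turns "probably still valid" into a deterministic guarantee.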


PECAN: A Deterministic Certified Defense Against Backdoor Attacks

Zhang, Yuhao, Albarghouthi, Aws, D'Antoni, Loris

arXiv.org Artificial Intelligence

Neural networks are vulnerable to backdoor poisoning attacks, where the attackers maliciously poison the training set and insert triggers into the test input to change the prediction of the victim model. Existing defenses for backdoor attacks either provide no formal guarantees or come with expensive-to-compute and ineffective probabilistic guarantees. We present PECAN, an efficient and certified approach for defending against backdoor attacks. The key insight powering PECAN is to apply off-the-shelf test-time evasion certification techniques on a set of neural networks trained on disjoint partitions of the data. We evaluate PECAN on image classification and malware detection datasets. Our results demonstrate that PECAN can (1) significantly outperform the state-of-the-art certified backdoor defense, both in defense strength and efficiency, and (2) on real backdoor attacks, reduce the attack success rate by an order of magnitude compared to a range of baselines from the literature.
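PECAN's partition step follows the deep partition aggregation (DPA) pattern: each training sample influences exactly one sub-model, so poisoning p samples can flip at most p votes. A minimal sketch of the vote aggregation (illustrative only; PECAN additionally runs evasion certification on each sub-model, which this sketch omits, and the strict inequality glosses over tie-breaking details):

```python
from collections import Counter

def aggregate_votes(votes, num_poisoned):
    """DPA-style majority vote over sub-models trained on disjoint
    data partitions. Poisoning p training samples corrupts at most p
    sub-models, so the top prediction is certified when its lead over
    the runner-up exceeds 2 * p (p votes lost plus p votes gained)."""
    counts = Counter(votes).most_common()
    top, c1 = counts[0]
    c2 = counts[1][1] if len(counts) > 1 else 0
    certified = (c1 - c2) > 2 * num_poisoned
    return top, certified

votes = [0, 0, 0, 0, 0, 1, 2]  # votes of 7 sub-models on one test input
pred, ok = aggregate_votes(votes, num_poisoned=1)
# pred == 0; the lead is 5 - 1 = 4 > 2, so the prediction is certified
```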


The Best Defense is a Good Offense: Adversarial Augmentation against Adversarial Attacks

Frosio, Iuri, Kautz, Jan

arXiv.org Artificial Intelligence

Many defenses against adversarial attacks (e.g., robust classifiers, randomization, or image purification) use countermeasures put to work only after the attack has been crafted. We adopt a different perspective to introduce $A^5$ (Adversarial Augmentation Against Adversarial Attacks), a novel framework including the first certified preemptive defense against adversarial attacks. The main idea is to craft a defensive perturbation to guarantee that any attack (up to a given magnitude) towards the input in hand will fail. To this aim, we leverage existing automatic perturbation analysis tools for neural networks. We study the conditions to apply $A^5$ effectively, analyze the importance of the robustness of the to-be-defended classifier, and inspect the appearance of the robustified images. We show effective on-the-fly defensive augmentation with a robustifier network that ignores the ground truth label, and demonstrate the benefits of robustifier and classifier co-training. In our tests, $A^5$ consistently beats state-of-the-art certified defenses on MNIST, CIFAR10, FashionMNIST, and TinyImageNet. We also show how to apply $A^5$ to create certifiably robust physical objects. Our code at https://github.com/NVlabs/A5 allows experimenting on a wide range of scenarios beyond the man-in-the-middle attack tested here, including the case of physical attacks.
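The idea of a certified preemptive defense can be illustrated on a linear classifier, where the certified l-infinity margin has a closed form. The sketch below is a toy construction, not the $A^5$ optimization (which uses perturbation-analysis tools on neural networks): it crafts a defensive perturbation that restores certifiability before any attack arrives.

```python
import numpy as np

def defensive_perturbation(x, w, b, y, eps, delta_max):
    """Toy preemptive defense for the linear classifier f(x) = sign(w @ x + b).

    Under an l-infinity attack of radius eps, the prediction at point z is
    certifiably correct for label y in {-1, +1} iff
        y * (w @ z + b) > eps * ||w||_1.
    Moving each coordinate by delta_max in the direction y * sign(w)
    increases that margin as much as possible."""
    delta = y * np.sign(w) * delta_max
    z = x + delta
    margin = y * (w @ z + b)
    certified = margin > eps * np.linalg.norm(w, 1)
    return z, certified

w = np.array([1.0, -2.0])
b = 0.0
x = np.array([0.2, -0.1])  # w @ x + b = 0.4, true label y = +1
# Undefended, eps = 0.2 breaks certification: 0.4 <= 0.2 * ||w||_1 = 0.6.
z, ok = defensive_perturbation(x, w, b, y=1, eps=0.2, delta_max=0.3)
# The margin becomes 0.4 + 0.3 * 3 = 1.3 > 0.6, so the defended input is certified.
```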


Fast Certified Robust Training via Better Initialization and Shorter Warmup

Shi, Zhouxing, Wang, Yihan, Zhang, Huan, Yi, Jinfeng, Hsieh, Cho-Jui

arXiv.org Artificial Intelligence

Recently, bound-propagation-based certified adversarial defenses have been proposed for training neural networks with certifiable robustness guarantees. Although state-of-the-art (SOTA) methods, including interval bound propagation (IBP) and CROWN-IBP, have per-batch training complexity similar to standard neural network training, they usually need a long warmup schedule with hundreds or thousands of epochs to reach SOTA performance, and are thus still quite costly to train. In this paper, we discover that the weight initialization adopted by prior works, such as Xavier or orthogonal initialization, which was originally designed for standard network training, results in very loose certified bounds at initialization, so a longer warmup schedule must be used. We also find that IBP-based training leads to a significant imbalance in ReLU activation states, which can hamper model performance. Based on our findings, we derive a new IBP initialization as well as principled regularizers for the warmup stage that stabilize certified bounds and improve the balance of ReLU activation states, which significantly shortens the required warmup schedule. Additionally, we find that batch normalization (BN) is a crucial architectural element for building best-performing networks for certified training, because it helps stabilize bound variance and balance ReLU activation states. With our proposed initialization, regularizers, and architectural changes combined, we obtain 65.03% verified error on CIFAR-10 ($\epsilon=\frac{8}{255}$) and 82.13% verified error on TinyImageNet ($\epsilon=\frac{1}{255}$) using very short training schedules (160 and 80 total epochs, respectively), outperforming the literature SOTA trained with hundreds or thousands of epochs.
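The initialization idea admits a back-of-the-envelope illustration: through a linear layer, IBP bound widths grow by roughly the row sums of |W|, so one can choose a weight variance that keeps this factor near 1. The sketch below is illustrative, not the paper's exact scheme; the variance follows from E|w| = sigma * sqrt(2/pi) for zero-mean Gaussian weights.

```python
import numpy as np

def ibp_width_gain(W):
    """Average multiplicative growth of IBP bound width through W:
    for a uniform input width, width_out scales with the row sums of |W|."""
    return np.abs(W).sum(axis=1).mean()

n = 512
rng = np.random.default_rng(0)

# Xavier-style init: Var = 1/n. The expected row sum of |W| is
# n * sigma * sqrt(2/pi) = sqrt(2n/pi) >> 1, so widths explode with depth.
W_xavier = rng.normal(0.0, np.sqrt(1.0 / n), size=(n, n))

# Width-preserving init: pick sigma with n * sigma * sqrt(2/pi) = 1,
# i.e. sigma = sqrt(pi/2) / n, so the expected gain per layer is 1.
W_ibp = rng.normal(0.0, np.sqrt(np.pi / 2) / n, size=(n, n))

print(ibp_width_gain(W_xavier))  # roughly sqrt(2n/pi), about 18 for n = 512
print(ibp_width_gain(W_ibp))     # roughly 1
```

Stacking many layers multiplies these gains, which is why standard initializations yield extremely loose certified bounds at the start of training and force the long warmup the paper aims to remove.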