Supplementary Material: Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training

Neural Information Processing Systems 

The initial learning rate is set to 0.1.

A.2 Adversarial Training

Unless otherwise specified, we perform adversarial training to train robust classifiers by following Madry et al. [74]. Specifically, we train against a projected gradient descent (PGD) adversary, starting from a random initial perturbation of the training data. Unless otherwise specified, we use the values of ε provided in Table 5 to train our models. We use 7 steps of PGD with a step size of ε/5.

A.3 Delusive Adversaries

Six delusive attacks are considered to validate our proposed defense.
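As an illustration of the PGD adversary used for adversarial training in A.2, the following is a minimal sketch rather than our training code. It assumes an L-infinity threat model and replaces the network with a toy linear logistic classifier so that the input gradient has a closed form; the function names (`sign`, `loss_grad_wrt_input`, `pgd_linf`) and the toy model are hypothetical. Only the random start, the 7 steps, and the step size of ε/5 are taken from the text.

```python
import math
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def loss_grad_wrt_input(x, y, w, b):
    """Gradient of the logistic loss w.r.t. the input x for a linear model.

    Toy stand-in (assumption) for backpropagation through a real network.
    """
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
    return [(p - y) * wi for wi in w]


def pgd_linf(x, y, w, b, eps, steps=7, rng=None):
    """L-infinity PGD with a random start, 7 steps, and step size eps / 5."""
    rng = rng or random.Random(0)
    alpha = eps / 5.0
    # Random initial perturbation inside the eps-ball around x.
    x_adv = [xi + rng.uniform(-eps, eps) for xi in x]
    for _ in range(steps):
        g = loss_grad_wrt_input(x_adv, y, w, b)
        # Ascent step on the loss, along the sign of the gradient.
        x_adv = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, g)]
        # Project back onto the eps-ball around the clean input.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv
```

In adversarial training, each minibatch would be replaced by the output of such an attack before the usual gradient update on the model parameters. With a real network, the closed-form gradient above would be replaced by automatic differentiation, and a clipping step to the valid input range would typically follow the projection.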