Appendix
Neural Information Processing Systems
Here are the five models that we used, in increasing order of adversarial robustness: ε = 0, 0.5, 1.0, 3.0, 5.0. Three ImageNet-trained vision transformer (ViT) models [47] were obtained from pytorch-image-models [48]. Note that the "imagenet1k" suffix in the model names does not mean the model was only trained on ImageNet1K.

Observation: a vision transformer (ViT-S) indeed shows higher error consistency with ResNet-50 than with BagNet-9 (see Table 1). Further insights could be gained by testing successively more constrained versions of the same base model.
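The error-consistency comparison above can be reproduced from per-sample correctness records. As a minimal sketch (assuming the standard definition of error consistency as Cohen's kappa over trial-by-trial correctness, following Geirhos et al.; the function name and inputs are illustrative, not from the paper):

```python
def error_consistency(correct_a, correct_b):
    """Error consistency between two classifiers.

    correct_a, correct_b: equal-length lists of booleans, True where
    each model classified the corresponding sample correctly.
    Returns Cohen's kappa comparing observed agreement on correctness
    against the agreement expected if errors were independent.
    """
    n = len(correct_a)
    assert n == len(correct_b) and n > 0

    # Observed consistency: fraction of samples where both models
    # agree (both correct or both wrong).
    c_obs = sum(a == b for a, b in zip(correct_a, correct_b)) / n

    # Expected consistency under independent errors, given each
    # model's individual accuracy.
    p_a = sum(correct_a) / n
    p_b = sum(correct_b) / n
    c_exp = p_a * p_b + (1 - p_a) * (1 - p_b)

    if c_exp == 1.0:  # degenerate case: agreement is forced
        return 1.0
    return (c_obs - c_exp) / (1 - c_exp)
```

Under this measure, a ViT-S whose mistakes align more closely with ResNet-50's than with BagNet-9's would yield a higher kappa for the former pairing, matching the observation reported in Table 1.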