A Unified Game-Theoretic Interpretation of Adversarial Robustness

Neural Information Processing Systems 

Furthermore, we find that the robustness of adversarially trained DNNs comes from category-specific low-order interactions.
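
To make the term "low-order interaction" concrete, the following is a minimal sketch that assumes one common game-theoretic definition, which this excerpt does not itself state: the order-m interaction between input variables i and j is the average of f(S ∪ {i,j}) − f(S ∪ {i}) − f(S ∪ {j}) + f(S) over all contexts S of m other variables. The names `multi_order_interaction` and `toy_score` are hypothetical, and the toy score function stands in for a DNN output on a masked input; it is not the authors' implementation.

```python
from itertools import combinations
from statistics import mean


def multi_order_interaction(f, n, i, j, m):
    """Order-m interaction between variables i and j (assumed definition).

    f maps a frozenset of present variable indices to a scalar score,
    n is the number of input variables, and m is the context size.
    """
    others = [k for k in range(n) if k not in (i, j)]
    deltas = []
    for ctx in combinations(others, m):
        s = frozenset(ctx)
        # Marginal benefit of having i and j together, given context s.
        deltas.append(f(s | {i, j}) - f(s | {i}) - f(s | {j}) + f(s))
    return mean(deltas)


def toy_score(s):
    # Hypothetical score: variables 0 and 1 cooperate only when variable 2
    # is absent, so their interaction lives in small (low-order) contexts.
    return 1.0 if {0, 1} <= s and 2 not in s else 0.0


low = multi_order_interaction(toy_score, n=4, i=0, j=1, m=0)
high = multi_order_interaction(toy_score, n=4, i=0, j=1, m=2)
print(low, high)
```

In this toy setting the interaction between variables 0 and 1 is strongest when the context is empty and vanishes when every other variable is present, which is the sense in which an interaction is called low-order. Exhaustive enumeration is only feasible for a handful of variables; for real inputs the contexts would have to be sampled.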