Game-theoretic Understanding of Adversarially Learned Features

Jie Ren, Die Zhang, Yisen Wang, Lu Chen, Zhanpeng Zhou, Xu Cheng, Xin Wang, Yiting Chen, Jie Shi, Quanshi Zhang

arXiv.org Artificial Intelligence 

This paper aims to understand adversarial attacks and defense from a new perspective, i.e., the signal-processing behaviors of DNNs. We define a novel multi-order interaction in game theory, which satisfies six properties. Using the multi-order interaction, we find that adversarial attacks mainly affect high-order interactions to fool the DNN. Furthermore, we find that the robustness of adversarially trained DNNs comes from category-specific low-order interactions. Our findings provide further insight into, and revise, the previous understanding of the shape bias of adversarially learned features. In addition, the multi-order interaction can explain the recoverability of adversarial examples.
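
The abstract does not spell out the multi-order interaction, so the following is only a minimal sketch under an assumed definition: the order-m interaction between two input variables i and j is taken to be the average, over all contexts S of exactly m other variables, of f(S + {i, j}) - f(S + {i}) - f(S + {j}) + f(S). The function name `multi_order_interaction` and the toy game `toy_game` are hypothetical and stand in for a DNN's output on masked inputs.

```python
from itertools import combinations


def multi_order_interaction(f, n, i, j, m):
    """Assumed order-m interaction between players i and j.

    Averages f(S+{i,j}) - f(S+{i}) - f(S+{j}) + f(S) over every
    context S of size m drawn from the n players other than i and j.
    """
    others = [k for k in range(n) if k not in (i, j)]
    if not 0 <= m <= len(others):
        raise ValueError("order m must lie between 0 and n - 2")
    total, count = 0.0, 0
    for context in combinations(others, m):
        S = frozenset(context)
        total += f(S | {i, j}) - f(S | {i}) - f(S | {j}) + f(S)
        count += 1
    return total / count


def toy_game(S):
    # Hypothetical stand-in for a model output: pays 1 only when
    # players 0, 1 and 2 are all present.
    return 1.0 if {0, 1, 2} <= S else 0.0


# With 4 players, the interaction between players 0 and 1 grows with
# the order, because it only appears once player 2 is in the context.
low = multi_order_interaction(toy_game, 4, 0, 1, 0)
mid = multi_order_interaction(toy_game, 4, 0, 1, 1)
high = multi_order_interaction(toy_game, 4, 0, 1, 2)
print(low, mid, high)
```

In this toy game the interaction is zero at order 0 and strongest at the highest order, which illustrates the distinction between low-order and high-order interactions that the abstract relies on. Exhaustive enumeration is only feasible for a handful of variables; a sampled estimate would be needed for real inputs.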
