ProTransformer: Robustify Transformers via Plug-and-Play Paradigm
Zhichao Hou, Weizhi Gao, Yuchen Shen, Feiyi Wang, Xiaorui Liu
Transformer-based architectures have dominated various areas of machine learning in recent years. In this paper, we introduce a novel robust attention mechanism designed to enhance the resilience of transformer-based architectures. Crucially, this technique can be integrated into existing transformers as a plug-and-play layer, improving their robustness without the need for additional training or fine-tuning. Through comprehensive experiments and ablation studies, we demonstrate that our ProTransformer significantly enhances the robustness of transformer models across a variety of prediction tasks, attack mechanisms, backbone architectures, and data domains. Notably, without further fine-tuning, the ProTransformer consistently improves the performance of vanilla transformers by 19.5%, 28.3%, 16.1%, and 11.4% for BERT, ALBERT, DistilBERT, and RoBERTa, respectively, under the classical TextFooler attack. Furthermore, ProTransformer shows promising resilience in large language models (LLMs) against prompting-based attacks, improving the performance of T5 and LLaMA by 24.8% and 17.8%, respectively, and enhancing Vicuna by an average of 10.4% against the Jailbreaking attack. Beyond the language domain, ProTransformer also demonstrates outstanding robustness in both vision and graph domains.
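The abstract describes ProTransformer as a plug-and-play layer that replaces the attention blocks of an already-trained transformer without any additional training or fine-tuning. The sketch below illustrates only that integration pattern on a pretrained Hugging Face BERT classifier; the `robust_aggregate` hook, the `RobustSelfAttention` wrapper, and the `textattack/bert-base-uncased-SST-2` checkpoint are illustrative assumptions, not the paper's actual robust attention estimator or experimental setup.

```python
# Minimal sketch of the plug-and-play pattern described in the abstract:
# wrap each self-attention module of a pretrained BERT with a robustified
# variant, reusing the original weights with no further training.
# `robust_aggregate` is a hypothetical placeholder, not the paper's method.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer


def robust_aggregate(context: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for a robust token-aggregation step.

    The actual ProTransformer layer is defined in the paper; here the output
    is passed through unchanged just to mark where it would plug in.
    """
    return context


class RobustSelfAttention(torch.nn.Module):
    """Wraps an existing self-attention module and post-processes its output."""

    def __init__(self, base_attention: torch.nn.Module):
        super().__init__()
        self.base_attention = base_attention

    def forward(self, *args, **kwargs):
        outputs = self.base_attention(*args, **kwargs)
        # outputs[0] is the context tensor; any remaining entries
        # (e.g. attention probabilities) are passed through untouched.
        return (robust_aggregate(outputs[0]),) + tuple(outputs[1:])


def robustify_bert(model):
    """Swap every self-attention block in place; no weights are modified."""
    for layer in model.bert.encoder.layer:
        layer.attention.self = RobustSelfAttention(layer.attention.self)
    return model


if __name__ == "__main__":
    name = "textattack/bert-base-uncased-SST-2"  # example checkpoint (assumed)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = robustify_bert(AutoModelForSequenceClassification.from_pretrained(name))
    model.eval()
    with torch.no_grad():
        inputs = tokenizer("A surprisingly sturdy little film.", return_tensors="pt")
        print(model(**inputs).logits.argmax(dim=-1))
```

Because only the module objects are swapped and the pretrained weights are reused untouched, the same wrapping pattern would apply to ALBERT, DistilBERT, or RoBERTa encoders, which mirrors the no-fine-tuning setting the abstract reports.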
arXiv.org Artificial Intelligence
Oct-30-2024
- Genre:
  - Research Report > New Finding (0.68)
- Industry:
  - Government (0.93)
  - Information Technology > Security & Privacy (1.00)