InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective
Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu
Large-scale pre-trained language models such as BERT and RoBERTa have achieved state-of-the-art performance across a wide range of NLP tasks. Recent studies, however, show that such BERT-based models are vulnerable to textual adversarial attacks. We aim to address this problem from an information-theoretic perspective, and propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models. InfoBERT contains two mutual-information-based regularizers for model training: (i) an Information Bottleneck regularizer, which suppresses noisy mutual information between the input and the feature representation; and (ii) an Anchored Feature regularizer, which increases the mutual information between local stable features and global features. We provide a principled way to theoretically analyze and improve the robustness of language models in both standard and adversarial training. Extensive experiments demonstrate that InfoBERT achieves state-of-the-art robust accuracy on several adversarial datasets for Natural Language Inference (NLI) and Question Answering (QA) tasks.

Self-supervised representation learning pre-trains good feature extractors from massive unlabeled data, which show promising transferability to various downstream tasks. Recent successes include large-scale pre-trained language models such as BERT, RoBERTa, and GPT-3 (Devlin et al., 2019; Liu et al., 2019; Brown et al., 2020), which have advanced the state of the art over a wide range of NLP tasks such as NLI and QA, even surpassing human performance.
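To make the two regularizers concrete, below is a minimal PyTorch sketch of such a combined training objective. It is an illustration under simplifying assumptions, not the authors' implementation: both mutual-information terms are approximated with a single InfoNCE-style contrastive estimator, the names infonce_lower_bound and infobert_style_loss are hypothetical, the anchor_mask selecting stable tokens is assumed to be computed elsewhere (the paper roughly scores tokens by their gradient behavior under adversarial perturbation), and the coefficients alpha and beta are placeholders.

import torch
import torch.nn.functional as F


def infonce_lower_bound(x, y, temperature=0.1):
    """InfoNCE estimate of I(X; Y) for aligned batches x, y of shape (N, D).

    Row i of x and row i of y form the positive pair; all other rows
    in the batch serve as in-batch negatives.
    """
    x = F.normalize(x, dim=-1)
    y = F.normalize(y, dim=-1)
    logits = x @ y.t() / temperature                 # (N, N) similarities
    labels = torch.arange(x.size(0), device=x.device)
    return -F.cross_entropy(logits, labels)          # higher = more MI


def infobert_style_loss(task_loss, word_embs, local_feats, global_feat,
                        anchor_mask, alpha=5e-3, beta=5e-3):
    """Hypothetical combined objective (coefficients are placeholders):

        task_loss
        + beta  * I_hat(X; T)         # IB term: suppress input/feature MI
        - alpha * I_hat(T_anchor; Z)  # anchored term: raise local/global MI

    word_embs:   (B, L, D) input word embeddings X (D assumed equal to
                 the hidden size, for this sketch only)
    local_feats: (B, L, D) token-level features T
    global_feat: (B, D)    sentence-level feature Z (e.g., [CLS])
    anchor_mask: (B, L) bool, True at tokens treated as stable anchors
    """
    B, L, D = local_feats.shape

    # Information Bottleneck regularizer: penalize the estimated MI
    # between each input embedding and its token-level feature.
    ib = infonce_lower_bound(word_embs.reshape(B * L, D),
                             local_feats.reshape(B * L, D))

    # Anchored Feature regularizer: reward the estimated MI between
    # anchored local features and their sentence's global feature.
    anchors = local_feats[anchor_mask]                           # (N, D)
    globals_ = global_feat.unsqueeze(1).expand(B, L, D)[anchor_mask]
    af = infonce_lower_bound(anchors, globals_)

    return task_loss + beta * ib - alpha * af


# Example with random tensors (shapes only, for illustration):
B, L, D = 8, 16, 32
loss = infobert_style_loss(
    task_loss=torch.tensor(0.7),
    word_embs=torch.randn(B, L, D),
    local_feats=torch.randn(B, L, D),
    global_feat=torch.randn(B, D),
    anchor_mask=torch.rand(B, L) > 0.5,
)

Note that minimizing an InfoNCE lower bound is only a heuristic proxy for suppressing I(X; T), and in this sketch each anchored token is contrasted against global features from other sentences in the batch; the paper develops the corresponding bounds and the anchor-selection procedure more carefully.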
arXiv.org Artificial Intelligence
Oct-14-2020