Shortcut Learning of Large Language Models in Natural Language Understanding

Du, Mengnan, He, Fengxiang, Zou, Na, Tao, Dacheng, Hu, Xia

May-7-2023–arXiv.org Artificial Intelligence

Large language models (LLMs) have achieved state-of-the-art performance on a range of natural language understanding tasks. However, these LLMs may rely on dataset biases and artifacts as shortcuts for prediction, which significantly hurts their generalizability and adversarial robustness. In this paper, we review recent developments that address the shortcut learning and robustness challenges of LLMs. We first introduce the concept of shortcut learning in language models. We then present methods for identifying shortcut learning behavior, characterize the reasons shortcut learning arises, and describe mitigation solutions. Finally, we discuss key research challenges and potential research directions for advancing the field of LLMs.

machine learning, natural language, shortcut

Country:
- North America > United States > Texas (0.46)

Genre:
- Overview (1.00)
- Research Report > New Finding (0.68)

Industry:
- Education (0.69)
- Information Technology > Security & Privacy (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.47)
  - Natural Language > Large Language Model (1.00)
