Linguistic Interpretability of Transformer-based Language Models: a systematic review

López-Otal, Miguel, Gracia, Jorge, Bernad, Jordi, Bobed, Carlos, Pitarch-Ballesteros, Lucía, Anglés-Herrero, Emma

Apr-14-2025–arXiv.org Artificial Intelligence

Language models based on the Transformer architecture achieve excellent results in many language-related tasks, such as text classification or sentiment analysis. However, although the architecture of these models is well-defined, little is known about how their internal computations help them achieve their results. This makes these models, as of today, a type of 'black box' system. There is, however, a line of research -- 'interpretability' -- aiming to learn how information is encoded inside these models. More specifically, there is work dedicated to studying whether Transformer-based models possess knowledge of linguistic phenomena similar to that of human speakers -- an area we call 'linguistic interpretability' of these models. In this survey we present a comprehensive analysis of 160 research works, spread across multiple languages and models -- including multilingual ones -- that attempt to discover linguistic information from the perspective of several traditional Linguistics disciplines: Syntax, Morphology, Lexico-Semantics and Discourse. Our survey fills a gap in the existing interpretability literature, which either does not focus on linguistic knowledge in these models or presents some limitations -- e.g. only studying English-based models. Our survey also focuses on Pre-trained Language Models not further specialized for a downstream task, with an emphasis on works that use interpretability techniques that explore models' internal representations.

computational linguistic, large language model, machine learning, (18 more...)

Country:
- Europe (1.00)
- Asia (1.00)
- North America > United States
  - California (0.67)

Genre:
- Research Report (1.00)
- Overview (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
