A Primer on the Inner Workings of Transformer-based Language Models

Javier Ferrando, Gabriele Sarti, Arianna Bisazza, Marta R. Costa-jussà

May 1, 2024, arXiv.org Artificial Intelligence

The rapid progress of research aimed at interpreting the inner workings of advanced language models has highlighted a need for contextualizing the insights gained from years of work in this area. This primer provides a concise technical introduction to the current techniques used to interpret the inner workings of Transformer-based language models, focusing on the generative decoder-only architecture. We conclude by presenting a comprehensive overview of the known internal mechanisms implemented by these models, uncovering connections across popular approaches and active research directions in this area.

computational linguistic, language model, proceedings

Country:
- Oceania > Australia (0.04)
- South America > Colombia
  - Meta Department > Villavicencio (0.04)
- North America
  - Dominican Republic (0.04)
  - United States
    - Washington > King County
      - Seattle (0.04)
    - New York > New York County
      - New York City (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
    - California
      - San Francisco County > San Francisco (0.14)
      - San Diego County > San Diego (0.04)
  - Canada
    - Ontario > Toronto (0.04)
    - British Columbia > Metro Vancouver Regional District
      - Vancouver (0.04)
- Europe
  - Austria > Vienna (0.14)
  - France (0.04)
  - Germany (0.04)
  - Spain > Community of Madrid
    - Madrid (0.04)
  - Italy > Tuscany
    - Florence (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Latvia > Lubāna Municipality
    - Lubāna (0.04)
  - Portugal > Lisbon
    - Lisbon (0.04)
  - Croatia > Dubrovnik-Neretva County
    - Dubrovnik (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
  - Poland > Masovia Province
    - Warsaw (0.04)
- Asia
  - Singapore (0.04)
  - Indonesia > Bali (0.04)
  - China > Hong Kong (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.14)

Genre:
- Overview (1.00)
- Research Report > New Finding (0.45)

Industry:
- Health & Medicine (0.67)
- Government (0.45)
- Leisure & Entertainment (0.45)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Text Processing (1.00)
    - Large Language Model (1.00)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks > Deep Learning (1.00)
