Concept Bottleneck Large Language Models

Sun, Chung-En, Oikarinen, Tuomas, Ustun, Berk, Weng, Tsui-Wei

Dec-10-2024–arXiv.org Artificial Intelligence

We introduce the Concept Bottleneck Large Language Model (CB-LLM), a pioneering approach to creating inherently interpretable Large Language Models (LLMs). Unlike traditional black-box LLMs that rely on post-hoc interpretation methods with limited neuron function insights, CB-LLM sets a new standard with its built-in interpretability, scalability, and ability to provide clear, accurate explanations. We investigate two essential tasks in the NLP domain: text classification and text generation. In text classification, CB-LLM narrows the performance gap with traditional black-box models and provides clear interpretability. In text generation, we show how interpretable neurons in CB-LLM can be used for concept detection and steering text generation. Our CB-LLMs enable greater interaction between humans and LLMs across a variety of tasks -- a feature notably absent in existing LLMs.
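
The idea of a concept bottleneck for text classification can be illustrated with a toy sketch. This is not the authors' implementation: the concept names, class names, weights, and helper functions below are all hypothetical, and a hand-written feature vector stands in for the output of a language-model backbone. The sketch only shows the general pattern, in which every prediction passes through a small layer of named concept neurons, so that a prediction can be explained by per-concept contributions and changed by editing a concept activation.

```python
from typing import Dict, List

# Hypothetical concept and class names, chosen only for illustration.
CONCEPTS = ["positive wording", "negative wording"]
CLASSES = ["negative review", "positive review"]

# Hypothetical weights. In a real model these would be learned.
# W_CONCEPT maps 3 backbone features to 2 concept neurons.
W_CONCEPT = [[1.0, -1.0, 0.0],
             [-1.0, 1.0, 0.0]]
# W_OUT maps concept activations to class logits (one row per class).
W_OUT = [[-1.0, 2.0],
         [2.0, -1.0]]


def matvec(matrix: List[List[float]], vector: List[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def concept_activations(features: List[float]) -> List[float]:
    """Project features onto concept neurons and apply ReLU."""
    return [max(0.0, v) for v in matvec(W_CONCEPT, features)]


def class_logits(activations: List[float]) -> List[float]:
    """Final linear layer: logits depend only on concept activations."""
    return matvec(W_OUT, activations)


def predict(activations: List[float]) -> str:
    """Return the class name with the largest logit."""
    logits = class_logits(activations)
    return CLASSES[logits.index(max(logits))]


def explain(activations: List[float], class_index: int) -> Dict[str, float]:
    """Contribution of each concept to one class logit (activation * weight)."""
    weights = W_OUT[class_index]
    return {name: a * w for name, a, w in zip(CONCEPTS, activations, weights)}


def intervene(activations: List[float], concept_index: int, value: float) -> List[float]:
    """Return a copy of the activations with one concept neuron overwritten."""
    edited = list(activations)
    edited[concept_index] = value
    return edited


# Stand-in for backbone features of one input text (hypothetical values).
features = [2.0, 0.5, 1.0]
acts = concept_activations(features)
original_label = predict(acts)

# Steer the prediction by editing concept neurons directly.
steered = intervene(intervene(acts, 0, 0.0), 1, 2.0)
steered_label = predict(steered)
```

Because the final layer sees only the concept activations, the contributions returned by `explain` sum exactly to the class logit, and overwriting a concept neuron with `intervene` changes the prediction in a traceable way. The steering of text generation described in the abstract is analogous in spirit, but its mechanics are not reproduced here.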

large language model, machine learning, natural language

Country:
- Oceania > Australia
  - Victoria (0.04)
- North America
  - United States
    - Virginia > Portsmouth (0.04)
    - Florida (0.04)
    - Hawaii (0.04)
    - New Jersey (0.04)
    - Alabama > Calhoun County
      - Anniston (0.04)
    - New York > Kings County
      - New York City (0.04)
    - California > San Francisco County
      - San Francisco (0.04)
    - Illinois > Cook County
      - Chicago (0.04)
    - Massachusetts
      - Plymouth County > Brockton (0.04)
      - Norfolk County (0.04)
  - Cuba > Guantánamo Province
    - Guantánamo (0.04)
  - Canada > Quebec
    - Montreal (0.04)
- Europe
  - Spain (0.04)
  - France (0.04)
  - United Kingdom > England
    - Greater Manchester > Manchester (0.04)
  - Greece > Attica
    - Athens (0.04)
- Asia
  - China (0.04)
  - Malaysia (0.04)
  - Thailand > Bangkok
    - Bangkok (0.04)
  - Middle East
    - Republic of Türkiye (0.04)
    - Saudi Arabia > Mecca Province
      - Jeddah (0.04)
    - Iraq > Baghdad Governorate
      - Baghdad (0.04)
  - India
    - Gujarat (0.04)
    - Maharashtra > Mumbai (0.04)
- Africa
  - South Africa (0.04)
  - Mozambique (0.04)

Genre:
- Research Report > Promising Solution (0.34)
- Overview > Innovation (0.34)

Industry:
- Media (1.00)
- Banking & Finance (1.00)
- Education (0.93)
- Law Enforcement & Public Safety > Terrorism (0.93)
- Government > Military (0.67)
- Leisure & Entertainment > Sports
  - Baseball (0.92)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.70)
