Inspecting and Editing Knowledge Representations in Language Models

Hernandez, Evan, Li, Belinda Z., Andreas, Jacob

May-22-2023 – arXiv.org Artificial Intelligence

Neural language models (LMs) represent facts about the world described by text. Sometimes these facts derive from training data (in most LMs, a representation of the word "banana" encodes the fact that bananas are fruits). Sometimes facts derive from input text itself (a representation of the sentence "I poured out the bottle" encodes the fact that the bottle became empty). We describe REMEDI, a method for learning to map statements in natural language to fact encodings in an LM's internal representation system. REMEDI encodings can be used as knowledge editors: when added to LM hidden representations, they modify downstream generation to be consistent with new facts. REMEDI encodings may also be used as probes: when compared to LM representations, they reveal which properties LMs already attribute to mentioned entities, in some cases making it possible to predict when LMs will generate outputs that conflict with background knowledge or input text. REMEDI thus links work on probing, prompting, and LM editing, and offers steps toward general tools for fine-grained inspection and control of knowledge in LMs.
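The two uses described in the abstract, adding an encoding to a hidden representation (editing) and comparing an encoding with a hidden representation (probing), can be illustrated with a toy sketch. This is not REMEDI itself. The 4-dimensional vectors, the function names `apply_edit` and `probe`, and the choice of cosine similarity as the comparison are all assumptions made for illustration. In the actual method the encodings are learned from natural-language statements and applied inside an LM.

```python
import numpy as np

def apply_edit(hidden, fact_encoding):
    """Editor use: add a fact encoding to a hidden representation."""
    return hidden + fact_encoding

def probe(hidden, fact_encodings):
    """Probe use: compare a hidden representation with candidate encodings.

    Cosine similarity is an assumed comparison, chosen for illustration.
    Returns the best-matching name and the full score table.
    """
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    scores = {name: cosine(hidden, enc) for name, enc in fact_encodings.items()}
    return max(scores, key=scores.get), scores

# Hypothetical stand-ins: a hidden state for "the bottle" and two
# hand-written fact encodings (real encodings would be learned).
hidden = np.array([1.0, 0.0, 0.0, 0.0])
encodings = {
    "is full": np.array([0.0, 1.0, 0.0, 0.0]),
    "is empty": np.array([0.0, 0.0, 1.0, 0.0]),
}

edited = apply_edit(hidden, encodings["is empty"])
best, scores = probe(edited, encodings)
print(best, scores)
```

In this toy setting the probe recovers the attribute that was just written into the representation, which mirrors, in a much simplified form, how the same encodings can serve both roles.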

large language model, machine learning, natural language

Country:
- Antarctica (0.04)
- Asia
  - China > Hong Kong (0.04)
  - Japan (0.04)
  - Russia (0.04)
- Europe
  - Denmark > Capital Region
    - Copenhagen (0.04)
  - Germany > Bavaria
    - Upper Bavaria > Munich (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - Italy > Veneto
    - Venice (0.04)
  - Netherlands (0.04)
  - Russia > Central Federal District
    - Moscow Oblast > Moscow (0.04)
  - Switzerland (0.04)
- North America
  - Canada (0.04)
  - Dominican Republic (0.04)
  - Mexico (0.04)
  - United States
    - Virginia > Accomack County (0.04)
    - New York > New York County
      - New York City (0.04)
    - California
      - San Francisco County > San Francisco (0.04)
      - Santa Barbara County > Santa Barbara (0.04)
    - Louisiana (0.04)
    - Arizona > Mohave County
      - Lake Havasu City (0.04)
    - Michigan (0.04)
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
    - North Carolina (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Texas > Travis County
      - Austin (0.04)
- Oceania > Australia
  - Victoria > Melbourne (0.04)

Genre:
- Research Report > New Finding (0.67)

Industry:
- Education
  - Curriculum > Subject-Specific Education (0.46)
  - Educational Setting > Higher Education (0.67)
- Health & Medicine > Therapeutic Area (0.46)
- Law (1.00)
- Leisure & Entertainment > Sports (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Cognitive Science > Problem Solving (1.00)
  - Machine Learning (1.00)
  - Natural Language > Large Language Model (0.94)
  - Representation & Reasoning (1.00)
