MemoryPrompt: A Light Wrapper to Improve Context Tracking in Pre-trained Language Models

Nathanaël Carraz Rakotonirina, Marco Baroni

arXiv.org Artificial Intelligence 

We introduce MemoryPrompt, a lightweight approach in which the LM is complemented by a small auxiliary recurrent network that passes information to the LM by prefixing its regular input with a sequence of vectors, akin to soft prompts, without requiring LM finetuning. Tested on a task designed to probe an LM's ability to keep track of multiple fact updates, a MemoryPrompt-augmented LM outperforms much larger LMs that have access to the full input history. We also test MemoryPrompt on a long-distance dialogue dataset, where its performance is comparable to that of a model conditioned on the entire conversation history. In both experiments we observe that, unlike full-finetuning approaches, MemoryPrompt does not suffer from catastrophic forgetting when adapted to new tasks, and thus does not disrupt the generalist capabilities of the underlying LM.
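The architecture the abstract describes lends itself to a simple sketch: a frozen pre-trained LM whose input embeddings are prefixed with memory vectors produced by a small recurrent network, which is the only trainable component. The code below is a minimal illustration in PyTorch with Hugging Face transformers, not the authors' implementation; the class name MemoryPromptWrapper, the choice of a GRU over segment embeddings, and the num_memory_tokens parameter are illustrative assumptions rather than the paper's exact design.

import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

class MemoryPromptWrapper(nn.Module):
    # Hypothetical sketch of the MemoryPrompt idea, not the published code:
    # a frozen causal LM plus a small trainable recurrent memory module
    # whose state is projected into soft-prompt vectors.
    def __init__(self, lm_name="gpt2", num_memory_tokens=8):
        super().__init__()
        self.lm = AutoModelForCausalLM.from_pretrained(lm_name)
        for p in self.lm.parameters():
            p.requires_grad = False  # the LM itself is never finetuned
        d = self.lm.config.hidden_size
        self.num_memory_tokens = num_memory_tokens
        # Small recurrent network; its hidden state persists across segments.
        self.memory_rnn = nn.GRU(d, d, batch_first=True)
        # Projects the recurrent state into num_memory_tokens prefix vectors.
        self.to_prefix = nn.Linear(d, num_memory_tokens * d)

    def forward(self, input_ids, memory_state=None):
        embed = self.lm.get_input_embeddings()(input_ids)  # (B, T, d)
        # Update the memory from the current segment's token embeddings.
        _, memory_state = self.memory_rnn(embed, memory_state)
        prefix = self.to_prefix(memory_state[-1])  # (B, K*d)
        prefix = prefix.view(-1, self.num_memory_tokens, embed.size(-1))
        # Prepend the memory vectors as soft prompts to the regular input.
        inputs_embeds = torch.cat([prefix, embed], dim=1)
        out = self.lm(inputs_embeds=inputs_embeds)
        return out.logits, memory_state

Feeding a conversation segment by segment lets the recurrent state carry fact updates forward without the LM ever seeing the full history (the segments here are invented examples):

tok = AutoTokenizer.from_pretrained("gpt2")
wrapper = MemoryPromptWrapper("gpt2", num_memory_tokens=8)
state = None
for segment in ["Alice's number is 555-0134.", "Update: it is now 555-0199."]:
    ids = tok(segment, return_tensors="pt").input_ids
    logits, state = wrapper(ids, state)  # state summarizes past segments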
