In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs Miss

Kuratov, Yuri; Bulatov, Aydar; Anokhin, Petr; Sorokin, Dmitry; Sorokin, Artyom; Burtsev, Mikhail

Feb-20-2024–arXiv.org Artificial Intelligence

This paper addresses the challenge of processing long documents using generative transformer models. To evaluate different approaches, we introduce BABILong, a new benchmark designed to assess model capabilities in extracting and processing distributed facts within extensive texts. Our evaluation, which includes benchmarks for GPT-4 and RAG, reveals that common methods are effective only for sequences up to $10^4$ elements. In contrast, fine-tuning GPT-2 with recurrent memory augmentations enables it to handle tasks involving up to $11\times 10^6$ elements. This achievement marks a substantial leap, as it is by far the longest input processed by any neural network model to date, demonstrating a significant improvement in the processing capabilities for long sequences.
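As a rough illustration of what "distributed facts within extensive texts" can mean, the sketch below scatters a few fact sentences among unrelated filler sentences while preserving the order of the facts. It is not the BABILong construction procedure: the function name `build_sample`, the example facts, and the filler sentences are all hypothetical and chosen only for illustration.

```python
import random


def build_sample(facts, filler, n_filler, seed=0):
    """Scatter `facts` (kept in their original order) among `n_filler` filler sentences.

    Hypothetical helper for illustration only; not taken from the paper.
    Returns the list of sentences and the indices at which the facts were placed.
    """
    rng = random.Random(seed)
    total = len(facts) + n_filler
    # Pick which slots of the final text hold facts; sorting preserves fact order.
    fact_positions = sorted(rng.sample(range(total), len(facts)))
    position_set = set(fact_positions)
    fact_iter = iter(facts)
    sentences = []
    for i in range(total):
        if i in position_set:
            sentences.append(next(fact_iter))
        else:
            sentences.append(rng.choice(filler))
    return sentences, fact_positions


# Assumed toy data: two facts ("needles") and three filler sentences ("haystack").
facts = ["Mary went to the garden.", "Mary picked up the apple."]
filler = [
    "The weather was mild.",
    "A train passed in the distance.",
    "Nobody spoke for a while.",
]

sentences, positions = build_sample(facts, filler, n_filler=20, seed=1)
question = "Where is the apple?"  # answering requires combining both facts
```

Increasing `n_filler` lengthens the input without changing the facts or the question, which is the sense in which a task of this kind can be scaled to very long sequences.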

garden, information, transformer

Country:
- North America
  - United States (0.04)
  - Puerto Rico > San Juan
    - San Juan (0.04)
- Europe
  - Monaco (0.04)
  - United Kingdom > England
    - Greater London > London (0.04)
  - Russia > Central Federal District
    - Moscow Oblast > Moscow (0.04)
- Asia
  - Singapore (0.04)
  - Russia (0.04)
  - Indonesia > Bali (0.04)
  - China (0.04)
  - Japan > Kyūshū & Okinawa
    - Kyūshū > Nagasaki Prefecture > Nagasaki (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
