Resona: Improving Context Copying in Linear Recurrence Models with Retrieval
Xinyu Wang, Linrui Ma, Jerry Huang, Peng Lu, Prasanna Parthasarathi, Xiao-Wen Chang, Boxing Chen, Yufei Cui
– arXiv.org Artificial Intelligence
Recent shifts in large language model (LLM) research have shown an increasing focus on novel architectures that can compete with the prototypical Transformer-based models that have long dominated the space. Linear recurrent models have proven to be viable competitors due to their computational efficiency. However, such models still demonstrate a sizable gap to Transformers on in-context learning, among other tasks that require recalling information from the context. In this work, we introduce __Resona__, a simple and scalable framework for augmenting linear recurrent models with retrieval. __Resona__ augments models with the ability to integrate retrieved information from the provided input context, enabling behavior tailored to diverse task requirements. Experiments on a variety of linear recurrent models demonstrate that __Resona__-augmented models achieve significant performance gains on a variety of synthetic and real-world natural language tasks, highlighting its ability to act as a general-purpose method for improving the in-context learning and language modeling abilities of linear recurrent LLMs.
Mar-28-2025
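
The abstract only describes the idea at a high level: a linear recurrent backbone augmented with a path that retrieves information from the provided input context. Below is a minimal sketch of what such a retrieval-augmented linear recurrence could look like; it is not the authors' implementation. The module name `RetrievalAugmentedLinearRNN`, the diagonal recurrence, the chunk-pooled dot-product retrieval, and the sigmoid gate are all assumptions made for illustration.

```python
# Sketch only: a toy "retrieval-augmented linear recurrence" block, assuming a
# diagonal linear RNN plus dot-product retrieval over pooled context chunks.
# This is NOT the Resona architecture from the paper; see the paper for details.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RetrievalAugmentedLinearRNN(nn.Module):
    def __init__(self, d_model: int, chunk_size: int = 16):
        super().__init__()
        self.chunk_size = chunk_size
        # Diagonal linear recurrence: h_t = a * h_{t-1} + B x_t
        self.log_decay = nn.Parameter(torch.zeros(d_model))
        self.in_proj = nn.Linear(d_model, d_model)
        # Retrieval path: query tokens against pooled context chunks
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Gate deciding how much retrieved content to mix back in
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        B, T, D = x.shape
        decay = torch.sigmoid(self.log_decay)      # per-channel decay in (0, 1)
        u = self.in_proj(x)
        h = torch.zeros(B, D, device=x.device, dtype=x.dtype)
        recurrent_out = []
        for t in range(T):                         # sequential scan for clarity
            h = decay * h + u[:, t]
            recurrent_out.append(h)
        recurrent_out = torch.stack(recurrent_out, dim=1)

        # Pool the input context into chunks and retrieve from them by attention
        pad = (-T) % self.chunk_size
        ctx = F.pad(x, (0, 0, 0, pad)).view(B, -1, self.chunk_size, D).mean(dim=2)
        q, k, v = self.q_proj(recurrent_out), self.k_proj(ctx), self.v_proj(ctx)
        attn = torch.softmax(q @ k.transpose(1, 2) / D ** 0.5, dim=-1)
        retrieved = attn @ v                       # (batch, seq_len, d_model)

        # Gated fusion of the recurrent state with the retrieved context
        g = torch.sigmoid(self.gate(torch.cat([recurrent_out, retrieved], dim=-1)))
        return recurrent_out + g * retrieved


if __name__ == "__main__":
    layer = RetrievalAugmentedLinearRNN(d_model=64)
    out = layer(torch.randn(2, 50, 64))
    print(out.shape)  # torch.Size([2, 50, 64])
```

The sketch omits causal masking of the retrieval step and the parallel scans used to make linear recurrences efficient; it is intended only to make the abstract's description of "integrating retrieved information from the input context" concrete.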