Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning

Mar-31-2025–arXiv.org Artificial Intelligence

We propose Rec-R1, a general reinforcement learning framework that bridges large language models (LLMs) with recommendation systems through closed-loop optimization. Unlike prompting and supervised fine-tuning (SFT), Rec-R1 directly optimizes LLM generation using feedback from a fixed black-box recommendation model, without relying on synthetic SFT data from proprietary models such as GPT-4o. This avoids the substantial cost and effort required for data distillation. To verify the effectiveness of Rec-R1, we evaluate it on two representative tasks: product search and sequential recommendation. Experimental results demonstrate that Rec-R1 not only consistently outperforms prompting- and SFT-based methods, but also achieves significant gains over strong discriminative baselines, even when used with simple retrievers such as BM25. Moreover, Rec-R1 preserves the general-purpose capabilities of the LLM, unlike SFT, which often impairs instruction-following and reasoning. These findings suggest Rec-R1 as a promising foundation for continual task-specific adaptation without catastrophic forgetting.

large language model, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

Mar-31-2025

arXiv.org PDF

Add feedback

Country:
- North America
  - Dominican Republic (0.04)
  - United States > Illinois
    - Champaign County > Urbana (0.04)
- Asia > Myanmar
  - Tanintharyi Region > Dawei (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Information Technology (0.68)
- Energy > Renewable (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Personal Assistant Systems (1.00)
  - Natural Language > Large Language Model (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks > Deep Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found