SATURN: SAT-based Reinforcement Learning to Unleash LLMs Reasoning

Jun-11-2026, 07:44:19 GMT–Neural Information Processing Systems

How to design reinforcement learning (RL) tasks that effectively unleash the reasoning capability of large language models (LLMs) remains an open question. Existing RL tasks (e.g., math, programming, and constructing reasoning tasks) suffer from three key limitations: (1) Scalability. They rely heavily on human annotation or expensive LLM synthesis to generate sufficient training data.

large language model, machine learning, natural language

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning (0.97)