DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Jun-13-2026, 15:52:27 GMT – Neural Information Processing Systems

Inference scaling empowers LLMs with unprecedented reasoning ability, with reinforcement learning as the core technique for eliciting complex reasoning. However, key technical details of state-of-the-art reasoning LLMs are concealed (for example, in the OpenAI o1 blog post and the DeepSeek R1 technical report), so the community still struggles to reproduce their RL training results.

large language model, machine learning, natural language

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.59)