RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning

Jun-16-2026, 00:33:32 GMT–Neural Information Processing Systems

Existing end-to-end autonomous driving (AD) algorithms typically follow the Imitation Learning (IL) paradigm, which faces challenges such as causal confusion and an open-loop gap. In this work, we propose RAD, a 3DGS-based closed-loop Reinforcement Learning (RL) framework for end-to-end Autonomous Driving. By leveraging 3DGS techniques, we construct a photorealistic digital replica of the real physical world, enabling the AD policy to extensively explore the state space and learn to handle out-of-distribution scenarios through large-scale trial and error. To enhance safety, we design specialized rewards to guide the policy in effectively responding to safety-critical events and understanding real-world causal relationships. To better align with human driving behavior, we incorporate IL into RL training as a regularization term. We introduce a closed-loop evaluation benchmark consisting of diverse, previously unseen 3DGS environments. Compared to IL-based methods, RAD achieves stronger performance in most closed-loop metrics, particularly exhibiting a 3× lower collision rate. Abundant closed-loop results are presented in the supplementary material. Code is available at https://github.com/hustvl/RAD.

artificial intelligence, machine learning, reinforcement learning

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Information Technology > Security & Privacy (0.46)
- Transportation > Ground
  - Road (0.56)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Robots > Autonomous Vehicles (0.70)
  - Machine Learning
    - Reinforcement Learning (0.85)
    - Neural Networks (0.68)
