Diffusion Policies Creating a Trust Region for Offline Reinforcement Learning
–Neural Information Processing Systems
Offline reinforcement learning (RL) leverages pre-collected datasets to train optimal policies. Diffusion Q-Learning (DQL), which introduces diffusion models as a powerful and expressive policy class, significantly boosts the performance of offline RL. However, its reliance on iterative denoising sampling to generate actions slows down both training and inference.
Feb-14-2026, 03:44:44 GMT
- Country:
- North America > United States > Texas > Travis County > Austin (0.04)
- Genre:
- Research Report > Experimental Study (0.93)
- Technology: