Mixed-Density Diffuser: Efficient Planning with Non-Uniform Temporal Resolution
Crimson Stambaugh, Rajesh P. N. Rao
arXiv.org Artificial Intelligence
Training a policy with online rollouts can be costly, dangerous, and sample-inefficient [1]. Alternatively, offline reinforcement learning (RL) trains a policy exclusively on pre-collected data. Extracting effective policies without exploration or feedback from the environment is challenging for conventional off-policy and even specialized offline RL algorithms [2, 3]. Approaches to offline RL also frequently face the problem of incomplete or undirected demonstrations [4, 5, 6]: offline algorithms must compose sub-trajectories from the training data to generate advantageous behaviors. Another challenge is high dimensionality and long horizons, which make accurate planning and behavior cloning difficult [1]. Finally, sparse rewards pose a challenge to many training algorithms because they hinder accurate credit assignment to actions [7]. Diffusion models have emerged as a powerful framework for expressing complex, multi-modal distributions [8, 9]. Leveraging this model class, diffusion policies generate high-fidelity actions and use a value function for action selection [10, 11, 12].
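The last point — sampling actions from a diffusion model and selecting among them with a value function — can be illustrated with a minimal toy sketch. Everything here is an assumption for illustration, not the paper's method: `denoise_step` is a hypothetical stand-in for a trained denoiser that pulls a noisy sample toward one of two behavior modes (a multi-modal action distribution), and `value` is a hypothetical learned value function that prefers one mode. The sketch draws several candidate actions via the reverse (denoising) process, then picks the highest-value candidate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical multi-modal behavior distribution: two action modes at -1 and +1.
MODES = np.array([-1.0, 1.0])

def denoise_step(a, t, T):
    # Stand-in for a trained denoiser: nudge the sample toward its nearest
    # mode, with the pull strengthening and the noise shrinking as t -> 0.
    target = MODES[np.argmin(np.abs(MODES - a))]
    alpha = 1.0 - t / T
    noise = rng.normal(scale=0.1 * (t / T))
    return a + 0.5 * alpha * (target - a) + noise

def sample_action(T=20):
    # Reverse diffusion: start from pure noise, denoise for T steps.
    a = rng.normal()
    for t in range(T, 0, -1):
        a = denoise_step(a, t, T)
    return a

def value(a):
    # Hypothetical value function: prefers actions near the +1 mode.
    return -(a - 1.0) ** 2

# Diffusion-policy-style action selection: sample many candidate actions
# from the generative model, then keep the one the value function ranks best.
candidates = [sample_action() for _ in range(16)]
best = max(candidates, key=value)
```

In a real diffusion policy the denoiser is a learned network conditioned on the current observation and the value function is trained from the offline dataset; the selection step above is what lets the value function steer a multi-modal generative model toward high-return behavior.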
Nov-13-2025