RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning
Neural Information Processing Systems
Offline reinforcement learning (RL) aims to find performant policies from logged data without further environment interaction. Model-based algorithms, which learn a model of the environment from the dataset and perform conservative policy optimisation within that model, have emerged as a promising approach to this problem.
Dec-24-2025, 09:03:00 GMT