Parallelizing Model-based Reinforcement Learning Over the Sequence Length

Neural Information Processing Systems 

Recently, Model-based Reinforcement Learning (MBRL) methods have demonstrated stunning sample efficiency in various RL domains. However, achieving this extraordinary sample efficiency comes with additional training costs in terms of computation, memory, and training time. To address these challenges, we propose the Pa