PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning Tao Yu
–Neural Information Processing Systems
Specifically, PlayVirtual uses a dynamics model to predict future states in a latent space from the current state and action, and then uses a backward dynamics model to predict the previous states, which forms a trajectory cycle.
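The trajectory cycle described above can be illustrated with a minimal sketch. This is not the authors' implementation: the forward and backward dynamics here are hand-written toy functions (assumptions) standing in for learned models, and the names `forward_step`, `backward_step`, `biased_backward_step`, and `cycle_error` are hypothetical. The sketch rolls a latent state forward through a sequence of actions, rolls it back through the same actions in reverse order, and measures how far the result lands from the starting state.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Dynamics = Callable[[Vector, Vector], Vector]


def forward_step(z: Vector, a: Vector) -> Vector:
    """Toy forward dynamics (assumption): next latent = latent + action."""
    return [zi + ai for zi, ai in zip(z, a)]


def backward_step(z_next: Vector, a: Vector) -> Vector:
    """Toy backward dynamics (assumption): previous latent = next latent - action."""
    return [zi - ai for zi, ai in zip(z_next, a)]


def biased_backward_step(z_next: Vector, a: Vector) -> Vector:
    """An imperfect backward model (assumption): adds a constant error of 0.5."""
    return [zi - ai + 0.5 for zi, ai in zip(z_next, a)]


def cycle_error(
    z0: Vector,
    actions: Sequence[Vector],
    fwd: Dynamics,
    bwd: Dynamics,
) -> float:
    """Squared distance between z0 and the state recovered after a full cycle."""
    z = list(z0)
    # Forward pass: predict future latent states from the current state and action.
    for a in actions:
        z = fwd(z, a)
    # Backward pass: predict previous latent states, undoing the actions in reverse.
    for a in reversed(actions):
        z = bwd(z, a)
    return sum((x - y) ** 2 for x, y in zip(z, z0))


z0 = [1.0, -2.0]
actions = [[0.5, 0.25], [-1.0, 2.0], [0.25, 0.5]]

consistent = cycle_error(z0, actions, forward_step, backward_step)
inconsistent = cycle_error(z0, actions, forward_step, biased_backward_step)
print(consistent, inconsistent)
```

With the exact inverse, the cycle closes and the error is zero; with the biased backward model the error is positive. A quantity of this kind could serve as a consistency penalty when the two dynamics models are learned, though the exact loss used by PlayVirtual is not stated in this excerpt.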
Oct-9-2025, 14:35:50 GMT