Reviews: Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition
Neural Information Processing Systems
The paper proposes a method that alternates between learning a reward function and learning a policy. Algorithmically, the proposed method resembles inverse reinforcement learning/imitation learning. However, unlike existing methods that require expert trajectories, the proposed method requires only the goal states that the expert aims to reach. Experiments show that the proposed method reaches the goal states more accurately than an RL method with a naïve binary classification reward.
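To make the alternation concrete, here is a minimal tabular sketch of the general idea, not the paper's actual algorithm. It assumes a toy 1-D chain environment, a per-state logistic classifier as the reward model (goal states as positives, states visited by the current policy as negatives), and value iteration in place of a learned policy. All function names and hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def train_reward(logits, goal_states, visited_states, lr=0.5, steps=50):
    """Gradient ascent on a per-state logistic classifier.
    Goal states are positives; states visited by the policy are negatives."""
    logits = logits.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-logits))
        grad = np.zeros_like(logits)
        for s in goal_states:
            grad[s] += (1.0 - p[s]) / len(goal_states)
        for s in visited_states:
            grad[s] -= p[s] / len(visited_states)
        logits += lr * grad
    return logits

def greedy_policy(reward, n_states, gamma=0.9, iters=100):
    """Value iteration on a 1-D chain with actions left (-1) and right (+1).
    The reward is collected in the state the agent lands in."""
    moves = np.array([-1, 1])
    nxt = np.clip(np.arange(n_states)[:, None] + moves[None, :], 0, n_states - 1)
    v = np.zeros(n_states)
    for _ in range(iters):
        q = reward[nxt] + gamma * v[nxt]
        v = q.max(axis=1)
    return moves[q.argmax(axis=1)]

def rollout(policy, start, n_states, horizon):
    """Follow the policy and record the states it lands in."""
    s, visited = start, []
    for _ in range(horizon):
        s = int(np.clip(s + policy[s], 0, n_states - 1))
        visited.append(s)
    return visited

def alternate(n_states=8, goal_states=(7,), rounds=5, horizon=10):
    """Alternate between fitting the reward classifier and re-planning."""
    logits = np.zeros(n_states)
    policy = np.full(n_states, -1)  # assumed initial policy: always move left
    for _ in range(rounds):
        visited = rollout(policy, 0, n_states, horizon)
        logits = train_reward(logits, goal_states, visited)
        reward = -np.log1p(np.exp(-logits))  # log sigmoid(logit) as reward
        policy = greedy_policy(reward, n_states)
    return policy, logits

policy, logits = alternate()
final_state = rollout(policy, 0, 8, 10)[-1]
```

In this toy setting no expert trajectory is supplied, only the goal state index. Because the policy's own visits are fed back as negatives, the classifier is retrained each round rather than fixed once, which is the contrast the review draws with a naïve binary classification reward.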
Oct-8-2024, 03:57:18 GMT