Efficient Preference-Based Reinforcement Learning Using Learned Dynamics Models
