Fine-tuning Behavioral Cloning Policies with Preference-Based Reinforcement Learning
