Learning Reward Functions by Integrating Human Demonstrations and Preferences

Open in new window