Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback
