Policy Teaching via Data Poisoning in Learning from Human Preferences
