Behavior Alignment via Reward Function Optimization
