Behavior Alignment via Reward Function Optimization