Aligning Dialogue Agents with Global Feedback via Large Language Model Reward Decomposition
