Explaining Learned Reward Functions with Counterfactual Trajectories
