Zero-Shot Coordination in Ad Hoc Teams with Generalized Policy Improvement and Difference Rewards
