Zero-Shot Coordination in Ad Hoc Teams with Generalized Policy Improvement and Difference Rewards