Automatic Reward Shaping from Confounded Offline Data