Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping
