When does return-conditioned supervised learning work for offline reinforcement learning?
