Learning Generative Models with Goal-conditioned Reinforcement Learning