Grounding Video Models to Actions through Goal Conditioned Exploration
