GridToPix: Training Embodied Agents with Minimal Supervision