Vision-Language Foundation Models as Effective Robot Imitators
