RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
