Gondola: Grounded Vision Language Planning for Generalizable Robotic Manipulation
