Gondola: Grounded Vision Language Planning for Generalizable Robotic Manipulation