Provable Ordering and Continuity in Vision-Language Pretraining for Generalizable Embodied Agents