LUMOS: Language-Conditioned Imitation Learning with World Models