Emergence of Abstract State Representations in Embodied Sequence Modeling