Mars: Modeling Context & State Representations with Contrastive Learning for End-to-End Task-Oriented Dialog