Benchmarking Mental State Representations in Language Models