Toward a Theory of Tokenization in LLMs