Token Embeddings Violate the Manifold Hypothesis

Neural Information Processing Systems 

A full understanding of the behavior of a large language model (LLM) requires our grasp of its input token space. If this space differs from our assumptions, our comprehension of and conclusions about the LLM will likely be flawed.