Geometry of Semantics in Next-Token Prediction: How Optimization Implicitly Organizes Linguistic Representations

Open in new window