Probing Geometry of Next Token Prediction Using Cumulant Expansion of the Softmax Entropy
