What is Wrong with Perplexity for Long-context Language Modeling?
