An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding
Neural Information Processing Systems
Transformer-based Large Language Models (LLMs) are typically pre-trained with a fixed context window size, e.g., 4K tokens in Touvron et al. [2023a].