An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding
Neural Information Processing Systems
Transformer-based Large Language Models (LLMs) are typically pre-trained with a fixed context window size, e.g., 4K tokens in Touvron et al. [2023a].
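As a minimal illustration (not the paper's method), a fixed pre-training context window means the model only ever observes position indices below the window size; `WINDOW_SIZE` and `position_ids` here are hypothetical names for exposition:

```python
# Illustrative sketch: a model pre-trained with a fixed context window
# only ever sees position indices in [0, WINDOW_SIZE).
WINDOW_SIZE = 4096  # e.g., a 4K-token context window as in Touvron et al. [2023a]

def position_ids(seq_len: int, window_size: int = WINDOW_SIZE) -> list[int]:
    """Position indices assigned to a sequence under a fixed window.

    Sequences longer than the window are truncated, so no index
    beyond window_size - 1 is observed during pre-training.
    """
    return list(range(min(seq_len, window_size)))

ids = position_ids(5000)
# Even a 5000-token input yields indices capped at WINDOW_SIZE - 1 (4095).
```

Extending the context past this window is what makes positions beyond `WINDOW_SIZE - 1` out-of-distribution at inference time.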