LIFT: Improving Long Context Understanding Through Long Input Fine-Tuning
Yansheng Mao, Jiaqi Li, Fanxu Meng, Jing Xiong, Zilong Zheng, Muhan Zhang
Long context understanding remains challenging for large language models due to their limited context windows. This paper introduces Long Input Fine-Tuning (LIFT), a novel framework for long-context modeling that enhances LLM performance on long-context tasks by adapting model parameters to the context at test time. LIFT enables efficient processing of lengthy inputs without the computational burden of offline long-context adaptation, and can improve the long-context capabilities of arbitrary short-context models. The framework is further enhanced by integrating in-context learning and pre-LIFT supervised fine-tuning. The combination of in-context learning and LIFT enables short-context models like Llama 3 to handle arbitrarily long contexts and consistently improves their performance on popular long-context benchmarks like LooGLE and LongBench. We also provide a comprehensive analysis of the strengths and limitations of LIFT on long context understanding, offering valuable directions for future research.

Large Language Models (LLMs), such as GPT-4 (Achiam et al., 2023), have revolutionized the field of natural language processing, driving breakthroughs in text generation and significant advancements in tasks like translation, summarization, and conversation. Lengthy sequences, which can span up to millions of tokens, are common in real-world applications including long books (Kočiský et al., 2018), high-resolution videos (Wu et al., 2024; Tapaswi et al., 2016), and audio signals (Yang et al., 2024). Extending the context window allows models to capture dependencies across larger text spans and improve coherence, understanding, and accuracy in tasks that require reasoning over extended inputs.
arXiv.org Artificial Intelligence
Dec-18-2024
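
The central idea described in the abstract, adapting a short-context model's parameters to the long input at test time before answering queries about it, can be illustrated with a minimal sketch. The sketch below assumes a Hugging Face causal LM trained with a plain causal language-modeling loss over overlapping chunks of the input; the model name, chunk size, overlap, learning rate, and epoch count are illustrative placeholders, not the objectives or hyperparameters used in the LIFT paper.

```python
# Minimal sketch of test-time long-input fine-tuning: split the long context into
# overlapping chunks that fit the model's native window, briefly adapt the model
# with a causal LM loss on those chunks, then query it about the context.
# All constants below are illustrative placeholders.
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Meta-Llama-3-8B-Instruct"  # any short-context causal LM
CHUNK_TOKENS = 2048      # must fit within the model's native context window
OVERLAP_TOKENS = 256     # overlap so dependencies spanning chunk borders are seen


def lift_adapt(model, tokenizer, long_context, epochs=1, lr=1e-5):
    """Adapt the model's parameters to one long input at test time."""
    ids = tokenizer(long_context, return_tensors="pt").input_ids[0]
    stride = CHUNK_TOKENS - OVERLAP_TOKENS
    chunks = [ids[i:i + CHUNK_TOKENS] for i in range(0, len(ids), stride)]

    model.train()
    optimizer = AdamW(model.parameters(), lr=lr)
    for _ in range(epochs):
        for chunk in chunks:
            if len(chunk) < 2:
                continue  # too short to provide next-token targets
            input_ids = chunk.unsqueeze(0).to(model.device)
            # Standard causal LM objective: predict every token of the chunk.
            loss = model(input_ids=input_ids, labels=input_ids).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
    model.eval()
    return model


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME, torch_dtype=torch.bfloat16
    ).to(device)

    long_document = open("long_document.txt").read()  # placeholder input file
    model = lift_adapt(model, tokenizer, long_document)

    # After adaptation, answer questions about the document (optionally also
    # placing a truncated copy of it in the prompt, mirroring the ICL + LIFT
    # combination discussed in the abstract).
    prompt = "Question: Who is the protagonist of the story?\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

A parameter-efficient variant (for example, updating only adapter weights instead of all parameters) would follow the same chunk-wise test-time loop while keeping the adaptation cheap enough to run per input.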