SEAL: Scaling to Emphasize Attention for Long-Context Retrieval
Lee, Changhun, Jin, Jun-gyu, Cho, Younghyun, Park, Eunhyeok
arXiv.org Artificial Intelligence
In this work, we introduce Scaling to Emphasize Attention for Long-context retrieval (SEAL), a novel approach that enhances the retrieval performance of large language models (LLMs) over extended contexts. Previous studies have shown that each attention head in an LLM has a unique functionality and that the heads collectively contribute to the overall behavior of the model. Similarly, we observe that specific heads are closely tied to long-context retrieval, showing positive or negative correlation with retrieval scores. Building on this insight, we propose a learning-based mechanism that uses zero-shot generated data to emphasize these heads, improving the model's performance in long-context retrieval tasks. By applying SEAL, we achieve significant improvements in in-domain retrieval performance, including document QA tasks from LongBench, and considerable improvements in out-of-domain cases. Additionally, when combined with existing training-free context extension techniques, SEAL extends the context limits of LLMs while maintaining highly reliable outputs, opening new avenues for research in this field.

Large Language Models (LLMs) (Brown et al., 2020; Radford et al., 2019; Touvron et al., 2023) are capable of rapidly generating high-quality answers to a wide range of questions by leveraging the diverse knowledge embedded in their vast number of parameters. However, in-depth analyses have revealed a common issue known as hallucination (Shuster et al., 2021; Lin et al., 2021; Ji et al., 2023), where the models confidently produce inaccurate answers.

Figure 1: Overview of the proposed SEAL and corresponding retrieval score improvements for LongChat-7B-v1.5-32K.

These approaches have significantly improved the reliability of LLMs by enabling them to reference existing information during generation. However, this trend has also highlighted a key limitation of LLMs: the constraint of their context window length.
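The abstract describes emphasizing retrieval-related attention heads through scaling but does not give the formulation. As a minimal sketch, assuming the emphasis is one scalar per head multiplied into that head's output before the heads are concatenated, the idea might look like the following. The function name `scale_heads` and all numeric values are hypothetical and chosen only for illustration; in the described method the scales would be learned from zero-shot generated data rather than set by hand.

```python
from typing import List


def scale_heads(head_outputs: List[List[float]], scales: List[float]) -> List[float]:
    """Multiply each head's output vector by its scalar, then concatenate the heads."""
    if len(head_outputs) != len(scales):
        raise ValueError("need exactly one scale per head")
    merged: List[float] = []
    for output, scale in zip(head_outputs, scales):
        merged.extend(scale * value for value in output)
    return merged


# Hypothetical values: three heads, each producing a 2-dimensional output.
head_outputs = [[1.0, 2.0], [0.5, -0.5], [3.0, 0.0]]
# Assumption: a scale above 1 emphasizes a head, below 1 suppresses it,
# and exactly 1.0 leaves the head unchanged.
scales = [1.5, 0.5, 1.0]
scaled = scale_heads(head_outputs, scales)
```

Under this assumption, a head that correlates positively with retrieval scores would receive a scale above 1 and a negatively correlated head a scale below 1, while the rest of the model stays untouched.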
Jan-25-2025
- Country:
  - North America > United States
    - California > San Francisco County > San Francisco (0.04)
  - Europe
    - Romania > Sud - Muntenia Development Region
      - Giurgiu County > Giurgiu (0.04)
    - Italy > Calabria
      - Catanzaro Province > Catanzaro (0.04)
  - Asia > South Korea
    - Gyeongsangbuk-do > Pohang (0.04)
- Genre:
  - Research Report
    - New Finding (0.67)
    - Promising Solution (0.48)
- Technology: