Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers
Chen, Shijie, Gutiérrez, Bernal Jiménez, Su, Yu
arXiv.org Artificial Intelligence
Information retrieval (IR) systems have played a vital role in modern digital life and have cemented their continued usefulness in this new era of generative AI via retrieval-augmented generation. With strong language processing capabilities and remarkable versatility, large language models (LLMs) have become popular choices for zero-shot re-ranking in IR systems. So far, LLM-based re-ranking methods rely on strong generative capabilities, which restricts their use to either specialized or powerful proprietary models. Given these restrictions, we ask: is autoregressive generation necessary and optimal for LLMs to perform re-ranking? We hypothesize that there are abundant signals relevant to re-ranking within LLMs that might not be used to their full potential via generation. To more directly leverage such signals, we propose in-context re-ranking (ICR), a novel method that leverages the change in attention pattern caused by the search query for accurate and efficient re-ranking. To mitigate the intrinsic biases in LLMs, we propose a calibration method using a content-free query. Due to the absence of generation, ICR requires only two ($O(1)$) forward passes to re-rank $N$ documents, making it substantially more efficient than generative re-ranking methods that require at least $O(N)$ forward passes. Our novel design also enables ICR to be applied to any LLM without specialized training while guaranteeing a well-formed ranking. Extensive experiments with two popular open-weight LLMs on standard single-hop and multi-hop information retrieval benchmarks show that ICR outperforms RankGPT while cutting the latency by more than 60% in practice. Through detailed analyses, we show that ICR's performance is especially strong on tasks that require more complex re-ranking signals. Our findings call for further exploration of novel ways of utilizing open-weight LLMs beyond text generation.
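The calibration idea in the abstract can be illustrated with a minimal sketch. It assumes the two forward passes have already been run and that each produced one attention value per context token: `query_attn` from the real query and `calib_attn` from the content-free query. How attention is aggregated across layers, heads, and query tokens is not stated in the abstract, so those inputs, the function name `rank_by_calibrated_attention`, and the simple subtract-and-sum scoring are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def rank_by_calibrated_attention(
    query_attn: Sequence[float],
    calib_attn: Sequence[float],
    doc_spans: Sequence[Tuple[int, int]],
) -> List[int]:
    """Return document indices ordered from most to least relevant.

    query_attn / calib_attn: hypothetical per-token attention mass that each
    context token receives from the real query and from a content-free query.
    doc_spans: (start, end) token ranges, end exclusive, one per document.
    """
    if len(query_attn) != len(calib_attn):
        raise ValueError("both passes must cover the same context tokens")
    scores = []
    for start, end in doc_spans:
        # Calibrated score: attention under the real query minus the attention
        # the same tokens receive under the content-free query.
        score = sum(
            q - c for q, c in zip(query_attn[start:end], calib_attn[start:end])
        )
        scores.append(score)
    # Sorting a fixed set of indices always yields a complete ranking.
    return sorted(range(len(doc_spans)), key=lambda i: scores[i], reverse=True)


# Toy numbers, not real model output: three documents of two tokens each.
query_attn = [0.30, 0.20, 0.05, 0.05, 0.10, 0.30]
calib_attn = [0.28, 0.20, 0.06, 0.07, 0.02, 0.08]
doc_spans = [(0, 2), (2, 4), (4, 6)]
ranking = rank_by_calibrated_attention(query_attn, calib_attn, doc_spans)
print(ranking)
```

In this toy input the first document receives the most raw attention, but almost all of it is also present under the content-free query, so after calibration the third document ranks first. Because the ranking is a sort over document indices rather than generated text, it cannot omit or repeat a document.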
Oct-3-2024