AITopics | rankgpt

Collaborating Authors

rankgpt

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation

Abdallah, Abdelrahman, Mozafari, Jamshid, Piryani, Bhawna, Jatowt, Adam

arXiv.org Artificial IntelligenceAug-26-2025

Large Language Models (LLMs) have transformed listwise document reranking by enabling global reasoning over candidate sets, yet single models often struggle to balance fine-grained relevance scoring with holistic cross-document analysis. We propose \textbf{De}ep\textbf{A}gent\textbf{R}ank (\textbf{\DeAR}), an open-source framework that decouples these tasks through a dual-stage approach, achieving superior accuracy and interpretability. In \emph{Stage 1}, we distill token-level relevance signals from a frozen 13B LLaMA teacher into a compact \{3, 8\}B student model using a hybrid of cross-entropy, RankNet, and KL divergence losses, ensuring robust pointwise scoring. In \emph{Stage 2}, we attach a second LoRA adapter and fine-tune on 20K GPT-4o-generated chain-of-thought permutations, enabling listwise reasoning with natural-language justifications. Evaluated on TREC-DL19/20, eight BEIR datasets, and NovelEval-2306, \DeAR surpasses open-source baselines by +5.1 nDCG@5 on DL20 and achieves 90.97 nDCG@10 on NovelEval, outperforming GPT-4 by +3.09. Without fine-tuning on Wikipedia, DeAR also excels in open-domain QA, achieving 54.29 Top-1 accuracy on Natural Questions, surpassing baselines like MonoT5, UPR, and RankGPT. Ablations confirm that dual-loss distillation ensures stable calibration, making \DeAR a highly effective and interpretable solution for modern reranking systems.\footnote{Dataset and code available at https://github.com/DataScienceUIBK/DeAR-Reranking.}.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2508.16998

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

LLM4Ranking: An Easy-to-use Framework of Utilizing Large Language Models for Document Reranking

Liu, Qi, Duan, Haozhe, Chen, Yiqun, Lu, Quanfeng, Sun, Weiwei, Mao, Jiaxin

arXiv.org Artificial IntelligenceApr-11-2025

Utilizing large language models (LLMs) for document reranking has been a popular and promising research direction in recent years, many studies are dedicated to improving the performance and efficiency of using LLMs for reranking. Besides, it can also be applied in many real-world applications, such as search engines or retrieval-augmented generation. In response to the growing demand for research and application in practice, we introduce a unified framework, \textbf{LLM4Ranking}, which enables users to adopt different ranking methods using open-source or closed-source API-based LLMs. Our framework provides a simple and extensible interface for document reranking with LLMs, as well as easy-to-use evaluation and fine-tuning scripts for this task. We conducted experiments based on this framework and evaluated various models and methods on several widely used datasets, providing reproducibility results on utilizing LLMs for document reranking. Our code is publicly available at https://github.com/liuqi6777/llm4ranking.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2504.07439

Country: Asia (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning

Zhuang, Shengyao, Ma, Xueguang, Koopman, Bevan, Lin, Jimmy, Zuccon, Guido

arXiv.org Artificial IntelligenceMar-7-2025

In this paper, we introduce Rank-R1, a novel LLM-based reranker that performs reasoning over both the user query and candidate documents before performing the ranking task. Existing document reranking methods based on large language models (LLMs) typically rely on prompting or fine-tuning LLMs to order or label candidate documents according to their relevance to a query. For Rank-R1, we use a reinforcement learning algorithm along with only a small set of relevance labels (without any reasoning supervision) to enhance the reasoning ability of LLM-based rerankers. Our hypothesis is that adding reasoning capabilities to the rerankers can improve their relevance assessement and ranking capabilities. Our experiments on the TREC DL and BRIGHT datasets show that Rank-R1 is highly effective, especially for complex queries. In particular, we find that Rank-R1 achieves effectiveness on in-domain datasets at par with that of supervised fine-tuning methods, but utilizing only 18\% of the training data used by the fine-tuning methods. We also find that the model largely outperforms zero-shot and supervised fine-tuning when applied to out-of-domain datasets featuring complex queries, especially when a 14B-size model is used. Finally, we qualitatively observe that Rank-R1's reasoning process improves the explainability of the ranking results, opening new opportunities for search engine results presentation and fruition.

rank-r1, reasoning process, reranker, (16 more...)

arXiv.org Artificial Intelligence

2503.06034

Country:

North America > United States > New York > New York County > New York City (0.05)
Asia > Singapore (0.04)
Oceania > Australia > Queensland (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Inference Scaling for Bridging Retrieval and Augmented Generation

Lee, Youngwon, Hwang, Seung-won, Campos, Daniel, Graliński, Filip, Yao, Zhewei, He, Yuxiong

arXiv.org Artificial IntelligenceDec-14-2024

Retrieval-augmented generation (RAG) has emerged as a popular approach to steering the output of a large language model (LLM) by incorporating retrieved contexts as inputs. However, existing work observed the generator bias, such that improving the retrieval results may negatively affect the outcome. In this work, we show such bias can be mitigated, from inference scaling, aggregating inference calls from the permuted order of retrieved contexts. The proposed Mixture-of-Intervention (MOI) explicitly models the debiased utility of each passage with multiple forward passes to construct a new ranking. We also show that MOI can leverage the retriever's prior knowledge to reduce the computational cost by minimizing the number of permutations considered and lowering the cost per LLM call. We showcase the effectiveness of MOI on diverse RAG tasks, improving ROUGE-L on MS MARCO and EM on HotpotQA benchmarks by ~7 points.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2412.10684

Country:

Europe > Austria > Vienna (0.14)
Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(9 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers

Chen, Shijie, Gutiérrez, Bernal Jiménez, Su, Yu

arXiv.org Artificial IntelligenceOct-3-2024

Information retrieval (IR) systems have played a vital role in modern digital life and have cemented their continued usefulness in this new era of generative AI via retrieval-augmented generation. With strong language processing capabilities and remarkable versatility, large language models (LLMs) have become popular choices for zero-shot re-ranking in IR systems. So far, LLM-based re-ranking methods rely on strong generative capabilities, which restricts their use to either specialized or powerful proprietary models. Given these restrictions, we ask: is autoregressive generation necessary and optimal for LLMs to perform re-ranking? We hypothesize that there are abundant signals relevant to re-ranking within LLMs that might not be used to their full potential via generation. To more directly leverage such signals, we propose in-context re-ranking (ICR), a novel method that leverages the change in attention pattern caused by the search query for accurate and efficient re-ranking. To mitigate the intrinsic biases in LLMs, we propose a calibration method using a content-free query. Due to the absence of generation, ICR only requires two ($O(1)$) forward passes to re-rank $N$ documents, making it substantially more efficient than generative re-ranking methods that require at least $O(N)$ forward passes. Our novel design also enables ICR to be applied to any LLM without specialized training while guaranteeing a well-formed ranking. Extensive experiments with two popular open-weight LLMs on standard single-hop and multi-hop information retrieval benchmarks show that ICR outperforms RankGPT while cutting the latency by more than 60% in practice. Through detailed analyses, we show that ICR's performance is specially strong on tasks that require more complex re-ranking signals. Our findings call for further exploration on novel ways of utilizing open-weight LLMs beyond text generation.

icr, query, rankgpt, (12 more...)

arXiv.org Artificial Intelligence

2410.02642

Country:

Europe > Greece (0.28)
Europe > Germany (0.14)
Europe > Ukraine (0.04)
(26 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Sports > Basketball (1.00)
Leisure & Entertainment > Sports > Baseball (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking

Jin, Can, Peng, Hongwu, Zhao, Shiyu, Wang, Zhenting, Xu, Wujiang, Han, Ligong, Zhao, Jiahui, Zhong, Kai, Rajasekaran, Sanguthevar, Metaxas, Dimitris N.

arXiv.org Artificial IntelligenceJun-20-2024

Large Language Models (LLMs) have significantly enhanced Information Retrieval (IR) across various modules, such as reranking. Despite impressive performance, current zero-shot relevance ranking with LLMs heavily relies on human prompt engineering. Existing automatic prompt engineering algorithms primarily focus on language modeling and classification tasks, leaving the domain of IR, particularly reranking, underexplored. Directly applying current prompt engineering algorithms to relevance ranking is challenging due to the integration of query and long passage pairs in the input, where the ranking complexity surpasses classification tasks. To reduce human effort and unlock the potential of prompt optimization in reranking, we introduce a novel automatic prompt engineering algorithm named APEER. APEER iteratively generates refined prompts through feedback and preference optimization. Extensive experiments with four LLMs and ten datasets demonstrate the substantial performance improvement of APEER over existing state-of-the-art (SoTA) manual prompts. Furthermore, we find that the prompts generated by APEER exhibit better transferability across diverse tasks and LLMs. Code is available at https://github.com/jincan333/APEER.

apeer, query, relevance, (16 more...)

arXiv.org Artificial Intelligence

2406.14449

Country:

North America > United States > Connecticut (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

TourRank: Utilizing Large Language Models for Documents Ranking with a Tournament-Inspired Strategy

Chen, Yiqun, Liu, Qi, Zhang, Yi, Sun, Weiwei, Shi, Daiting, Mao, Jiaxin, Yin, Dawei

arXiv.org Artificial IntelligenceJun-17-2024

Large Language Models (LLMs) are increasingly employed in zero-shot documents ranking, yielding commendable results. However, several significant challenges still persist in LLMs for ranking: (1) LLMs are constrained by limited input length, precluding them from processing a large number of documents simultaneously; (2) The output document sequence is influenced by the input order of documents, resulting in inconsistent ranking outcomes; (3) Achieving a balance between cost and ranking performance is quite challenging. To tackle these issues, we introduce a novel documents ranking method called TourRank, which is inspired by the tournament mechanism. This approach alleviates the impact of LLM's limited input length through intelligent grouping, while the tournament-like points system ensures robust ranking, mitigating the influence of the document input sequence. We test TourRank with different LLMs on the TREC DL datasets and the BEIR benchmark. Experimental results show that TourRank achieves state-of-the-art performance at a reasonable cost.

rankgpt, tournament, tourrank, (13 more...)

arXiv.org Artificial Intelligence

2406.11678

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

LLatrieval: LLM-Verified Retrieval for Verifiable Generation

Li, Xiaonan, Zhu, Changtai, Li, Linyang, Yin, Zhangyue, Sun, Tianxiang, Qiu, Xipeng

arXiv.org Artificial IntelligenceMar-27-2024

Verifiable generation aims to let the large language model (LLM) generate text with supporting documents, which enables the user to flexibly verify the answer and makes the LLM's output more reliable. Retrieval plays a crucial role in verifiable generation. Specifically, the retrieved documents not only supplement knowledge to help the LLM generate correct answers, but also serve as supporting evidence for the user to verify the LLM's output. However, the widely used retrievers become the bottleneck of the entire pipeline and limit the overall performance. Their capabilities are usually inferior to LLMs since they often have much fewer parameters than the large language model and have not been demonstrated to scale well to the size of LLMs. If the retriever does not correctly find the supporting documents, the LLM can not generate the correct and verifiable answer, which overshadows the LLM's remarkable abilities. To address these limitations, we propose \LLatrieval (Large Language Model Verified Retrieval), where the LLM updates the retrieval result until it verifies that the retrieved documents can sufficiently support answering the question. Thus, the LLM can iteratively provide feedback to retrieval and facilitate the retrieval result to fully support verifiable generation. Experiments show that LLatrieval significantly outperforms extensive baselines and achieves state-of-the-art results.

llatrieval, llm, retriever, (15 more...)

arXiv.org Artificial Intelligence

2311.07838

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(7 more...)

Genre: Research Report (0.64)

Industry: Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback