AITopics | Information Retrieval

Information Retrieval

Our accustomed systems for retrieving particular pieces of information no longer meet the needs of many people. Computerized databases have made it easier to search traditional indexes of print publications, but the process still usually requires time-consuming serial searching of one database after another, followed by separate methods for finding internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where the information being sought is spread across widely different formats, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.

LLM-based IR-system for Bank Supervisors

Aarab, Ilias

arXiv.org Artificial Intelligence, Aug-6-2025

Bank supervisors face the complex task of ensuring that new measures are consistently aligned with historical precedents. To address this challenge, we introduce a novel Information Retrieval (IR) System tailored to assist supervisors in drafting both consistent and effective measures. This system ingests findings from on-site investigations. It then retrieves the most relevant historical findings and their associated measures from a comprehensive database, providing a solid basis for supervisors to write well-informed measures for new findings. Utilizing a blend of lexical, semantic, and Capital Requirements Regulation (CRR) fuzzy set matching techniques, the IR system ensures the retrieval of findings that closely align with current cases. The performance of this system, particularly in scenarios with partially labeled data, is validated through a Monte Carlo methodology, showcasing its robustness and accuracy. Enhanced by a Transformer-based Denoising AutoEncoder for fine-tuning, the final model achieves a Mean Average Precision (MAP@100) of 0.83 and a Mean Reciprocal Rank (MRR@100) of 0.92. These scores surpass those of both standalone lexical models such as BM25 and semantic BERT-like models.
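The two reported metrics can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the toy finding IDs, and the choice to normalise average precision by min(number of relevant items, k) are assumptions made for illustration, and other normalisations are in common use.

```python
from typing import Callable, Dict, List, Set


def reciprocal_rank(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Return 1/rank of the first relevant item within the top k, else 0.0."""
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Average the precision measured at each rank that holds a relevant item.

    Assumption: normalised by min(len(relevant), k).
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / min(len(relevant), k)


def mean_metric(
    metric: Callable[[List[str], Set[str], int], float],
    runs: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int,
) -> float:
    """Average a per-query metric over every query in `runs`."""
    scores = [metric(ranked, qrels.get(qid, set()), k) for qid, ranked in runs.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: ranked finding IDs per query and their relevant sets.
runs = {"q1": ["f3", "f1", "f7"], "q2": ["f2", "f9", "f4"]}
qrels = {"q1": {"f1", "f7"}, "q2": {"f2"}}

mrr = mean_metric(reciprocal_rank, runs, qrels, k=100)
map_score = mean_metric(average_precision, runs, qrels, k=100)
print(f"MRR@100={mrr:.3f}  MAP@100={map_score:.3f}")
```

MRR rewards placing one relevant finding near the top, while MAP also accounts for where the remaining relevant findings fall, which is why the two scores can differ noticeably for the same system.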

information retrieval, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.knosys.2024.112914

2508.02945

Country: Europe > Germany (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Banking & Finance (1.00)
Law > Statutes (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

A Scoping Review of Natural Language Processing in Addressing Medically Inaccurate Information: Errors, Misinformation, and Hallucination

Sun, Zhaoyi, Yim, Wen-Wai, Uzuner, Ozlem, Xia, Fei, Yetisgen, Meliha

arXiv.org Artificial Intelligence, Aug-5-2025

Objective: This review aims to explore the potential and challenges of using Natural Language Processing (NLP) to detect, correct, and mitigate medically inaccurate information, including errors, misinformation, and hallucination. By unifying these concepts, the review emphasizes their shared methodological foundations and their distinct implications for healthcare. Our goal is to advance patient safety, improve public health communication, and support the development of more reliable and transparent NLP applications in healthcare. Methods: A scoping review was conducted following PRISMA guidelines, analyzing studies from 2020 to 2024 across five databases. Studies were selected based on their use of NLP to address medically inaccurate information and were categorized by topic, tasks, document types, datasets, models, and evaluation metrics. Results: NLP has shown potential in addressing medically inaccurate information on the following tasks: (1) error detection, (2) error correction, (3) misinformation detection, (4) misinformation correction, (5) hallucination detection, and (6) hallucination mitigation. However, challenges remain with data privacy, context dependency, and evaluation standards. Conclusion: This review highlights the advancements in applying NLP to tackle medically inaccurate information while underscoring the need to address persistent challenges. Future efforts should focus on developing real-world datasets, refining contextual methods, and improving hallucination management to ensure reliable and transparent healthcare applications.

information retrieval, large language model, machine learning, (25 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.jbi.2025.104866

2505.00008

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Media > News (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Vaccines (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(8 more...)

HeQ: a Large and Diverse Hebrew Reading Comprehension Benchmark

Cohen, Amir DN, Merhav, Hilla, Goldberg, Yoav, Tsarfaty, Reut

arXiv.org Artificial Intelligence, Aug-5-2025

Current benchmarks for Hebrew Natural Language Processing (NLP) focus mainly on morpho-syntactic tasks, neglecting the semantic dimension of language understanding. To bridge this gap, we set out to deliver a Hebrew Machine Reading Comprehension (MRC) dataset, where MRC is to be realized as extractive Question Answering. The morphologically rich nature of Hebrew poses a challenge to this endeavor: the indeterminacy and non-transparency of span boundaries in morphologically complex forms lead to annotation inconsistencies, disagreements, and flaws in standard evaluation metrics. To remedy this, we devise a novel set of guidelines, a controlled crowdsourcing protocol, and revised evaluation metrics that are suitable for the morphologically rich nature of the language. Our resulting benchmark, HeQ (Hebrew QA), features 30,147 diverse question-answer pairs derived from both Hebrew Wikipedia articles and Israeli tech news. Our empirical investigation reveals that standard evaluation metrics such as F1 scores and Exact Match (EM) are not appropriate for Hebrew (and other MRLs), and we propose a relevant enhancement. In addition, our experiments show low correlation between models' performance on morpho-syntactic tasks and on MRC, which suggests that models designed for the former might underperform on semantics-heavy tasks. The development and exploration of HeQ illustrate some of the challenges MRLs pose in natural language understanding (NLU), fostering progression towards more and better NLU models for Hebrew and other MRLs.
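To make the concern about standard metrics concrete, here is a minimal Python sketch of the conventional Exact Match and whitespace-token F1 used in extractive question answering. It is not the revised metric the authors propose. The function names and examples are illustrative assumptions, and the usual normalisation steps (case folding, punctuation stripping) are omitted for brevity.

```python
from collections import Counter


def exact_match(prediction: str, gold: str) -> bool:
    """True only when the two answer strings are identical after trimming."""
    return prediction.strip() == gold.strip()


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of precision and recall over whitespace-delimited tokens."""
    pred_tokens = prediction.split()
    gold_tokens = gold.split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# A span that includes one extra word still earns partial credit.
english_f1 = token_f1("the red house", "red house")

# In Hebrew, prepositions and the definite article attach to the noun as
# prefixes, so "in the house" and "the house" are two different single tokens.
hebrew_f1 = token_f1("בבית", "הבית")

print(english_f1, hebrew_f1)
```

Because the Hebrew forms share no whitespace token, the token-level score drops to zero even though both answers point at the same noun, which is the kind of span-boundary sensitivity the abstract describes for morphologically rich languages.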

large language model, machine learning, question answering, (22 more...)

arXiv.org Artificial Intelligence

2508.01812

Country:

Asia > Middle East > Israel (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Industry: Education > Assessment & Standards > Student Performance (0.61)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (0.48)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.48)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
(3 more...)

Segment First, Retrieve Better: Realistic Legal Search via Rhetorical Role-Based Queries

Nigam, Shubham Kumar, Dubey, Tanmay, Shallum, Noel, Bhattacharya, Arnab

arXiv.org Artificial Intelligence, Aug-4-2025

Legal precedent retrieval is a cornerstone of the common law system, governed by the principle of stare decisis, which demands consistency in judicial decisions. However, the growing complexity and volume of legal documents challenge traditional retrieval methods. TraceRetriever mirrors real-world legal search by operating with limited case information, extracting only rhetorically significant segments instead of requiring complete documents. Our pipeline integrates BM25, Vector Database, and Cross-Encoder models, combining initial results through Reciprocal Rank Fusion before final re-ranking. Rhetorical annotations are generated using a Hierarchical BiLSTM CRF classifier trained on Indian judgments. Evaluated on the IL-PCR and COLIEE 2025 datasets, TraceRetriever addresses the challenges of growing document volume while aligning with practical search constraints, offering a reliable and scalable foundation for precedent retrieval that enhances legal research when only partial case knowledge is available.
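The fusion step named in the abstract can be sketched as follows. This is a generic illustration of Reciprocal Rank Fusion, not the TraceRetriever implementation: the function name, the case IDs, and the smoothing constant k=60 (a commonly used default) are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists by summing 1 / (k + rank) for each document."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a lexical (BM25) and a dense (vector) retriever.
bm25_ranking = ["case_a", "case_b", "case_c"]
dense_ranking = ["case_b", "case_d", "case_a"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
```

Because only ranks are used, the method needs no calibration between BM25 scores and embedding similarities. In a pipeline like the one described, the fused list would then be passed to a cross-encoder for final re-ranking.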

information retrieval, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2508.00679

Country: Asia > India (0.47)

Genre: Research Report > New Finding (0.93)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.73)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

AutoIndexer: A Reinforcement Learning-Enhanced Index Advisor Towards Scaling Workloads

Wang, Taiyi, Yoneki, Eiko

arXiv.org Artificial Intelligence, Aug-1-2025

Efficiently selecting indexes is fundamental to database performance optimization, particularly for systems handling large-scale analytical workloads. While deep reinforcement learning (DRL) has shown promise in automating index selection through its ability to learn from experience, few works address how these RL-based index advisors can adapt to scaling workloads due to exponentially growing action spaces and heavy trial and error. To address these challenges, we introduce AutoIndexer, a framework that combines workload compression, query optimization, and specialized RL models to scale index selection effectively. By operating on compressed workloads, AutoIndexer substantially lowers search complexity without sacrificing much index quality. Extensive evaluations show that it reduces end-to-end query execution time by up to 95% versus non-indexed baselines. On average, it outperforms state-of-the-art RL-based index advisors by approximately 20% in workload cost savings while cutting tuning time by over 50%. These results affirm AutoIndexer's practicality for large and diverse workloads.

machine learning, reinforcement learning, workload, (16 more...)

arXiv.org Artificial Intelligence

2507.23084

Country: Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

RAVine: Reality-Aligned Evaluation for Agentic Search

Xu, Yilong, Long, Xiang, Zheng, Zhi, Gao, Jinhua

arXiv.org Artificial Intelligence, Aug-1-2025

Agentic search, as a more autonomous and adaptive paradigm of retrieval augmentation, is driving the evolution of intelligent search systems. However, existing evaluation frameworks fail to align well with the goals of agentic search. First, the complex queries commonly used in current benchmarks often deviate from realistic user search scenarios. Second, prior approaches tend to introduce noise when extracting ground truth for end-to-end evaluations, leading to distorted assessments at a fine-grained level. Third, most current frameworks focus solely on the quality of final answers, neglecting the evaluation of the iterative process inherent to agentic search. To address these limitations, we propose RAVine -- a Reality-Aligned eValuation framework for agentic LLMs with search. RAVine targets multi-point queries and long-form answers that better reflect user intents, and introduces an attributable ground truth construction strategy to enhance the accuracy of fine-grained evaluation. Moreover, RAVine examines the model's interaction with search tools throughout the iterative process, and accounts for efficiency factors. We benchmark a series of models using RAVine and derive several insights, which we hope will contribute to advancing the development of agentic search systems. The code and datasets are available at https://github.com/SwordFaith/RAVine.

information retrieval, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2507.16725

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre:

Research Report (1.00)
Workflow (0.67)

Industry:

Law (1.00)
Health & Medicine (1.00)
Banking & Finance > Economy (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.92)

Enhanced Arabic Text Retrieval with Attentive Relevance Scoring

Bekhouche, Salah Eddine, Benlamoudi, Azeddine, Bounab, Yazid, Dornaika, Fadi, Hadid, Abdenour

arXiv.org Artificial Intelligence, Aug-1-2025

Arabic poses a particular challenge for natural language processing (NLP) and information retrieval (IR) due to its complex morphology, optional diacritics and the coexistence of Modern Standard Arabic (MSA) and various dialects. Despite the growing global significance of Arabic, it is still underrepresented in NLP research and benchmark resources. In this paper, we present an enhanced Dense Passage Retrieval (DPR) framework developed specifically for Arabic. At the core of our approach is a novel Attentive Relevance Scoring (ARS) that replaces standard interaction mechanisms with an adaptive scoring function that more effectively models the semantic relevance between questions and passages. Our method integrates pre-trained Arabic language models and architectural refinements to improve retrieval performance and significantly increase ranking accuracy when answering Arabic questions.

artificial intelligence, information retrieval, natural language, (14 more...)

arXiv.org Artificial Intelligence

2507.23404

Country:

Europe > Spain (0.28)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.68)

A Fisher's exact test justification of the TF-IDF term-weighting scheme

Sheridan, Paul, Ahmed, Zeyad, Farooque, Aitazaz A.

arXiv.org Artificial Intelligence, Jul-31-2025

Term frequency-inverse document frequency, or TF-IDF for short, is arguably the most celebrated mathematical expression in the history of information retrieval. Conceived as a simple heuristic quantifying the extent to which a given term's occurrences are concentrated in any one given document out of many, TF-IDF and its many variants are routinely used as term-weighting schemes in diverse text analysis applications. There is a growing body of scholarship dedicated to placing TF-IDF on a sound theoretical foundation. Building on that tradition, this paper justifies the use of TF-IDF to the statistics community by demonstrating how the famed expression can be understood from a significance testing perspective. We show that the common TF-IDF variant TF-ICF is, under mild regularity conditions, closely related to the negative logarithm of the $p$-value from a one-tailed version of Fisher's exact test of statistical significance. As a corollary, we establish a connection between TF-IDF and the said negative log-transformed $p$-value under certain idealized assumptions. We further demonstrate, as a limiting case, that this same quantity converges to TF-IDF in the limit of an infinitely large document collection. The Fisher's exact test justification of TF-IDF equips the working statistician with a ready explanation of the term-weighting scheme's long-established effectiveness.
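For readers unfamiliar with the heuristic itself, the following minimal Python sketch computes one textbook TF-IDF variant: raw term count multiplied by the natural logarithm of N over document frequency. It does not reproduce the paper's Fisher's exact test derivation or its TF-ICF variant. The function name and the toy documents are assumptions for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each term by (count in document) * log(N / document frequency)."""
    n_docs = len(docs)
    # Document frequency: the number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {term: count * math.log(n_docs / doc_freq[term]) for term, count in counts.items()}
        )
    return weights


# Hypothetical toy collection of three tokenised documents.
docs = [
    "fisher exact test".split(),
    "term weighting test".split(),
    "term frequency term weighting".split(),
]
weights = tf_idf(docs)

for doc_weights in weights:
    print({term: round(w, 3) for term, w in doc_weights.items()})
```

A term that appears in only one document ("fisher") receives a larger weight than one spread across several ("test"), which is the concentration effect the abstract describes.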

information retrieval, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

doi: 10.1080/00031305.2025.2539241

2507.15742

Country: North America > United States (0.68)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.67)

InsurTech innovation using natural language processing

Dong, Panyi, Quan, Zhiyu

arXiv.org Machine Learning, Jul-30-2025

InsurTech refers to the use of state-of-the-art technology, including both emerging hardware and software, to address inefficiencies across the insurance value chain and to explore new opportunities to reshape traditional business operations. InsurTech encompasses a broad spectrum of technology-driven innovations, including, but not limited to, telematics, usage-based insurance, and the integration of Internet of Things (IoT) sensors. In this study, we focus on a specific class of InsurTech, an InsurTech data vendor, that provides insurance companies with next-generation data solutions. We leverage new and diverse external data sources, such as social media data and online content, to enrich the internal database, thereby empowering actuarial analytics and gaining more accurate insights into risk profiles and policyholder behavior. Specifically, by integrating alternative data sources beyond traditional information, insurance companies can uncover previously unrecognized risk factors, reduce bias in existing features, and identify more accurate risk exposures based on the operational characteristics of the insured entities.

information retrieval, large language model, machine learning, (24 more...)

arXiv.org Machine Learning

2507.21112

Country:

North America > United States > California (0.05)
North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > New Jersey (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Banking & Finance > Insurance (1.00)
Government > Regional Government > North America Government > United States Government (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(7 more...)

Privacy Artifact ConnecTor (PACT): Embedding Enterprise Artifacts for Compliance AI Agents

Fang, Chenhao, Peng, Yanqing, Rao, Rajeev, Sarmiento, Matt, Summer, Wendy, Pudota, Arya, Goncalves, Alex, Mola, Jordi, Robert, Hervé

arXiv.org Artificial Intelligence, Jul-30-2025

Enterprise environments contain a heterogeneous, rapidly growing collection of internal artifacts related to code, data, and many different tools. Critical information for assessing privacy risk and ensuring regulatory compliance is often embedded across these varied resources, each with their own arcane discovery and extraction techniques. Therefore, large-scale privacy compliance in adherence to governmental regulations requires systems to discern the interconnected nature of diverse artifacts in a common, shared universe. We present Privacy Artifact ConnecTor (PACT), an embeddings-driven graph that links millions of artifacts spanning multiple artifact types generated by a variety of teams and projects. Powered by the state-of-the-art DRAGON embedding model, PACT uses a contrastive learning objective with light fine-tuning to link artifacts via their textual components such as raw metadata, ownership specifics, and compliance context. Experimental results show that PACT's fine-tuned model improves recall@1 from 18% to 53%, the query match rate from 9.6% to 69.7% when paired with a baseline AI agent, and the hitrate@1 from 25.7% to 44.9% for candidate selection in a standard recommender system.

information retrieval, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2507.21142

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Law > Statutes (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.75)
