AITopics

2510.06218

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial Intelligence, Oct-7-2025

Beyond Outcome Reward: Decoupling Search and Answering Improves LLM Agents

Wang, Yiding, Wei, Zhepei, Zhu, Xinyu, Meng, Yu

Enabling large language models (LLMs) to utilize search tools offers a promising path to overcoming fundamental limitations such as knowledge cutoffs and hallucinations. Recent work has explored reinforcement learning (RL) for training search-augmented agents that interleave reasoning and retrieval before answering. These approaches usually rely on outcome-based rewards (e.g., exact match), implicitly assuming that optimizing for final answers will also yield effective intermediate search behaviors. Our analysis challenges this assumption: we uncover multiple systematic deficiencies in search that arise under outcome-only training and ultimately degrade final answer quality, including failure to invoke tools, invalid queries, and redundant searches. To address these shortcomings, we introduce DeSA (Decoupling Search-and-Answering), a simple two-stage training framework that explicitly separates search optimization from answer generation. In Stage 1, agents are trained to improve search effectiveness with retrieval recall-based rewards. In Stage 2, outcome rewards are employed to optimize final answer generation. Across seven QA benchmarks, DeSA-trained agents consistently improve search behaviors, delivering substantially higher search recall and answer accuracy than outcome-only baselines. Notably, DeSA outperforms single-stage training approaches that simultaneously optimize recall and outcome rewards, underscoring the necessity of explicitly decoupling the two objectives.
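The two reward signals named in this abstract can be illustrated with a minimal sketch. The abstract does not define the recall reward precisely, so the answer-string containment check below is an assumption, as are all function names and the answer normalization; this is not the authors' implementation.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match_reward(prediction: str, gold_answers: list[str]) -> float:
    """Outcome reward: 1.0 if the normalized prediction equals any gold answer."""
    pred = normalize(prediction)
    return 1.0 if any(pred == normalize(g) for g in gold_answers) else 0.0


def recall_reward(retrieved_docs: list[str], gold_answers: list[str]) -> float:
    """Assumed recall proxy: 1.0 if any retrieved document contains a gold answer."""
    docs = [normalize(d) for d in retrieved_docs]
    golds = [normalize(g) for g in gold_answers]
    golds = [g for g in golds if g]  # ignore answers that normalize to empty
    return 1.0 if any(g in d for g in golds for d in docs) else 0.0


def stage_reward(stage: int, retrieved_docs: list[str],
                 prediction: str, gold_answers: list[str]) -> float:
    """Stage 1 scores search only; Stage 2 scores the final answer only."""
    if stage == 1:
        return recall_reward(retrieved_docs, gold_answers)
    if stage == 2:
        return exact_match_reward(prediction, gold_answers)
    raise ValueError("stage must be 1 or 2")
```

Under these assumptions, a rollout that retrieves the right passage but answers incorrectly still earns the Stage 1 reward, which is the decoupling the abstract describes.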

large language model, machine learning, question answering, (21 more...)

2510.04695

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.88)

Arun, Abhinav, Harsh, Reetu Raj, Sarmah, Bhaskarjit, Pasquali, Stefano

FinReflectKG -- MultiHop: Financial QA Benchmark for Reasoning with Knowledge Graph Evidence

arXiv.org Artificial Intelligence, Oct-6-2025

Multi-hop reasoning over financial disclosures is often a retrieval problem before it becomes a reasoning or generation problem: relevant facts are dispersed across sections, filings, companies, and years, and LLMs often expend excessive tokens navigating noisy context. Without precise Knowledge Graph (KG)-guided selection of relevant context, even strong reasoning models either fail to answer or consume excessive tokens, whereas KG-linked evidence enables models to focus their reasoning on composing already retrieved facts. We present FinReflectKG - MultiHop, a benchmark built on FinReflectKG, a temporally indexed financial KG that links audited triples to source chunks from S&P 100 filings (2022-2024). Mining frequent 2-3 hop subgraph patterns across sectors (via GICS taxonomy), we generate financial analyst style questions with exact supporting evidence from the KG. A two-phase pipeline first creates QA pairs via pattern-specific prompts, followed by a multi-criteria quality control evaluation to ensure QA validity. We then evaluate three controlled retrieval scenarios: (S1) precise KG-linked paths; (S2) text-only page windows centered on relevant text spans; and (S3) relevant page windows with randomizations and distractors. Across both reasoning and non-reasoning models, KG-guided precise retrieval yields substantial gains on the FinReflectKG - MultiHop QA benchmark dataset, boosting correctness scores by approximately 24 percent while reducing token utilization by approximately 84.5 percent compared to the page window setting, which reflects the traditional vector retrieval paradigm. Spanning intra-document, inter-year, and cross-company scopes, our work underscores the pivotal role of knowledge graphs in efficiently connecting evidence for multi-hop financial QA. We also release a curated subset of the benchmark (555 QA Pairs) to catalyze further research.
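As a rough illustration of what a 2-3 hop path over KG triples looks like, the sketch below enumerates such paths from a starting entity. The triples, relation names, and function names are made up for illustration and are not taken from FinReflectKG; the benchmark's actual pattern mining is not described in enough detail here to reproduce.

```python
from collections import defaultdict

Triple = tuple[str, str, str]  # (head, relation, tail)


def build_index(triples: list[Triple]) -> dict[str, list[tuple[str, str]]]:
    """Map each head entity to its outgoing (relation, tail) edges."""
    index = defaultdict(list)
    for head, rel, tail in triples:
        index[head].append((rel, tail))
    return index


def find_paths(triples: list[Triple], start: str,
               min_hops: int = 2, max_hops: int = 3) -> list[list[Triple]]:
    """Enumerate simple paths of min_hops..max_hops edges starting at `start`."""
    index = build_index(triples)
    paths: list[list[Triple]] = []

    def walk(node: str, path: list[Triple], visited: set[str]) -> None:
        if len(path) >= min_hops:
            paths.append(list(path))
        if len(path) == max_hops:
            return
        for rel, tail in index.get(node, []):
            if tail in visited:  # keep paths simple (no revisits)
                continue
            path.append((node, rel, tail))
            walk(tail, path, visited | {tail})
            path.pop()

    walk(start, [], {start})
    return paths


# Hypothetical toy triples, for illustration only.
toy = [
    ("CompanyA", "discloses", "RiskX"),
    ("RiskX", "impacts", "SegmentY"),
    ("SegmentY", "reported_in", "FY2023"),
]
toy_paths = find_paths(toy, "CompanyA")
```

Each returned path is an ordered list of triples, which is the kind of compact evidence a KG-linked retrieval setting would hand to the model in place of whole page windows.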

large language model, natural language, question answering, (19 more...)

2510.02906

Genre: Research Report (1.00)

Industry:

Banking & Finance (1.00)
Law > Business Law (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.83)

arXiv.org Artificial Intelligence, Oct-6-2025

DualRAG: A Dual-Process Approach to Integrate Reasoning and Retrieval for Multi-Hop Question Answering

Cheng, Rong, Liu, Jinyi, Zheng, Yan, Ni, Fei, Du, Jiazhen, Mao, Hangyu, Zhang, Fuzheng, Wang, Bo, Hao, Jianye

Multi-Hop Question Answering (MHQA) tasks permeate real-world applications, posing challenges in orchestrating multi-step reasoning across diverse knowledge domains. While existing approaches have been improved with iterative retrieval, they still struggle to identify and organize dynamic knowledge. To address this, we propose DualRAG, a synergistic dual-process framework that seamlessly integrates reasoning and retrieval. DualRAG operates through two tightly coupled processes: Reasoning-augmented Querying (RaQ) and progressive Knowledge Aggregation (pKA). They work in concert: as RaQ navigates the reasoning path and generates targeted queries, pKA ensures that newly acquired knowledge is systematically integrated to support coherent reasoning. This creates a virtuous cycle of knowledge enrichment and reasoning refinement. Through targeted fine-tuning, DualRAG preserves its sophisticated reasoning and retrieval capabilities even in smaller-scale models, demonstrating its versatility and core advantages across different scales. Extensive experiments demonstrate that this dual-process approach substantially improves answer accuracy and coherence, approaching, and in some cases surpassing, the performance achieved with oracle knowledge access. These results establish DualRAG as a robust and efficient solution for complex multi-hop reasoning tasks.

knowledge management, large language model, machine learning, (23 more...)

2504.18243

Country:

Asia (1.00)
Europe (0.93)
North America > United States (0.28)
North America > Mexico (0.28)

Genre: Research Report > New Finding (0.93)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.93)

Technology:

Information Technology > Knowledge Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Neural Information Processing Systems, Oct-3-2025, 04:07:37 GMT

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

Patrick Lewis

However, the ability of large pre-trained language models to access and precisely manipulate knowledge is still limited, and hence on knowledge-intensive tasks, their performance lags behind task-specific architectures.

computational linguistic, machine learning, question answering, (19 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
South America > Peru (0.14)
North America > Canada (0.04)
(12 more...)

Genre: Research Report (0.30)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.82)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Neural Information Processing Systems, Oct-3-2025, 02:38:16 GMT

Export Reviews, Discussions, Author Feedback and Meta-Reviews

NIPS Neural Information Processing Systems, 8-11 December 2014, Montreal, Canada

Paper ID: 879

Title: A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

Current Reviews

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance.

The authors present a method for question answering about real-world scenes: given a real-world image and a question about objects in that image, their system answers the question. For the question-answering engine, the authors have generated a novel dataset with more than 12k question-answer pairs. The authors show improved performance when using the multi-world approach, but it did not fully convince me of its quality, since the accuracy (and WUPS) is pretty low either way. I would like to see more evidence and understanding of the importance and contribution of the multi-world approach.

machine learning, natural language, question answering, (17 more...)

Country: North America > Canada > Quebec > Montreal (0.24)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.77)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.48)

Mateusz Malinowski, Mario Fritz

A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

Neural Information Processing Systems, Oct-3-2025, 02:38:15 GMT

Neural Information Processing Systems http://nips.cc/

multi-world approach, real-world scene, uncertain input

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.40)

arXiv.org Artificial Intelligence, Oct-3-2025

Efficient Whole Slide Pathology VQA via Token Compression

Lyu, Weimin, Hu, Qingqiao, Qi, Kehan, Shi, Zhan, Huang, Wentao, Gupta, Saumya, Chen, Chao

Whole-slide images (WSIs) in pathology can reach up to 10,000 x 10,000 pixels, posing significant challenges for multimodal large language models (MLLMs) due to long context length and high computational demands. Previous methods typically focus on patch-level analysis or slide-level classification using CLIP-based models with multi-instance learning, but they lack the generative capabilities needed for visual question answering (VQA). More recent MLLM-based approaches address VQA by feeding thousands of patch tokens directly into the language model, which leads to excessive resource consumption. To address these limitations, we propose Token Compression Pathology LLaVA (TCP-LLaVA), the first MLLM architecture to perform WSI VQA via token compression. TCP-LLaVA introduces a set of trainable compression tokens that aggregate visual and textual information through a modality compression module, inspired by the [CLS] token mechanism in BERT. Only the compressed tokens are forwarded to the LLM for answer generation, significantly reducing input length and computational cost. Experiments on ten TCGA tumor subtypes show that TCP-LLaVA outperforms existing MLLM baselines in VQA accuracy while reducing training resource consumption by a substantial margin.
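The idea of a small set of compression tokens aggregating a long token sequence can be sketched with single-head cross-attention. This is a generic illustration, not the TCP-LLaVA modality compression module: the dimensions, the random stand-ins for trained parameters, and the function names are all assumptions.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def compress_tokens(compression_tokens: np.ndarray, input_tokens: np.ndarray,
                    w_q: np.ndarray, w_k: np.ndarray, w_v: np.ndarray) -> np.ndarray:
    """Cross-attention: K compression tokens (queries) attend over N input tokens."""
    q = compression_tokens @ w_q               # (K, d)
    k = input_tokens @ w_k                     # (N, d)
    v = input_tokens @ w_v                     # (N, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (K, N)
    attn = softmax(scores, axis=-1)            # each row sums to 1 over the inputs
    return attn @ v                            # (K, d)


# Hypothetical sizes: 4096 patch tokens plus 32 text tokens compressed to 16 tokens.
rng = np.random.default_rng(0)
d, n_patches, n_text, n_compress = 64, 4096, 32, 16
patch_tokens = rng.standard_normal((n_patches, d))
text_tokens = rng.standard_normal((n_text, d))
inputs = np.concatenate([patch_tokens, text_tokens], axis=0)

# Random matrices stand in for parameters that would be learned in training.
comp = rng.standard_normal((n_compress, d))
w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

compressed = compress_tokens(comp, inputs, w_q, w_k, w_v)
print(compressed.shape)  # (16, 64)
```

In this sketch the language model would receive 16 tokens instead of 4128, which is where the reduction in input length comes from.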

large language model, machine learning, question answering, (19 more...)

2507.14497

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.71)

Neural Information Processing Systems, Oct-2-2025, 17:43:26 GMT

RUBi: Reducing Unimodal Biases for Visual Question Answering

Remi Cadene, Corentin Dancette, Hedi Ben younes, Matthieu Cord, Devi Parikh

Neural Information Processing Systems http://nips.cc/

machine learning, natural language, question answering, (20 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.44)

Neural Information Processing Systems, Oct-2-2025, 10:43:04 GMT

Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies

Many recent datasets contain a variety of different data modalities, for instance, image, question, and answer data in visual question answering (VQA).

machine learning, natural language, question answering, (19 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)