datastore
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore
Scaling laws with respect to the amount of training data and the number of parameters allow us to predict the cost-benefit trade-offs of pretraining language models (LMs) in different configurations. In this paper, we consider another dimension of scaling: the amount of data available at inference time. Specifically, we find that increasing the size of the datastore used by a retrieval-based LM monotonically improves language modeling and several downstream tasks without obvious saturation, such that a smaller model augmented with a large datastore outperforms a larger LM-only model on knowledge-intensive tasks. By plotting compute-optimal scaling curves with varied datastore, model, and pretraining data sizes, we show that using larger datastores can significantly improve model performance for the same training compute budget. We carry out our study by constructing a 1.4 trillion-token datastore named MassiveDS, which is the largest and most diverse open-source datastore for retrieval-based LMs to date, and designing an efficient pipeline for studying datastore scaling in an accessible manner. Finally, we analyze the effect of improving the retriever, datastore quality filtering, and other design choices on our observed scaling trends. Overall, our results show that datastore size should be considered an integral part of LM efficiency and performance trade-offs.
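To ground the idea, here is a minimal kNN-LM-style sketch of how a datastore enters next-token prediction: retrieve the nearest stored contexts, turn their distances into weights over the recorded next tokens, and interpolate with the parametric LM. This illustrates the general retrieval-based setup the paper scales, not the MassiveDS pipeline itself; all names, sizes, and the mixing weight lam are stand-ins.

```python
import numpy as np

def knn_lm_next_token(lm_probs, datastore_keys, datastore_next_tokens,
                      query, k=4, temperature=1.0, lam=0.25):
    """Interpolate LM probabilities with a kNN distribution retrieved from
    a (context embedding -> next token) datastore. Illustrative only."""
    # Distance from the query context embedding to every stored key.
    dists = np.linalg.norm(datastore_keys - query, axis=1)
    nearest = np.argsort(dists)[:k]
    # Softmax over negative distances gives retrieval weights.
    w = np.exp(-dists[nearest] / temperature)
    w /= w.sum()
    # Scatter the weights onto the retrieved next tokens.
    knn_probs = np.zeros_like(lm_probs)
    for weight, tok in zip(w, datastore_next_tokens[nearest]):
        knn_probs[tok] += weight
    # Growing the datastore mainly improves the kNN term; lam trades the two off.
    return lam * knn_probs + (1.0 - lam) * lm_probs

# Toy usage: 8-token vocabulary, 100 stored (context, next-token) pairs.
rng = np.random.default_rng(0)
keys = rng.normal(size=(100, 16))
next_tokens = rng.integers(0, 8, size=100)
query = rng.normal(size=16)
print(knn_lm_next_token(np.full(8, 1 / 8), keys, next_tokens, query))
```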
Text-Only Training for Image Captioning with Retrieval Augmentation and Modality Gap Correction
Fonseca, Rui, Martins, Bruno, Rocha, Gil
Image captioning has drawn considerable attention from the natural language processing and computer vision fields. Aiming to reduce the reliance on curated data, several studies have explored image captioning without any human-annotated image-text pairs for training, although existing methods are still outperformed by fully supervised approaches. This paper proposes TOMCap, an improved text-only training method that performs captioning without the need for aligned image-caption pairs. The method is based on prompting a pre-trained language model decoder with information derived from a CLIP representation, after undergoing a process to reduce the modality gap. We specifically test the combined use of retrieved caption examples and latent vector representations to guide the generation process. Through extensive experiments, we show that TOMCap outperforms other training-free and text-only methods. We also analyze the impact of different choices regarding the configuration of the retrieval-augmentation and modality gap reduction components.
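The modality-gap step can be pictured with a correction that is common in text-only captioning work: shift an image embedding by the difference between the text and image centroids before retrieving captions. This sketches the general technique, not TOMCap's exact procedure; the centroids, dimensions, and data are synthetic stand-ins.

```python
import numpy as np

def correct_modality_gap(image_emb, text_mean, image_mean):
    """Shift an image embedding by the gap between modality centroids so it
    lands nearer the region where CLIP text embeddings live. Generic
    technique sketch, not TOMCap's exact correction."""
    shifted = image_emb - image_mean + text_mean
    return shifted / np.linalg.norm(shifted)  # keep unit norm, as in CLIP

def retrieve_captions(query_emb, caption_embs, captions, k=3):
    """Cosine-similarity retrieval over a text-only caption datastore."""
    sims = caption_embs @ query_emb
    return [captions[i] for i in np.argsort(-sims)[:k]]

# Toy usage: random unit vectors standing in for CLIP embeddings (dim 512),
# with image embeddings offset to simulate the modality gap.
rng = np.random.default_rng(1)
caption_embs = rng.normal(size=(50, 512))
caption_embs /= np.linalg.norm(caption_embs, axis=1, keepdims=True)
img_embs = rng.normal(loc=0.3, size=(5, 512))
img_embs /= np.linalg.norm(img_embs, axis=1, keepdims=True)
captions = [f"caption {i}" for i in range(50)]
query = correct_modality_gap(img_embs[0], caption_embs.mean(0), img_embs.mean(0))
print(retrieve_captions(query, caption_embs, captions))
```

The retrieved captions would then be placed in the decoder prompt alongside the corrected representation.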
READER: Retrieval-Assisted Drafter for Efficient LLM Inference
Divilkovskiy, Maxim, Malygin, Vitaly, Zlobin, Sergey, Ilyushin, Stanislav, Isali, Sultan, Kalugin, Vasily, Aitassova, Nuriza, Yi, Fei, Zeng, Weidi
Autoregressive Language Models instantiate a factorized likelihood over token sequences, yet their strictly sequential decoding process imposes an intrinsic lower bound on inference latency. This bottleneck has emerged as a central obstacle to the scalable deployment of large-scale generative models. Existing acceleration techniques partially mitigate token-level latency by relying on auxiliary draft models or introducing an additional training phase, but fail to address the dominant memory and communication costs. We present READER, a provably lossless speculative decoding framework that bypasses training an auxiliary draft model. READER formalizes speculative decoding as a stochastic tree construction problem and exploits the empirical redundancy structure of natural language to generate high-probability candidate continuations. Our method revisits the problem of constructing draft trees, establishing substantial statistical improvements over stochastic draft-tree methods and providing a complexity-theoretic analysis that characterizes the optimality frontier of speculative decoding under bounded computation and memory resources. Beyond the single-sequence regime traditionally considered in prior work, we introduce a memory-optimal key-value cache-serving strategy that guarantees amortized sublinear overhead in the batch dimension, allowing READER to scale to realistic inference workloads. Comprehensive experiments demonstrate up to 6.13x wall-clock speedup on single-prompt inference and up to 5.92x on batched inference, consistently surpassing prior speculative decoding baselines while preserving exact output equivalence, with even more pronounced gains in retrieval-augmented generation pipelines. Our results close a key gap between theoretical parallelism limits and practical LLM inference, suggesting a new standard for efficient deployment.
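The losslessness claim rests on the standard speculative-decoding accept/reject rule, sketched below for a single drafted sequence. READER's contributions are the draft-tree construction and batched KV-cache serving, which this toy deliberately omits; all distributions here are random stand-ins.

```python
import numpy as np

def verify_draft(draft_tokens, q_probs, p_probs, rng):
    """Lossless speculative-decoding verification for one drafted sequence.
    q_probs[i] / p_probs[i]: draft and target distributions at step i.
    Accepting token t with probability min(1, p(t)/q(t)) and resampling
    rejections from max(p - q, 0) leaves the target distribution intact."""
    out = []
    for i, tok in enumerate(draft_tokens):
        if rng.random() < min(1.0, p_probs[i][tok] / q_probs[i][tok]):
            out.append(tok)  # draft token accepted
        else:
            residual = np.maximum(p_probs[i] - q_probs[i], 0)
            residual /= residual.sum()
            out.append(int(rng.choice(len(residual), p=residual)))
            return out  # everything after a rejection is discarded
    return out  # full accept; a bonus token from p would follow in practice

# Toy usage: vocabulary of 10, three drafted tokens sampled from q.
rng = np.random.default_rng(0)
q = rng.dirichlet(np.ones(10), size=3)
p = rng.dirichlet(np.ones(10), size=3)
draft = [int(rng.choice(10, p=qi)) for qi in q]
print(verify_draft(draft, q, p, rng))
```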
Rethinking Reasoning in LLMs: Neuro-Symbolic Local RetoMaton Beyond ICL and CoT
Mamidala, Rushitha Santhoshi, Chhabra, Anshuman, Mali, Ankur
Prompt-based reasoning strategies such as Chain-of-Thought (CoT) and In-Context Learning (ICL) have become widely used for eliciting reasoning capabilities in large language models (LLMs). However, these methods rely on fragile, implicit mechanisms, often yielding inconsistent outputs across seeds, formats, or minor prompt variations, making them fundamentally unreliable for tasks requiring stable, interpretable reasoning. In contrast, automata-based neuro-symbolic frameworks like RetoMaton offer a more structured and trustworthy alternative by grounding retrieval in symbolic memory with deterministic transitions. In this work, we extend RetoMaton by replacing its global datastore with a local, task-adaptive Weighted Finite Automaton (WFA), constructed directly from external domain corpora. This local automaton structure promotes robust, context-aware retrieval while preserving symbolic traceability and low inference overhead. Unlike prompting, which entangles context and memory in opaque ways, our approach leverages the explicit structure of WFAs to provide verifiable and modular retrieval behavior, making it better suited for domain transfer and interoperability. We evaluate this local RetoMaton variant on two pretrained LLMs, LLaMA-3.2-1B and Gemma-3-1B-PT, across three reasoning tasks: TriviaQA (reading comprehension), GSM8K (multi-step math), and MMLU (domain knowledge). Compared to the base model and prompting-based methods, augmenting these setups with local RetoMaton consistently improves performance while enabling transparent and reproducible retrieval dynamics. Our results highlight a promising shift toward trustworthy, symbolic reasoning in modern LLMs via lightweight, automaton-guided memory.
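As a rough picture of the local-datastore idea, the sketch below builds a small weighted automaton from a domain corpus, with n-gram contexts as states and count-normalized transitions, so retrieval becomes a deterministic, inspectable walk. This conveys the flavor of a local WFA, not the paper's exact construction.

```python
from collections import defaultdict

def build_local_wfa(corpus_tokens, order=2):
    """Build a tiny weighted finite automaton from a domain corpus: states
    are `order`-token contexts, and transition weights are normalized
    successor counts. Illustrative, not the paper's construction."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(corpus_tokens) - order):
        state = tuple(corpus_tokens[i:i + order])
        counts[state][corpus_tokens[i + order]] += 1
    wfa = {}
    for state, successors in counts.items():
        total = sum(successors.values())
        # Each edge carries an explicit, traceable weight.
        wfa[state] = {tok: c / total for tok, c in successors.items()}
    return wfa

def retrieve(wfa, context, order=2):
    """Follow the automaton from the current context; an empty result means
    the local datastore has no evidence and the base LM decides alone."""
    return wfa.get(tuple(context[-order:]), {})

corpus = "the cat sat on the mat the cat ran on the mat".split()
wfa = build_local_wfa(corpus)
print(retrieve(wfa, ["the", "cat"]))  # {'sat': 0.5, 'ran': 0.5}
```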
CONCAP: Seeing Beyond English with Concepts Retrieval-Augmented Captioning
Ibrahim, George, Ramos, Rita, Kementchedjhieva, Yova
Multilingual vision-language models have made significant strides in image captioning, yet they still lag behind their English counterparts due to limited multilingual training data and costly large-scale model parameterization. Retrieval-augmented generation (RAG) offers a promising alternative by conditioning caption generation on retrieved examples in the target language, reducing the need for extensive multilingual training. However, multilingual RAG captioning models often depend on retrieved captions translated from English, which can introduce mismatches and linguistic biases relative to the source language. We introduce CONCAP, a multilingual image captioning model that integrates retrieved captions with image-specific concepts, enhancing the contextualization of the input image and grounding the captioning process across different languages. Experiments on the XM3600 dataset indicate that CONCAP enables strong performance on low- and mid-resource languages, with greatly reduced data requirements. Our findings highlight the effectiveness of concept-aware retrieval augmentation in bridging multilingual performance gaps.
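Concretely, the described inputs suggest a prompt that combines detected image concepts with captions retrieved in the target language; the template below is a plausible illustration, not CONCAP's actual format.

```python
def build_concap_prompt(concepts, retrieved_captions, language):
    """Assemble a captioning prompt that grounds generation in both
    image-level concepts and target-language retrieved captions.
    Hypothetical template for illustration only."""
    concept_str = ", ".join(concepts)
    examples = "\n".join(f"- {c}" for c in retrieved_captions)
    return (
        f"Image concepts: {concept_str}\n"
        f"Similar captions ({language}):\n{examples}\n"
        f"Caption in {language}:"
    )

print(build_concap_prompt(
    concepts=["dog", "beach", "ball"],
    retrieved_captions=["Un perro juega en la playa."],
    language="Spanish",
))
```

Retrieving captions natively in the target language, rather than translating English ones, is what avoids the translation mismatches the abstract describes.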
TokenShapley: Token Level Context Attribution with Shapley Value
Xiao, Yingtai, Zhu, Yuqing, Samyoun, Sirat, Zhang, Wanrong, Wang, Jiachen T., Du, Jian
Large language models (LLMs) demonstrate strong capabilities in in-context learning, but verifying the correctness of their generated responses remains a challenge. Prior work has explored attribution at the sentence level, but these methods fall short when users seek attribution for specific keywords within the response, such as numbers, years, or names. To address this limitation, we propose TokenShapley, a novel token-level attribution method that combines Shapley value-based data attribution with KNN-based retrieval techniques inspired by recent advances in KNN-augmented LLMs. By leveraging a precomputed datastore for contextual retrieval and computing Shapley values to quantify token importance, TokenShapley provides a fine-grained data attribution approach. Extensive evaluations on four benchmarks show that TokenShapley outperforms state-of-the-art baselines in token-level attribution, achieving an 11-23% improvement in accuracy.
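For intuition, the sketch below estimates token-level Shapley values with the standard permutation Monte Carlo estimator, where value_fn scores how well a token subset supports the generated keyword. This is the generic estimator, not TokenShapley's KNN-accelerated variant, and value_fn here is a toy stand-in.

```python
import numpy as np

def monte_carlo_token_shapley(tokens, value_fn, n_samples=200, seed=0):
    """Estimate per-token Shapley values by sampling random permutations
    and accumulating each token's marginal contribution to value_fn
    (which maps a set of token indices to a score). Generic estimator."""
    rng = np.random.default_rng(seed)
    n = len(tokens)
    shapley = np.zeros(n)
    for _ in range(n_samples):
        perm = rng.permutation(n)
        included = set()
        prev = value_fn(included)
        for idx in perm:
            included.add(idx)
            cur = value_fn(included)
            shapley[idx] += cur - prev  # marginal contribution of token idx
            prev = cur
    return shapley / n_samples

# Toy value function: "correctness" is driven entirely by tokens 1 and 3,
# so the exact Shapley values are [0, 0.6, 0, 0.4].
toy_value = lambda s: 0.6 * (1 in s) + 0.4 * (3 in s)
print(monte_carlo_token_shapley(["The", "2021", "report", "Paris"], toy_value))
```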