AITopics

Country:

Oceania > Australia > New South Wales (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > India > NCT > New Delhi (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.94)
Health & Medicine > Therapeutic Area (0.69)
Information Technology > Services (0.46)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Neural Information Processing SystemsFeb-17-2026, 23:40:22 GMT

Supplementary Material 1 Decoding using automatic differentiation inference ADVI

In the method section of our paper, we describe the general encoding-decoding paradigm. We provide a brief overview of our data preprocessing pipeline, which involves the following steps. We employ the method of Boussard et al. (2021) to estimate the location of Decentralized registration (Windolf et al., 2022) is applied to track and correct Figure 6: Motion drift in "good" and "bad" sorting recordings. "bad" sorting example, which is still affected by drift even after registration. To decode binary behaviors, such as the mouse's left or right choices, we utilize In this section, we provide visualizations to gain insights into the effectiveness of our proposed decoder.

artificial intelligence, decoder, machine learning, (19 more...)

Genre: Overview (0.34)

Industry: Health & Medicine > Therapeutic Area (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsDec-27-2025, 03:03:01 GMT

Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory

Working memory (WM), a fundamental cognitive process facilitating the temporary storage, integration, manipulation, and retrieval of information, plays a vital role in reasoning and decision-making tasks. Robust benchmark datasets that capture the multifaceted nature of WM are crucial for the effective development and evaluation of AI WM models. Here, we introduce a comprehensive Working Memory (WorM) benchmark dataset for this purpose. WorM comprises 10 tasks and a total of 1 million trials, assessing 4 functionalities, 3 domains, and 11 behavioral and neural characteristics of WM. We jointly trained and tested state-of-the-art recurrent neural networks and transformers on all these tasks. We also include human behavioral benchmarks as an upper bound for comparison. Our results suggest that AI models replicate some characteristics of WM in the brain, most notably primacy and recency effects, and neural clusters and correlates specialized for different domains and functionalities of WM. In the experiments, we also reveal some limitations in existing models to approximate human behavior. This dataset serves as a valuable resource for communities in cognitive psychology, neuroscience, and AI, offering a standardized framework to compare and enhance WM models, investigate WM's neural underpinnings, and develop WM models with human-like capabilities.

benchmarking human and ais, name change, working memory, (9 more...)

Genre: Research Report > New Finding (0.59)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)

Neural Information Processing SystemsDec-24-2025, 22:33:41 GMT

Learning to Decode: Reinforcement Learning for Decoding of Sparse Graph-Based Channel Codes

We show in this work that reinforcement learning can be successfully applied to decoding short to moderate length sparse graph-based channel codes. Specifically, we focus on low-density parity check (LDPC) codes, which for example have been standardized in the context of 5G cellular communication systems due to their excellent error correcting performance. These codes are typically decoded via belief propagation iterative decoding on the corresponding bipartite (Tanner) graph of the code via flooding, i.e., all check and variable nodes in the Tanner graph are updated at once. In contrast, in this paper we utilize a sequential update policy which selects the optimum check node (CN) scheduling in order to improve decoding performance. In particular, we model the CN update process as a multi-armed bandit process with dependent arms and employ a Q-learning scheme for optimizing the CN scheduling policy. In order to reduce the learning complexity, we propose a novel graph-induced CN clustering approach to partition the state space in such a way that dependencies between clusters are minimized. Our results show that compared to other decoding approaches from the literature, the proposed reinforcement learning scheme not only significantly improves the decoding performance, but also reduces the decoding complexity dramatically once the scheduling policy is learned.

learning, reinforcement learning, sparse graph-based channel code, (6 more...)

Genre: Research Report > New Finding (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial IntelligenceDec-1-2025

Cacheback: Speculative Decoding With Nothing But Cache

Ma, Zhiyao, Gim, In, Zhong, Lin

We present Cacheback Decoding, a training-free and model-agnostic speculative decoding method that exploits the locality in language to accelerate Large Language Model (LLM) inference. Cacheback leverages only Least Recently Used (LRU) cache tables of token n-grams to generate draft sequences. Cacheback achieves state-of-the-art performance among comparable methods despite its minimalist design, and its simplicity allows easy integration into existing systems. Cacheback also shows potential for fast adaptation to new domains.

artificial intelligence, large language model, natural language, (19 more...)

doi: 10.18653/v1/2025.emnlp-main.1581

2511.21699

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.60)

arXiv.org Artificial IntelligenceNov-26-2025

Scaling LLM Speculative Decoding: Non-Autoregressive Forecasting in Large-Batch Scenarios

Shi, Luohe, Li, Zuchao, Zhang, Lefei, Qi, Baoyuan, Liu, Guoming, Zhao, Hai

Speculative decoding accelerates LLM inference by utilizing otherwise idle computational resources during memory-to-chip data transfer. Current speculative decoding methods typically assume a considerable amount of available computing power, then generate a complex and massive draft tree using a small autoregressive language model to improve overall prediction accuracy. However, methods like batching have been widely applied in mainstream model inference systems as a superior alternative to speculative decoding, as they compress the available idle computing power. Therefore, performing speculative decoding with low verification resources and low scheduling costs has become an important research problem. We believe that more capable models that allow for parallel generation on draft sequences are what we truly need. Recognizing the fundamental nature of draft models to only generate sequences of limited length, we propose SpecFormer, a novel architecture that integrates unidirectional and bidirectional attention mechanisms. SpecFormer combines the au-toregressive model's ability to extract information from the entire input sequence with the parallel generation benefits of non-autoregressive models. This design eliminates the reliance on large prefix trees and achieves consistent acceleration, even in large-batch scenarios. Through lossless speculative decoding experiments across models of various scales, we demonstrate that SpecFormer sets a new standard for scaling LLM inference with lower training demands and reduced computational costs.

information, large language model, machine learning, (19 more...)

2511.2034

Country:

North America > United States (0.68)
Asia > China (0.46)
Europe > Austria > Vienna (0.15)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Neural Information Processing SystemsNov-21-2025, 14:41:38 GMT

Decoding with Value Networks for Neural Machine Translation

Neural Machine Translation (NMT) has become a popular technology in recent years, and beam search is its de facto decoding method due to the shrunk search space and reduced computational complexity. However, since it only searches for local optima at each time step through one-step forward looking, it usually cannot output the best target sentence. Inspired by the success and methodology of AlphaGo, in this paper we propose using a prediction network to improve beam search, which takes the source sentence $x$, the currently available decoding output $y_1,\cdots, y_{t-1}$ and a candidate word $w$ at step $t$ as inputs and predicts the long-term value (e.g., BLEU score) of the partial target sentence if it is completed by the NMT model. Following the practice in reinforcement learning, we call this prediction network \emph{value network}. Specifically, we propose a recurrent structure for the value network, and train its parameters from bilingual data. During the test time, when choosing a word $w$ for decoding, we consider both its conditional probability given by the NMT model and its long-term value predicted by the value network. Experiments show that such an approach can significantly improve the translation accuracy on several translation tasks.

name change, neural machine translation, value network, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Garipov, Roman, Velikonivtsev, Fedor, Ermakov, Ivan, Svirschevski, Ruslan, Egiazarian, Vage, Ryabinin, Max

AutoJudge: Judge Decoding Without Manual Annotation

arXiv.org Artificial IntelligenceNov-21-2025

We introduce AutoJudge, a method that accelerates large language model (LLM) inference with task-specific lossy speculative decoding. Instead of matching the original model output distribution token-by-token, we identify which of the generated tokens affect the downstream quality of the response, relaxing the distribution match guarantee so that the "unimportant" tokens can be generated faster. Our approach relies on a semi-greedy search algorithm to test which of the mismatches between target and draft models should be corrected to preserve quality and which ones may be skipped. We then train a lightweight classifier based on existing LLM embeddings to predict, at inference time, which mismatching tokens can be safely accepted without compromising the final answer quality. We evaluate the effectiveness of AutoJudge with multiple draft/target model pairs on mathematical reasoning and programming benchmarks, achieving significant speedups at the cost of a minor accuracy reduction. Notably, on GSM8k with the Llama 3.1 70B target model, our approach achieves up to $\approx2\times$ speedup over speculative decoding at the cost of $\le 1\%$ drop in accuracy. When applied to the LiveCodeBench benchmark, AutoJudge automatically detects programming-specific important tokens, accepting $\ge 25$ tokens per speculation cycle at $2\%$ drop in Pass@1. Our approach requires no human annotation and is easy to integrate with modern LLM inference frameworks.

large language model, machine learning, natural language, (22 more...)

2504.20039

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Jeong, Woojae, Cui, Wenhui, Avramidis, Kleanthis, Medani, Takfarinas, Narayanan, Shrikanth, Leahy, Richard

Informed Bootstrap Augmentation Improves EEG Decoding

arXiv.org Artificial IntelligenceNov-18-2025

Electroencephalography (EEG) offers detailed access to neural dynamics but remains constrained by noise and trial-by-trial variability, limiting decoding performance in data-restricted or complex paradigms. Data augmentation is often employed to enhance feature representations, yet conventional uniform averaging overlooks differences in trial informativeness and can degrade representational quality. We introduce a weighted bootstrapping approach that prioritizes more reliable trials to generate higher-quality augmented samples. In a Sentence Evaluation paradigm, weights were computed from relative ERP differences and applied during probabilistic sampling and averaging. Across conditions, weighted bootstrapping improved decoding accuracy relative to unweighted (from 68.35% to 71.25% at best), demonstrating that emphasizing reliable trials strengthens representational quality. The results demonstrate that reliability-based augmentation yields more robust and discriminative EEG representations. The code is publicly available at https://github.com/lyricists/NeuroBootstrap.

accuracy, artificial intelligence, machine learning, (18 more...)

2511.12073

Country: North America > United States > California (0.15)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.95)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

arXiv.org Artificial IntelligenceNov-14-2025

Language Drift in Multilingual Retrieval-Augmented Generation: Characterization and Decoding-Time Mitigation

Li, Bo, Xu, Zhenghua, Xie, Rui

Multilingual Retrieval-Augmented Generation (RAG) enables large language models (LLMs) to perform knowledge-intensive tasks in multilingual settings by leveraging retrieved documents as external evidence. However, when the retrieved evidence differs in language from the user query and in-context exemplars, the model often exhibits language drift by generating responses in an unintended language. This phenomenon is especially pronounced during reasoning-intensive decoding, such as Chain-of-Thought (CoT) generation, where intermediate steps introduce further language instability. In this paper, we systematically study output language drift in multilingual RAG across multiple datasets, languages, and LLM backbones. Our controlled experiments reveal that the drift results not from comprehension failure but from decoder-level collapse, where dominant token distributions and high-frequency English patterns dominate the intended generation language. We further observe that English serves as a semantic attractor under cross-lingual conditions, emerging as both the strongest interference source and the most frequent fallback language. To mitigate this, we propose Soft Constrained Decoding (SCD), a lightweight, training-free decoding strategy that gently steers generation toward the target language by penalizing non-target-language tokens. SCD is model-agnostic and can be applied to any generation algorithm without modifying the architecture or requiring additional data. Experiments across three multilingual datasets and multiple typologically diverse languages show that SCD consistently improves language alignment and task performance, providing an effective and generalizable solution in multilingual RAG.

large language model, machine learning, natural language, (21 more...)

2511.09984

Country: Asia > China (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)