Variance Reduction Based Experience Replay for Policy Optimization

Zheng, Hua, Xie, Wei, Feng, M. Ben, Choy, Keilung

arXiv.org Machine Learning

Effective reinforcement learning (RL) for complex stochastic systems requires leveraging historical data collected in previous iterations to accelerate policy optimization. Classical experience replay treats all past observations uniformly and fails to account for their varying contributions to learning. To overcome this limitation, we propose Variance Reduction Experience Replay (VRER), a principled framework that selectively reuses informative samples to reduce variance in policy gradient estimation. VRER is algorithm-agnostic and integrates seamlessly with existing policy optimization methods, forming the basis of our sample-efficient off-policy algorithm, Policy Gradient with VRER (PG-VRER). Motivated by the lack of rigorous theoretical analysis of experience replay, we develop a novel framework that explicitly captures dependencies introduced by Markovian dynamics and behavior-policy interactions. Using this framework, we establish finite-time convergence guarantees for PG-VRER and reveal a fundamental bias-variance trade-off: reusing older experience increases bias but simultaneously reduces gradient variance. Extensive experiments demonstrate that VRER consistently accelerates policy learning and improves performance over state-of-the-art policy optimization algorithms.
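
The abstract does not spell out the selection rule, but a minimal sketch of the general idea is below: reuse a historical batch only when the importance weights (likelihood ratios between the current and behavior policies) are low-variance, then form an importance-weighted policy-gradient surrogate from the reused data. The `policy.log_prob` interface and the variance threshold are assumptions for illustration, not the paper's exact PG-VRER algorithm.

```python
# Hedged sketch of variance-based selective replay (interfaces are assumed).
import torch

def select_replay_batches(batches, policy, var_threshold=1.0):
    """Keep only historical batches whose importance weights are low-variance,
    so reusing them reduces (rather than inflates) gradient variance."""
    selected = []
    for states, actions, old_logp, returns in batches:
        cur_logp = policy.log_prob(states, actions)   # assumed interface
        w = torch.exp(cur_logp.detach() - old_logp)   # likelihood ratios
        if w.var() <= var_threshold:                  # variance-based selection rule
            selected.append((states, actions, old_logp, returns))
    return selected

def replay_pg_loss(selected, policy):
    """Importance-weighted policy-gradient surrogate averaged over reused batches."""
    losses = []
    for states, actions, old_logp, returns in selected:
        cur_logp = policy.log_prob(states, actions)
        w = torch.exp(cur_logp - old_logp).detach()   # stop-gradient weights
        losses.append(-(w * cur_logp * returns).mean())
    return torch.stack(losses).mean()
```

Reusing only low-variance batches is what drives the bias-variance trade-off described above: a looser threshold admits older data (more bias, less variance), a tighter one behaves closer to on-policy learning.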


Same model, better performance: the impact of shuffling on DNA Language Models benchmarking

Greco, Davide, Rawlik, Konrad

arXiv.org Artificial Intelligence

Large Language Models are increasingly popular in genomics due to their potential to decode complex biological sequences. Hence, researchers require a standardized benchmark to evaluate the capabilities of DNA Language Models (DNA LMs). However, evaluating DNA LMs is a complex task that sits at the intersection of genomics' domain-specific challenges and machine learning methodology, where seemingly minor implementation details can significantly compromise benchmark validity. We demonstrate this through BEND (Benchmarking DNA Language Models), where hardware-dependent hyperparameters -- the number of data-loading workers and buffer sizes -- create spurious performance variations of up to 4% for identical models. The problem stems from inadequate data shuffling interacting with domain-specific data characteristics. Experiments with three DNA language models (HyenaDNA, DNABERT-2, ResNet-LM) show these artifacts affect both absolute performance and relative model rankings. We propose a simple solution: pre-shuffling data before storage eliminates hardware dependencies while maintaining efficiency. This work highlights how standard ML practices can interact unexpectedly with domain-specific data characteristics, with broader implications for benchmark design in specialized domains.
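
A minimal sketch of the proposed fix follows: perform one global, seeded shuffle of the dataset before serializing it, so that downstream loaders see a fixed order regardless of worker count or shuffle-buffer size. The file path and record format are placeholders.

```python
# Hedged sketch of "pre-shuffle before storage" (paths/format are hypothetical).
import random

def preshuffle_to_disk(records, out_path, seed=0):
    """Write records in a fixed, globally shuffled order so that runtime
    shuffle buffers and data-loading workers can no longer change the order."""
    rng = random.Random(seed)           # fixed seed -> reproducible benchmark
    order = list(range(len(records)))
    rng.shuffle(order)                  # one global shuffle, done offline
    with open(out_path, "w") as f:
        for i in order:
            f.write(records[i] + "\n")
```

The contrast with a streaming shuffle buffer is the point: a buffer of size k only permutes within a sliding window, so its effective randomness depends on k and on how workers interleave reads; a one-time global shuffle removes both dependencies.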


Asymptotic analysis of shallow and deep forgetting in replay with Neural Collapse

Lanzillotta, Giulia, Meier, Damiano, Hofmann, Thomas

arXiv.org Artificial Intelligence

A persistent paradox in continual learning (CL) is that neural networks often retain linearly separable representations of past tasks even when their output predictions fail. We formalize this distinction as the gap between deep feature-space and shallow classifier-level forgetting. We reveal a critical asymmetry in Experience Replay: while minimal buffers successfully anchor feature geometry and prevent deep forgetting, mitigating shallow forgetting typically requires substantially larger buffer capacities. To explain this, we extend the Neural Collapse framework to the sequential setting. We characterize deep forgetting as a geometric drift toward out-of-distribution subspaces and prove that any non-zero replay fraction asymptotically guarantees the retention of linear separability. Conversely, we identify that the "strong collapse" induced by small buffers leads to rank-deficient covariances and inflated class means, effectively blinding the classifier to true population boundaries. By unifying CL with out-of-distribution detection, our work challenges the prevailing reliance on large buffers, suggesting that explicitly correcting these statistical artifacts could unlock robust performance with minimal replay.
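
To make the diagnosed artifact concrete, here is a hedged diagnostic sketch (not from the paper's code): with a very small buffer, per-class feature covariances become rank-deficient and class means are estimated from only a handful of points, which is the "strong collapse" regime described above.

```python
# Minimal diagnostic: how small-buffer statistics distort what a linear
# classifier sees (feature shapes are assumptions for illustration).
import numpy as np

def buffer_class_stats(features: np.ndarray, labels: np.ndarray):
    """For each class in the buffer, return (mean, covariance rank, count).
    With few samples per class, rank << feature dim and the mean is a noisy,
    inflated estimate of the true population mean."""
    stats = {}
    for c in np.unique(labels):
        x = features[labels == c]                 # buffer samples of class c
        mean = x.mean(axis=0)
        cov = (np.cov(x, rowvar=False) if len(x) > 1
               else np.zeros((x.shape[1], x.shape[1])))
        stats[int(c)] = (mean, np.linalg.matrix_rank(cov), len(x))
    return stats
```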


Forget Less, Retain More: A Lightweight Regularizer for Rehearsal-Based Continual Learning

Alssum, Lama, Hammoud, Hasan Abed Al Kader, Alfarra, Motasem, Alcazar, Juan C Leon, Ghanem, Bernard

arXiv.org Artificial Intelligence

Deep neural networks suffer from catastrophic forgetting, where performance on previous tasks degrades after training on a new task. This issue arises from the model's tendency to overwrite previously acquired knowledge with new information. We present a novel approach that addresses this challenge at the intersection of memory-based and regularization methods. We formulate a regularization strategy, termed the Information Maximization (IM) regularizer, for memory-based continual learning methods, which is based exclusively on the expected label distribution, thus making it class-agnostic. As a consequence, the IM regularizer can be directly integrated into various rehearsal-based continual learning methods, reducing forgetting and favoring faster convergence. Our empirical validation shows that, across datasets and regardless of the number of tasks, our proposed regularization strategy consistently improves baseline performance at the cost of only minimal computational overhead. The lightweight nature of IM ensures that it remains a practical and scalable solution, making it applicable to real-world continual learning scenarios where efficiency is paramount. Finally, we demonstrate the data-agnostic nature of our regularizer by applying it to video data, which presents additional challenges due to its temporal structure and higher memory requirements. Despite the significant domain gap, our experiments show that the IM regularizer also improves the performance of video continual learning methods.
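
The abstract only states that the regularizer depends on the expected label distribution; one plausible instantiation, sketched below under that assumption, penalizes the negative entropy of the batch-mean prediction. This is an illustrative reading, not necessarily the paper's exact formula.

```python
# Hedged sketch: one possible "information maximization" term on the
# expected label distribution (the paper's precise objective may differ).
import torch
import torch.nn.functional as F

def im_regularizer(logits: torch.Tensor) -> torch.Tensor:
    """Negative entropy of the batch-mean prediction. Minimizing it pushes
    the *expected* label distribution toward uniform, without ever using
    class identities -- hence class-agnostic."""
    p_mean = F.softmax(logits, dim=1).mean(dim=0)    # expected label distribution
    return (p_mean * torch.log(p_mean.clamp_min(1e-8))).sum()

# usage inside any rehearsal-based step (lambda_im is a tunable weight):
# loss = task_loss + replay_loss + lambda_im * im_regularizer(logits)
```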


SuRe: Surprise-Driven Prioritised Replay for Continual LLM Learning

Hazard, Hugo, Fountas, Zafeirios, Benfeghoul, Martin A., Oomerjee, Adnan, Wang, Jun, Bou-Ammar, Haitham

arXiv.org Artificial Intelligence

Continual learning, the ability to adapt to a sequence of tasks without forgetting previously acquired knowledge, remains a major challenge in machine learning and a key gap between artificial and human intelligence. While regularisation and replay perform well in vision, they lag behind multi-task learning for large language models (LLMs), especially at scale with many tasks. We revisit replay and argue that two failure modes drive this gap: selection (what to rehearse) and integration (how to consolidate new knowledge). To address selection, we propose Surprise-prioritised Replay (SuRe), a simple, architecture-agnostic rule that ranks and stores the most surprising (high negative log-likelihood) sequences. SuRe achieves state-of-the-art performance in the Large Number of Tasks (LNT) setting and delivers the best overall average across both Standard CL and LNT benchmarks. To address integration, we add a dual-learner design with fast and slow LoRA adapters merged via an exponential moving average (EMA), enabling rapid adaptation while stabilising long-term knowledge. Combining SuRe with the dual learner yields further gains, including improvements of up to +5 accuracy points on LNT over prior SOTA. Ablation studies confirm that our proposed method remains robust under reduced replay frequency and small buffer size, demonstrating both effectiveness and sample efficiency. Taken together, our results establish replay as a strong baseline for continual LLM fine-tuning and demonstrate that surprise-based selection and slow-weight consolidation are complementary components for mitigating catastrophic forgetting.
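
Both components admit compact sketches. Below, under stated assumptions (the buffer eviction policy and EMA rate are illustrative; the paper applies the EMA to LoRA adapter weights), a buffer keeps the highest-NLL sequences and a slow learner absorbs the fast one via an exponential moving average.

```python
# Hedged sketch of surprise-prioritised storage + slow-weight consolidation.
import heapq
import itertools
import torch

class SurpriseBuffer:
    """Retain the `capacity` most surprising sequences, ranked by NLL."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self._heap = []                    # min-heap on NLL: evict least surprising
        self._tie = itertools.count()      # tiebreaker so sequences never compare

    def add(self, nll: float, sequence):
        item = (nll, next(self._tie), sequence)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, item)
        elif nll > self._heap[0][0]:       # more surprising than the current minimum
            heapq.heapreplace(self._heap, item)

    def contents(self):
        return [seq for _, _, seq in self._heap]

@torch.no_grad()
def ema_consolidate(slow_params, fast_params, decay=0.999):
    """Merge fast learner into slow one: slow <- decay*slow + (1-decay)*fast."""
    for s, f in zip(slow_params, fast_params):
        s.mul_(decay).add_(f, alpha=1 - decay)
```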


A GPU-Accelerated RAG-Based Telegram Assistant for Supporting Parallel Processing Students

Tel-Zur, Guy

arXiv.org Artificial Intelligence

This project addresses a critical pedagogical need: offering students continuous, on-demand academic assistance beyond conventional reception hours. I present a domain-specific Retrieval-Augmented Generation (RAG) system powered by a quantized Mistral-7B Instruct model and deployed as a Telegram bot. The assistant enhances learning by delivering real-time, personalized responses aligned with the "Introduction to Parallel Processing" course materials. GPU acceleration significantly improves inference latency, enabling practical deployment on consumer hardware. This approach demonstrates how consumer GPUs can enable affordable, private, and effective AI tutoring for HPC education.
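
A hedged sketch of the pipeline's shape is below: a quantized instruct model served via llama-cpp-python, a retrieval step over pre-split course notes, and a Telegram handler that relays answers. The model file, bot token, note chunks, and the toy lexical retriever are all placeholders; the paper's actual retrieval stack may differ.

```python
# Hedged sketch of a local RAG Telegram bot (paths/tokens are placeholders).
from llama_cpp import Llama
from telegram.ext import Application, MessageHandler, filters

llm = Llama(model_path="mistral-7b-instruct.Q4_K_M.gguf",
            n_gpu_layers=-1)                         # quantized, fully GPU-offloaded
chunks = ["...course note chunk 1...", "...chunk 2..."]  # pre-split materials

def retrieve(query: str, k: int = 3):
    """Toy lexical retriever: rank note chunks by word overlap with the query."""
    q = set(query.lower().split())
    return sorted(chunks, key=lambda c: -len(q & set(c.lower().split())))[:k]

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = f"[INST] Use the course notes below to answer.\n{context}\n\nQ: {query} [/INST]"
    return llm(prompt, max_tokens=256)["choices"][0]["text"]

async def on_message(update, context):
    await update.message.reply_text(answer(update.message.text))

app = Application.builder().token("TELEGRAM_BOT_TOKEN").build()
app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, on_message))
app.run_polling()
```

Running everything locally on a consumer GPU is what makes the deployment private and affordable: no student query ever leaves the host.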


Uncertainty Guided Online Ensemble for Non-stationary Data Streams in Fusion Science

Rajput, Kishansingh, Schram, Malachi, Sammuli, Brian, Lin, Sen

arXiv.org Artificial Intelligence

Machine Learning (ML) is poised to play a pivotal role in the development and operation of next-generation fusion devices. Fusion data exhibits non-stationary behavior with distribution drifts caused by both experimental evolution and machine wear-and-tear. ML models assume a stationary distribution and fail to maintain performance when faced with such non-stationary data streams. Online learning techniques have been leveraged in other domains; however, they remain largely unexplored for fusion applications. In this paper, we present an application of online learning that continuously adapts to a drifting data stream for the prediction of Toroidal Field (TF) coil deflection at the DIII-D fusion facility. The results demonstrate that online learning is critical to maintaining ML model performance, reducing error by 80% compared to a static model. However, traditional online learning can suffer from short-term performance degradation because ground truth is not available before predictions are made. We therefore propose an uncertainty-guided online ensemble method to further improve performance. The Deep Gaussian Process Approximation (DGPA) technique is leveraged for calibrated uncertainty estimation, and the uncertainty values are then used to guide a meta-algorithm that produces predictions from an ensemble of learners trained on different horizons of historical data. The DGPA also provides uncertainty estimates alongside the predictions for decision makers. The online ensemble and the proposed uncertainty-guided online ensemble reduce prediction error by about 6% and 10%, respectively, over standard single-model online learning.
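
A minimal sketch of the uncertainty-guided combination step follows. The inverse-variance weighting rule and the combined-uncertainty formula are assumptions for illustration; the paper obtains its calibrated per-member uncertainties from DGPA.

```python
# Hedged sketch: combine ensemble members by their predictive uncertainty.
import numpy as np

def uncertainty_weighted_prediction(preds, sigmas):
    """Inverse-variance weighting: members that are confident on the current
    (possibly drifted) input dominate the combined output.
    preds, sigmas: arrays of shape (n_members,) for one input."""
    preds, sigmas = np.asarray(preds), np.asarray(sigmas)
    w = 1.0 / np.clip(sigmas, 1e-8, None) ** 2
    w /= w.sum()
    mean = float(w @ preds)
    # a simple combined uncertainty for decision makers (assumption):
    var = float(w @ (sigmas**2 + (preds - mean) ** 2))
    return mean, var**0.5

# Each member would be retrained online on a different horizon of recent data
# (e.g., short, medium, and long history windows), then combined as above.
```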



Efficient AllReduce with Stragglers

Devraj, Arjun, Ding, Eric, Kumar, Abhishek Vijaya, Kleinberg, Robert, Singh, Rachee

arXiv.org Artificial Intelligence

Distributed machine learning workloads use data and tensor parallelism for training and inference, both of which rely on the AllReduce collective to synchronize gradients or activations. However, AllReduce algorithms are delayed by the slowest GPU to reach the synchronization barrier before the collective (i.e., the straggler). To address this challenge, we propose StragglAR: a parallel algorithm for AllReduce that accelerates distributed training and inference by exploiting natural variation in GPU execution times. StragglAR implements a ReduceScatter among the remaining GPUs during the straggler-induced delay, and then executes a novel collective algorithm to complete the AllReduce once the final GPU reaches the synchronization barrier. StragglAR achieves a 2x theoretical speedup over popular bandwidth-efficient algorithms for large GPU clusters, surpassing the lower bound for bandwidth-optimal synchronous AllReduce by leveraging the asymmetry in when GPUs reach the synchronization barrier. On an 8-GPU server, StragglAR provides a 25% speedup over state-of-the-art AllReduce algorithms.
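
To make the two-phase schedule concrete, here is a single-process numpy simulation of the idea, under simplifying assumptions: the n-1 early ranks complete a ReduceScatter during the straggler's delay, and once the straggler arrives its contribution is folded in and the completed shards are gathered everywhere. This checks correctness of the schedule only; the paper's actual completion collective and its bandwidth analysis are more sophisticated.

```python
# Conceptual simulation of a straggler-aware AllReduce (not the exact StragglAR
# collective; shard handling and communication costs are simplified).
import numpy as np

def straggler_allreduce(tensors, straggler: int):
    """tensors: equal-length 1-D arrays, one per GPU; `straggler` reaches the
    barrier last. Returns the full sum-AllReduce result for every rank."""
    n, length = len(tensors), len(tensors[0])
    workers = [r for r in range(n) if r != straggler]
    shards = np.array_split(np.arange(length), len(workers))

    # Phase 1 (during the straggler's delay): ReduceScatter over n-1 ranks,
    # so worker i owns the partial sum of shard i across all early ranks.
    partial = [sum(tensors[r][idx] for r in workers) for idx in shards]

    # Phase 2 (straggler arrives): fold its shards in, then AllGather the
    # completed shards so every rank holds the full result.
    completed = [p + tensors[straggler][idx] for p, idx in zip(partial, shards)]
    out = np.concatenate(completed)
    return [out.copy() for _ in range(n)]

# sanity check: matches a plain sum-AllReduce
xs = [np.random.randn(12) for _ in range(4)]
ref = sum(xs)
assert all(np.allclose(o, ref) for o in straggler_allreduce(xs, straggler=2))
```

The speedup intuition is visible even in this toy form: Phase 1 is work that standard synchronous AllReduce would have left until after the barrier, so it rides for free on the straggler-induced idle time.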