AITopics | staleness


staleness

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sageflow: Robust Federated Learning against Both Stragglers and Adversaries

Neural Information Processing Systems | Apr-24-2026, 12:48:31 GMT

While federated learning (FL) allows efficient model training with local data at edge devices, among major issues still to be resolved are: slow devices known as stragglers and malicious attacks launched by adversaries. While the presence of both of these issues raises serious concerns in practical FL systems, no known schemes or combinations of schemes effectively address them at the same time. We propose Sageflow, staleness-aware grouping with entropy-based filtering and loss-weighted averaging, to handle both stragglers and adversaries simultaneously. Model grouping and weighting according to staleness (arrival delay) provides robustness against stragglers, while entropy-based filtering and loss-weighted averaging, working in a highly complementary fashion at each grouping stage, counter a wide range of adversary attacks. A theoretical bound is established to provide key insights into the convergence behavior of Sageflow. Extensive experimental results show that Sageflow outperforms various existing methods aiming to handle stragglers/adversaries.
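The abstract names three ingredients (staleness-aware grouping, entropy-based filtering, loss-weighted averaging) but gives no formulas. The sketch below is one possible reading, not the authors' algorithm: the dictionary keys, the entropy threshold, and the power-law forms of the loss and staleness weights are all assumptions made for illustration. It assumes the server holds a small trusted dataset on which each received model's predictions and loss can be evaluated.

```python
import math
from collections import defaultdict


def mean_entropy(prob_rows):
    """Average Shannon entropy (in nats) over a list of class distributions."""
    total = 0.0
    for probs in prob_rows:
        total += -sum(p * math.log(p) for p in probs if p > 0.0)
    return total / len(prob_rows)


def weighted_average(vectors, weights):
    """Coordinate-wise weighted mean of equally sized parameter vectors."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) / total
            for i in range(dim)]


def aggregate(updates, entropy_threshold=1.0, loss_power=1.0, staleness_power=0.5):
    """Combine client updates into one parameter vector.

    Each update is a dict with hypothetical keys:
      'params'    : list of floats (model parameters)
      'probs'     : the model's predictions on a small trusted dataset
      'loss'      : the model's loss on that same dataset
      'staleness' : rounds elapsed since the client fetched the global model
    """
    # Stage 1: group updates by staleness (arrival delay).
    groups = defaultdict(list)
    for update in updates:
        groups[update['staleness']].append(update)

    group_models, group_weights = [], []
    for staleness, members in groups.items():
        # Stage 2: drop models whose predictions look too uncertain (assumed rule).
        kept = [u for u in members if mean_entropy(u['probs']) <= entropy_threshold]
        if not kept:
            continue
        # Stage 3: within a group, lower loss means higher weight (assumed form).
        loss_weights = [1.0 / (u['loss'] ** loss_power + 1e-12) for u in kept]
        group_models.append(weighted_average([u['params'] for u in kept], loss_weights))
        # Stage 4: staler groups count less in the final mix (assumed form).
        group_weights.append(1.0 / (1 + staleness) ** staleness_power)

    if not group_models:
        raise ValueError("all updates were filtered out")
    return weighted_average(group_models, group_weights)
```

In this reading, a poisoned model that produces near-uniform predictions is removed by the entropy filter before it can affect the average, and a late-arriving honest model still contributes, only with reduced weight.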

adversary, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)


Delightful Distributed Policy Gradient

Osband, Ian

arXiv.org Machine Learning | Mar-24-2026

Distributed reinforcement learning trains on data from stale, buggy, or mismatched actors, producing actions with high surprisal (negative log-probability) under the learner's policy. The core difficulty is not surprising data per se, but negative learning from surprising data. High-surprisal failures can dominate the update direction despite carrying little useful signal, while high-surprisal successes reveal opportunities the current policy would otherwise miss. The Delightful Policy Gradient (DG) separates these cases by gating each update with delight, the product of advantage and surprisal, suppressing rare failures and amplifying rare successes without behavior probabilities. Under contaminated sampling, the cosine similarity between the standard policy gradient and the true gradient collapses, while DG's grows as the policy improves. No sign-blind reweighting, including exact importance sampling, can reproduce this effect. On MNIST with simulated staleness, DG without off-policy correction outperforms importance-weighted PG with exact behavior probabilities. On a transformer sequence task with staleness, actor bugs, reward corruption, and rare discovery, DG achieves roughly 10× lower error. When all four frictions act simultaneously, its compute advantage is order-of-magnitude and grows with task complexity.
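The abstract defines delight as the product of advantage and surprisal, but it does not say which gate function is applied to it. As a rough illustration only, the sketch below assumes a sigmoid gate with a hypothetical `temperature` parameter; the function names are likewise invented here. Each returned coefficient is the scalar that would multiply the gradient of the log-probability of the sampled action in a policy-gradient update.

```python
import math


def surprisal(prob):
    """Negative log-probability of the sampled action under the learner's policy."""
    return -math.log(prob)


def delight(advantage, prob):
    """Delight as described in the abstract: advantage times surprisal."""
    return advantage * surprisal(prob)


def gate(d, temperature=1.0):
    """Numerically stable sigmoid of delight (the sigmoid form is an assumption)."""
    z = d / temperature
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gated_coefficients(advantages, probs, temperature=1.0):
    """Per-sample multipliers for grad log pi(a|s): gate(delight) * advantage."""
    return [gate(delight(a, p), temperature) * a
            for a, p in zip(advantages, probs)]


# A rare failure and a rare success, both with learner probability 0.01.
coeffs = gated_coefficients([-1.0, 1.0], [0.01, 0.01])
print(coeffs)
```

With this gate, the rare failure is almost switched off while the rare success passes nearly at full strength. The gate depends on the sign of the advantage, and it uses only the learner's own probabilities, not those of the behavior policy.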

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Machine Learning

2603.20521

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.35)


Recommendation Models

Neural Information Processing Systems | Feb-11-2026, 16:32:29 GMT

Although synchronous AR training is designed to have higher training efficiency, asynchronous PS training would be a better choice for training speed when there are stragglers (slow workers) in the shared cluster, especially under limited computing resources.

artificial intelligence, machine learning, staleness, (17 more...)

Neural Information Processing Systems

Country:

Europe > Czechia > Prague (0.05)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Supplementary Material

Neural Information Processing Systems | Feb-11-2026, 10:15:46 GMT

R(h). (23) Here, for simplicity, we abused the symbol D in (22) by maximizing out h0 in the original D. In the top-left area P, suppose only one example (marked by x with vertical coordinate 1) is confidently labeled as positive, and the rest examples are highly inconfidently labeled, hence not to contribute to the risk R. Similarly, there is only one confidently labeled example () in the bottom-right area of P, and it is negative with vertical coordinate 1. Whenever λ > 2, the optimal h_λ is in (0,1) and can be solved by a quadratic equation. In contrast, di-MDD is immune to this problem because R is used only to determine h, while the di-MDD value itself is solely contributed by D. Same as the scenario of large λ, we do not change the feature distribution of source and target domains, hence keeping D(h) = 1 |h|.

artificial intelligence, machine learning, supplementarymaterial, (19 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Large Graph Property Prediction via Graph Segment Training

Neural Information Processing Systems | Feb-11-2026, 08:57:25 GMT

Learning to predict properties of a large graph is challenging because each prediction requires the knowledge of an entire graph, while the amount of memory available during training is bounded.

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining (0.68)


725ce5f2b1a8e2e0ac66994e7fefe375-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 18:50:34 GMT

gradient, sapipe, staleness, (16 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Workflow (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)


SAPipe: Staleness-Aware Pipeline for Data-Parallel DNN Training

Neural Information Processing Systems | Feb-9-2026, 18:50:30 GMT

In this paper, we propose SAPipe, a performant system that pushes the training speed of data parallelism to its fullest extent.

machine learning, natural language, sapipe, (21 more...)

Neural Information Processing Systems

Country:

Europe > Russia (0.04)
Asia > Russia (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

Fu, Wei, Gao, Jiaxuan, Shen, Xujie, Zhu, Chen, Mei, Zhiyu, He, Chuyi, Xu, Shusheng, Wei, Guo, Mei, Jun, Wang, Jiashu, Yang, Tongkai, Yuan, Binhang, Wu, Yi

arXiv.org Artificial Intelligence | Nov-26-2025

Reinforcement learning (RL) has become a dominant paradigm for training large language models (LLMs), particularly for reasoning tasks. Effective RL for LLMs requires massive parallelization and poses an urgent need for efficient training systems. Most existing large-scale RL systems for LLMs are synchronous, alternating generation and training in a batch setting where rollouts in each training batch are generated by the same model. This approach stabilizes RL training but suffers from severe system-level inefficiency: generation must wait until the longest output in the batch is completed before model updates, resulting in GPU underutilization. We present AReaL, a fully asynchronous RL system that completely decouples generation from training. Rollout workers in AReaL continuously generate new outputs without waiting, while training workers update the model whenever a batch of data is collected. AReaL also incorporates a collection of system-level optimizations, leading to substantially higher GPU utilization. To stabilize RL training, AReaL balances the workload of rollout and training workers to control data staleness, and adopts a staleness-enhanced PPO variant to better handle outdated training samples. Extensive experiments on math and code reasoning benchmarks show that AReaL achieves up to 2.77× training speedup compared to synchronous systems with the same number of GPUs and matched or improved final performance. The code of AReaL is available at https://github.com/inclusionAI/AReaL/.
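The abstract says data staleness is controlled but does not describe the mechanism. One simple way to picture a staleness bound is a buffer that hands the trainer only rollouts whose generating policy version is within a fixed lag of the learner's current version. The sketch below is such a toy, not AReaL's implementation: the class `StalenessBoundedBuffer`, the `Rollout` record, and the `max_staleness` parameter are hypothetical names introduced here.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Rollout:
    policy_version: int  # version of the model that generated this sample
    payload: str         # stand-in for the generated tokens and rewards


class StalenessBoundedBuffer:
    """FIFO buffer that discards rollouts older than `max_staleness` versions."""

    def __init__(self, max_staleness):
        self.max_staleness = max_staleness
        self._queue = deque()

    def add(self, rollout):
        """Called by rollout workers, which never block on the trainer."""
        self._queue.append(rollout)

    def pop_batch(self, batch_size, learner_version):
        """Called by the trainer; returns up to `batch_size` fresh-enough rollouts."""
        batch = []
        while self._queue and len(batch) < batch_size:
            rollout = self._queue.popleft()
            lag = learner_version - rollout.policy_version
            if lag <= self.max_staleness:
                batch.append(rollout)
            # Rollouts that exceed the bound are silently dropped.
        return batch


buffer = StalenessBoundedBuffer(max_staleness=2)
for version in (0, 3, 4):
    buffer.add(Rollout(policy_version=version, payload=f"sample-from-v{version}"))
batch = buffer.pop_batch(batch_size=2, learner_version=5)
print([r.policy_version for r in batch])
```

Dropping stale samples wastes generation work, which is one reason a real system would also want to balance rollout and training throughput so that few samples ever exceed the bound.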

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.24298

Country:

North America > United States (0.68)
Europe > Austria > Vienna (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)


ParaBlock: Communication-Computation Parallel Block Coordinate Federated Learning for Large Language Models

Wang, Yujia, Cao, Yuanpu, Chen, Jinghui

arXiv.org Artificial Intelligence | Nov-26-2025

Federated learning (FL) has been extensively studied as a privacy-preserving training paradigm. Recently, federated block coordinate descent scheme has become a popular option in training large-scale models, as it allows clients to train only a subset of the model locally instead of the entire model. However, in the era of large language models (LLMs), even a single block can contain a significant number of parameters, posing substantial communication latency, particularly for resource-constrained clients. To address this challenge in federated training/fine-tuning LLMs, we propose ParaBlock, a novel approach that establishes two parallel threads for communication and computation to enhance communication efficiency. We theoretically prove that the proposed ParaBlock achieves the same convergence rate as the standard federated block coordinate descent methods. Empirical evaluations on fine-tuning LLMs on general instruction following and mathematical reasoning confirm that ParaBlock not only maintains strong performance but also significantly improves communication efficiency.
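The abstract describes two parallel threads, one for communication and one for computation, without detailing the protocol. The general idea of overlapping the two can be sketched as follows: while a background thread uploads the update for one block, the main thread already trains the next block. Everything here (`train_block`, `upload`, `run_rounds`) is a hypothetical stand-in, and the sketch should not be read as ParaBlock's actual schedule.

```python
from concurrent.futures import ThreadPoolExecutor


def train_block(block_id):
    """Stand-in for local training of one parameter block."""
    return f"update-{block_id}"


def upload(update, server_log):
    """Stand-in for sending an update to the server."""
    server_log.append(update)


def run_rounds(block_ids):
    """Train blocks in order while the previous block's upload runs in parallel."""
    server_log = []
    pending = None
    # A single worker thread plays the role of the communication thread.
    with ThreadPoolExecutor(max_workers=1) as comm:
        for block_id in block_ids:
            # Computation happens here, overlapping with any upload in flight.
            update = train_block(block_id)
            if pending is not None:
                pending.result()  # make sure the previous upload finished
            pending = comm.submit(upload, update, server_log)
        if pending is not None:
            pending.result()
    return server_log


print(run_rounds([0, 1, 2]))
```

With real network latency, the time spent in `upload` for one block is hidden behind `train_block` for the next, which is the kind of saving the abstract attributes to running communication and computation in parallel.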

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.19959

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

