Representative Action Selection for Large Action Space: From Bandits to MDPs
We study the problem of selecting a small, representative action subset from an extremely large action space shared across a family of reinforcement learning (RL) environments -- a fundamental challenge in applications like inventory management and recommendation systems, where direct learning over the entire space is intractable. Our goal is to identify a fixed subset of actions that, for every environment in the family, contains a near-optimal action, thereby enabling efficient learning without exhaustively evaluating all actions. This work extends our prior results for meta-bandits to the more general setting of Markov Decision Processes (MDPs). We prove that our existing algorithm achieves performance comparable to using the full action space. This theoretical guarantee is established under a relaxed, non-centered sub-Gaussian process model, which accommodates greater environmental heterogeneity. Consequently, our approach provides a computationally and sample-efficient solution for large-scale combinatorial decision-making under uncertainty.
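For intuition only, the selection objective can be sketched as a greedy cover-style heuristic: given mean rewards for each (environment, action) pair, repeatedly add the action that most reduces the worst-case suboptimality gap. The function name, the reward-matrix setup, and the greedy rule below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def select_representative_actions(mu, k):
    """Greedily pick k actions so that every environment (row) retains a
    near-optimal action. mu[e, a] is the mean reward of action a in
    environment e. Illustrative set-cover-style heuristic only."""
    n_envs, _ = mu.shape
    best = mu.max(axis=1)                 # per-environment optimal reward
    covered = np.full(n_envs, -np.inf)    # best reward reachable with chosen set
    chosen = []
    for _ in range(k):
        # gap each environment would have if this action were added
        gaps = best[:, None] - np.maximum(covered[:, None], mu)
        scores = gaps.max(axis=0)         # worst-case gap per candidate action
        a = int(np.argmin(scores))
        chosen.append(a)
        covered = np.maximum(covered, mu[:, a])
    return chosen, float((best - covered).max())
```

With two environments whose optima disagree, a single "compromise" action that is 0.9-optimal everywhere is preferred over either environment's own optimum, which is exactly the representative-subset trade-off the abstract describes.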
Learned Structure in Cartridges: Keys as Shareable Routers in Self-Studied Representations
A bottleneck for long-context LLM inference is the linearly growing KV cache. Recent work has proposed Cartridges, an approach which leverages offline compute to train a much smaller KV cache than is typically required for a full document (up to 40x less memory usage at inference time). In this paper, we present the first mechanistic exploration of the learned Cartridge key-value cache structure. In particular, we propose that (1) Cartridge keys act as stable, shareable retrieval routers for the compressed corpora and (2) most of the learned compression occurs within the Cartridge value vectors. We present empirical evidence of our routing theory across tasks, model families, and model sizes; for example, we can ablate the learned Cartridge key vectors between tasks with little performance loss. Finally, we propose a slight improvement in initialization called Sampled Chunk Initialization (SCI). We suggest that SCI can lead to faster Cartridge convergence than previously demonstrated in the literature. Our findings lay the groundwork for broader empirical study of Cartridge training optimization which may be crucial for further scaling.
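The proposed Sampled Chunk Initialization can be pictured as seeding the small trainable cache from the document itself rather than from noise. The sketch below is a minimal illustration under assumed shapes; `sampled_chunk_init`, `doc_kv`, and the chunking details are our own illustrative names, not the paper's implementation.

```python
import numpy as np

def sampled_chunk_init(doc_kv, cache_size, chunk_len, rng=None):
    """Seed a small trainable KV cache from randomly sampled contiguous
    chunks of the full document's KV states (shape: seq_len x dim).
    The resulting (cache_size x dim) tensor would then be trained
    offline as in the Cartridges setup. Illustrative sketch only."""
    rng = rng or np.random.default_rng(0)
    seq_len, _ = doc_kv.shape
    n_chunks = cache_size // chunk_len
    starts = rng.choice(seq_len - chunk_len + 1, size=n_chunks, replace=False)
    # keep chunks in document order; each chunk is a contiguous slice
    return np.concatenate([doc_kv[s:s + chunk_len] for s in sorted(starts)])
```

The design choice is that chunk-level slices preserve local key/value structure from the corpus, which is consistent with the paper's view of keys as retrieval routers.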
Modality Matching Matters: Calibrating Language Distances for Cross-Lingual Transfer in URIEL+
Ng, York Hay, Khan, Aditya, Lu, Xiang, Salloum, Matteo, Zhou, Michael, Hoang, Phuong H., Doğruöz, A. Seza, Lee, En-Shiun Annie
Existing linguistic knowledge bases such as URIEL+ provide valuable geographic, genetic, and typological distances for cross-lingual transfer, but they suffer from two key limitations: first, their one-size-fits-all vector representations are ill-suited to the diverse structures of linguistic data, and second, they lack a principled method for aggregating these signals into a single, comprehensive score. In this paper, we address these gaps by introducing a framework for type-matched language distances. We propose novel, structure-aware representations for each distance type: speaker-weighted distributions for geography, hyperbolic embeddings for genealogy, and a latent-variable model for typology. We unify these signals into a robust, task-agnostic composite distance. In selecting transfer languages, our representations and composite distances consistently improve performance across a wide range of NLP tasks, providing a more principled and effective toolkit for multilingual research.
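For intuition, the genealogy component can use the standard Poincaré-ball distance, whose tree-like geometry suits language family trees, and the composite score can be a weighted combination of per-type distances. This is a hedged sketch with assumed names; URIEL+'s actual representations and aggregation method may differ.

```python
import numpy as np

def poincare_distance(u, v):
    """Distance in the Poincaré ball (points with norm < 1), a natural
    geometry for embedding tree-structured genealogical data."""
    uu, vv = np.dot(u, u), np.dot(v, v)
    duv = np.dot(u - v, u - v)
    return float(np.arccosh(1 + 2 * duv / ((1 - uu) * (1 - vv))))

def composite_distance(dists, weights):
    """Toy task-agnostic composite: a normalized weighted average of the
    per-type distances (e.g. geographic, genetic, typological)."""
    d = np.asarray(dists, float)
    w = np.asarray(weights, float)
    return float(np.dot(w, d) / w.sum())
```

Hyperbolic distances grow rapidly near the ball's boundary, so deep branches of a family tree can be separated far more sharply than in a Euclidean embedding of the same dimension.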
Pruning Cannot Hurt Robustness: Certified Trade-offs in Reinforcement Learning
Pedley, James, Etheridge, Benjamin, Roberts, Stephen J., Quinzan, Francesco
Reinforcement learning (RL) policies deployed in real-world environments must remain reliable under adversarial perturbations. At the same time, modern deep RL agents are heavily over-parameterized, raising costs and fragility concerns. While pruning has been shown to improve robustness in supervised learning, its role in adversarial RL remains poorly understood. We develop the first theoretical framework for certified robustness under pruning in state-adversarial Markov decision processes (SA-MDPs). For Gaussian and categorical policies with Lipschitz networks, we prove that element-wise pruning can only tighten certified robustness bounds; pruning never makes the policy less robust. Building on this, we derive a novel three-term regret decomposition that disentangles clean-task performance, pruning-induced performance loss, and robustness gains, exposing a fundamental performance-robustness frontier. Empirically, we evaluate magnitude and micro-pruning schedules on continuous-control benchmarks with strong policy-aware adversaries. Across tasks, pruning consistently uncovers reproducible "sweet spots" at moderate sparsity levels, where robustness improves substantially without harming, and sometimes even enhancing, clean performance. These results position pruning not merely as a compression tool but as a structural intervention for robust RL.
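Element-wise magnitude pruning, the operation the certificates reason about, can be sketched as follows; the function name and the schedule-free, single-shot form are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Element-wise magnitude pruning: zero out the smallest-magnitude
    entries until the given sparsity fraction is reached. Under the
    paper's Lipschitz assumptions such pruning can only tighten certified
    robustness bounds; this sketch shows the pruning mask itself."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude; entries at or below it are zeroed
    thresh = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > thresh
    return weights * mask
```

In a pruning schedule one would apply this with gradually increasing `sparsity` across training steps, which is how the "sweet spots" at moderate sparsity would be swept empirically.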
On the Occurrence of Critical Learning Periods in Neural Networks
This study examines the plasticity of neural networks, offering empirical support for the claim that critical learning periods and warm-starting performance loss can be avoided through simple adjustments to learning hyperparameters. The critical learning phenomenon emerges when training begins on deficit data: after many deficit epochs, the network's plasticity declines, and it can no longer match the accuracy of models trained from scratch, even with extensive subsequent training on clean data. Building on the seminal research that introduced critical learning periods, we replicate key findings and broaden the scope of the original work's main experiment. In addition, we consider a warm-starting approach and show that it can be seen as a form of deficit pretraining. In particular, we demonstrate that both problems can be averted by employing a cyclic learning rate schedule. Our findings not only inform neural network training practices but also establish a direct link between critical learning periods and ongoing research on warm-starting neural network training.
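The "simple adjustment" at the heart of the result, a cyclic learning rate, can be sketched as a standard triangular schedule; the default values below are illustrative, not those used in the study.

```python
def cyclic_lr(step, base_lr=1e-4, max_lr=1e-2, cycle_len=1000):
    """Triangular cyclic learning-rate schedule. Periodic returns to a
    high learning rate are the kind of simple hyperparameter adjustment
    the study finds can restore plasticity after a deficit phase."""
    pos = (step % cycle_len) / cycle_len   # position within the cycle, in [0, 1)
    tri = 1 - abs(2 * pos - 1)             # ramps 0 -> 1 -> 0 over one cycle
    return base_lr + (max_lr - base_lr) * tri
```

The intuition is that the large-learning-rate phase of each cycle partially "resets" the optimization trajectory, so a network warm-started (or deficit-pretrained) into a poor region is not stuck there.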
The paper presents an interesting idea of using a robust formulation to fit a value function given aggregate states. The robust formulation leads to a much stronger error bound than is achieved by regular approximate value iteration. The key is in using the algorithm to effectively select the weights applied to states within each aggregate. On the negative side, the authors do not present their notation and algorithm clearly, and the computational results are difficult to understand.
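To make the aggregate-state setup concrete, one hedged sketch: with a fixed aggregation, the minimax-optimal constant value for an aggregate is the midpoint of its members' values, which conveys the flavor of within-aggregate weighting that a robust formulation optimizes. The names and the exact objective here are illustrative, not the paper's algorithm.

```python
import numpy as np

def aggregate_value(v, groups):
    """Assign each aggregate a single value summarizing its member states.
    A robust formulation would choose within-aggregate weights against a
    worst case; here we use the minimax-optimal constant for each
    aggregate, the midpoint of its min and max member values, which
    minimizes the worst-case within-aggregate error."""
    v = np.asarray(v, float)
    out = {}
    for g in set(groups):
        members = v[[i for i, gi in enumerate(groups) if gi == g]]
        out[g] = 0.5 * (members.min() + members.max())
    return out
```

The midpoint halves the worst-case error relative to picking any single member's value, which is the kind of improvement over naive aggregation the review alludes to.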