AITopics | fallback policy

Collaborating Authors

fallback policy

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Risk-Controlled Post-Processing of Decision Policies

Joshi, Sunay, Wang, Tao, Hassani, Hamed, Dobriban, Edgar

arXiv.org Machine LearningMay-8-2026

Predictive models are often deployed through existing decision policies that stakeholders are reluctant to change unless a risk constraint requires intervention. We study risk-controlled post-processing: given a deterministic baseline policy, choose a new policy that maximizes agreement with the baseline subject to a chance constraint on a user-specified loss. At the population level, we show that the optimal policy has a threshold structure: it follows the baseline except on contexts where switching to the oracle fallback policy yields a large reduction in conditional violation risk. At the finite-sample level, given a fitted fallback policy and score, we develop a post-processing algorithm that uses calibration data to select a threshold. Leveraging tools from algorithmic stability and stochastic processes, we show that under regularity conditions, in the i.i.d. setting, the expected excess risk of the post-processed policy is $O(\log n/n)$. In the special case when an exact-safe fallback policy is available, the algorithm achieves precise expected risk control under exchangeability. In this setting, we also give high-probability near-optimality guarantees on the post-processed policy. Experiments on a COVID-19 radiograph diagnosis task, an LLM routing problem, and a synthetic multiclass decision task show that targeted post-processing can meet or nearly meet risk budgets while preserving substantially more agreement with the baseline than score-blind random mixing.

artificial intelligence, machine learning, threshold, (18 more...)

arXiv.org Machine Learning

2605.06479

Country: North America > United States > Pennsylvania (0.40)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.49)
Health & Medicine > Diagnostic Medicine > Imaging (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

7b97adeafa1c51cf65263459ca9d0d7c-Paper-Conference.pdf

Neural Information Processing SystemsFeb-15-2026, 09:44:21 GMT

large language model, machine learning, small model, (19 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

Speculative Decoding with Big Little Decoder

Neural Information Processing SystemsOct-8-2025, 23:12:30 GMT

Figure 1: Illustration of (Left) the normal autoregressive decoding procedure of a large model and (Right) BiLD that consists of a small model and a large model. In BiLD, the small model generates tokens autoregressively (i.e., sequentially) until it hands over control to the large model.

large language model, machine learning, small model, (19 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

A universal policy wrapper with guarantees

Bolychev, Anton, Malaniya, Georgiy, Yaremenko, Grigory, Krasnaya, Anastasia, Osinenko, Pavel

arXiv.org Artificial IntelligenceMay-20-2025

-- We introduce a universal policy wrapper for reinforcement learning agents that ensures formal goal-reaching guarantees. In contrast to standard reinforcement learning algorithms that excel in performance but lack rigorous safety assurances, our wrapper selectively switches between a high-performing base policy -- derived from any existing RL method -- and a fallback policy with known convergence properties. Base policy's value function supervises this switching process, determining when the fallback policy should override the base policy to ensure the system remains on a stable path. The analysis proves that our wrapper inherits the fallback policy's goal-reaching guarantees while preserving or improving upon the performance of the base policy. Notably, it operates without needing additional system knowledge or online constrained optimization, making it readily deployable across diverse reinforcement learning architectures and tasks.

fallback policy, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2505.12354

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

The Safety Filter: A Unified View of Safety-Critical Control in Autonomous Systems

Hsu, Kai-Chieh, Hu, Haimin, Fisac, Jaime Fernández

arXiv.org Artificial IntelligenceSep-11-2023

Recent years have seen significant progress in the realm of robot autonomy, accompanied by the expanding reach of robotic technologies. However, the emergence of new deployment domains brings unprecedented challenges in ensuring safe operation of these systems, which remains as crucial as ever. While traditional model-based safe control methods struggle with generalizability and scalability, emerging data-driven approaches tend to lack well-understood guarantees, which can result in unpredictable catastrophic failures. Successful deployment of the next generation of autonomous robots will require integrating the strengths of both paradigms. This article provides a review of safety filter approaches, highlighting important connections between existing techniques and proposing a unified technical framework to understand, compare, and combine them. The new unified view exposes a shared modular structure across a range of seemingly disparate safety filter classes and naturally suggests directions for future progress towards more scalable synthesis, robust monitoring, and efficient intervention.

ieee conf, robot, safety filter, (11 more...)

arXiv.org Artificial Intelligence

2309.05837

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(9 more...)

Genre: Overview (1.00)

Industry:

Automobiles & Trucks (1.00)
Transportation > Ground > Road (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Add feedback