AITopics | viability kernel

Collaborating Authors

viability kernel

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Infinite-Horizon Reach-Avoid Zero-Sum Games via Deep Reinforcement Learning

Li, Jingqi, Lee, Donggun, Sojoudi, Somayeh, Tomlin, Claire J.

arXiv.org Artificial IntelligenceSep-18-2024

In this paper, we consider the infinite-horizon reach-avoid zero-sum game problem, where the goal is to find a set in the state space, referred to as the reach-avoid set, such that the system starting at a state therein could be controlled to reach a given target set without violating constraints under the worst-case disturbance. We address this problem by designing a new value function with a contracting Bellman backup, where the super-zero level set, i.e., the set of states where the value function is evaluated to be non-negative, recovers the reach-avoid set. Building upon this, we prove that the proposed method can be adapted to compute the viability kernel, or the set of states which could be controlled to satisfy given constraints, and the backward reachable set, or the set of states that could be driven towards a given target set. Finally, we propose to alleviate the curse of dimensionality issue in high-dimensional problems by extending Conservative Q-Learning, a deep reinforcement learning technique, to learn a value function such that the super-zero level set of the learned value function serves as a (conservative) approximation to the reach-avoid set. Our theoretical and empirical results suggest that the proposed method could learn reliably the reach-avoid set and the optimal control policy even with neural network approximation.

constraint, value function, viability kernel, (14 more...)

arXiv.org Artificial Intelligence

2203.10142

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.84)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Do No Harm: A Counterfactual Approach to Safe Reinforcement Learning

Vaskov, Sean, Schwarting, Wilko, Baker, Chris L.

arXiv.org Artificial IntelligenceMay-19-2024

Reinforcement Learning (RL) for control has become increasingly popular due to its ability to learn rich feedback policies that take into account uncertainty and complex representations of the environment. When considering safety constraints, constrained optimization approaches, where agents are penalized for constraint violations, are commonly used. In such methods, if agents are initialized in, or must visit, states where constraint violation might be inevitable, it is unclear how much they should be penalized. We address this challenge by formulating a constraint on the counterfactual harm of the learned policy compared to a default, safe policy. In a philosophical sense this formulation only penalizes the learner for constraint violations that it caused; in a practical sense it maintains feasibility of the optimal control problem. We present simulation studies on a rover with uncertain road friction and a tractor-trailer parking environment that demonstrate our constraint formulation enables agents to learn safer policies than contemporary constrained RL methods.

constraint violation, default policy, viability kernel, (12 more...)

arXiv.org Artificial Intelligence

2405.11669

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Transportation > Ground > Road (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)

Add feedback

VBOC: Learning the Viability Boundary of a Robot Manipulator using Optimal Control

La Rocca, Asia, Saveriano, Matteo, Del Prete, Andrea

arXiv.org Artificial IntelligenceSep-11-2023

Safety is often the most important requirement in robotics applications. Nonetheless, control techniques that can provide safety guarantees are still extremely rare for nonlinear systems, such as robot manipulators. A well-known tool to ensure safety is the Viability kernel, which is the largest set of states from which safety can be ensured. Unfortunately, computing such a set for a nonlinear system is extremely challenging in general. Several numerical algorithms for approximating it have been proposed in the literature, but they suffer from the curse of dimensionality. This paper presents a new approach for numerically approximating the viability kernel of robot manipulators. Our approach solves optimal control problems to compute states that are guaranteed to be on the boundary of the set. This allows us to learn directly the set boundary, therefore learning in a smaller dimensional space. Compared to the state of the art on systems up to dimension 6, our algorithm resulted to be more than 2 times as accurate for the same computation time, or 6 times as fast to reach the same accuracy.

algorithm, trajectory, vboc, (15 more...)

arXiv.org Artificial Intelligence

2305.07535

Country:

Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)
Asia (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Patching Neural Barrier Functions Using Hamilton-Jacobi Reachability

Tonkens, Sander, Toofanian, Alex, Qin, Zhizhen, Gao, Sicun, Herbert, Sylvia

arXiv.org Artificial IntelligenceApr-19-2023

Learning-based control algorithms have led to major advances in robotics at the cost of decreased safety guarantees. Recently, neural networks have also been used to characterize safety through the use of barrier functions for complex nonlinear systems. Learned barrier functions approximately encode and enforce a desired safety constraint through a value function, but do not provide any formal guarantees. In this paper, we propose a local dynamic programming (DP) based approach to "patch" an almost-safe learned barrier at potentially unsafe points in the state space. This algorithm, HJ-Patch, obtains a novel barrier that provides formal safety guarantees, yet retains the global structure of the learned barrier. Our local DP based reachability algorithm, HJ-Patch, updates the barrier function "minimally" at points that both (a) neighbor the barrier safety boundary and (b) do not satisfy the safety condition. We view this as a key step to bridging the gap between learning-based barrier functions and Hamilton-Jacobi reachability analysis, providing a framework for further integration of these approaches. We demonstrate that for well-trained barriers we reduce the computational load by 2 orders of magnitude with respect to standard DP-based reachability, and demonstrate scalability to a 6-dimensional system, which is at the limit of standard DP-based reachability.

artificial intelligence, machine learning, value function, (16 more...)

arXiv.org Artificial Intelligence

2304.0985

Country: North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Add feedback

Safe Value Functions

Massiani, Pierre-François, Heim, Steve, Solowjow, Friedrich, Trimpe, Sebastian

arXiv.org Artificial IntelligenceDec-1-2022

Safety constraints and optimality are important, but sometimes conflicting criteria for controllers. Although these criteria are often solved separately with different tools to maintain formal guarantees, it is also common practice in reinforcement learning to simply modify reward functions by penalizing failures, with the penalty treated as a mere heuristic. We rigorously examine the relationship of both safety and optimality to penalties, and formalize sufficient conditions for safe value functions (SVFs): value functions that are both optimal for a given task, and enforce safety constraints. We reveal this structure by examining when rewards preserve viability under optimal control, and show that there always exists a finite penalty that induces a safe value function. This penalty is not unique, but upper-unbounded: larger penalties do not harm optimality. Although it is often not possible to compute the minimum required penalty, we reveal clear structure of how the penalty, rewards, discount factor, and dynamics interact. This insight suggests practical, theory-guided heuristics to design reward functions for control problems where safety is important.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TAC.2022.3200948

2105.12204

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)

Add feedback

Refining Control Barrier Functions through Hamilton-Jacobi Reachability

Tonkens, Sander, Herbert, Sylvia

arXiv.org Artificial IntelligenceAug-19-2022

Safety filters based on Control Barrier Functions (CBFs) have emerged as a practical tool for the safety-critical control of autonomous systems. These approaches encode safety through a value function and enforce safety by imposing a constraint on the time derivative of this value function. However, synthesizing a valid CBF that is not overly conservative in the presence of input constraints is a notorious challenge. In this work, we propose refining a candidate CBF using formal verification methods to obtain a valid CBF. In particular, we update an expert-synthesized or backup CBF using dynamic programming (DP) based reachability analysis. Our framework, refineCBF, guarantees that with every DP iteration the obtained CBF is provably at least as safe as the prior iteration and converges to a valid CBF. Therefore, refineCBF can be used in-the-loop for robotic systems. We demonstrate the practicality of our method to enhance safety and/or reduce conservativeness on a range of nonlinear control-affine systems using various CBF synthesis techniques in simulation.

cbf, constraint, refine cbf, (15 more...)

arXiv.org Artificial Intelligence

2204.12507

Country:

North America > United States > Indiana (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.68)

Add feedback

A Learnable Safety Measure

Heim, Steve, von Rohr, Alexander, Trimpe, Sebastian, Badri-Spröwitz, Alexander

arXiv.org Artificial IntelligenceOct-7-2019

Failures are challenging for learning to control physical systems since they risk damage, time-consuming resets, and often provide little gradient information. Adding safety constraints to exploration typically requires a lot of prior knowledge and domain expertise. We present a safety measure which implicitly captures how the system dynamics relate to a set of failure states. Not only can this measure be used as a safety function, but also to directly compute the set of safe state-action pairs. Further, we show a model-free approach to learn this measure by active sampling using Gaussian processes. While safety can only be guaranteed after learning the safety measure, we show that failures can already be greatly reduced by using the estimated measure during learning.

safety measure, state-action pair, state-action space, (15 more...)

arXiv.org Artificial Intelligence

1910.02835

Country:

Europe > Germany (0.04)
North America > United States (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

Learning from Outside the Viability Kernel: Why we Should Build Robots that can Fall with Grace

Heim, Steve, Spröwitz, Alexander

arXiv.org Artificial IntelligenceJun-18-2018

Despite impressive results using reinforcement learning to solve complex problems from scratch, in robotics this has still been largely limited to model-based learning with very informative reward functions. One of the major challenges is that the reward landscape often has large patches with no gradient, making it difficult to sample gradients effectively. We show here that the robot state-initialization can have a more important effect on the reward landscape than is generally expected. In particular, we show the counter-intuitive benefit of including initializations that are unviable, in other words initializing in states that are doomed to fail.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/SIMPAR.2018.8376271

1806.06569

Country: Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.47)

Add feedback