 safety index


Safe Control of Quadruped in Varying Dynamics via Safety Index Adaptation

Yun, Kai S., Chen, Rui, Dunaway, Chase, Dolan, John M., Liu, Changliu

arXiv.org Artificial Intelligence

Varying dynamics pose a fundamental difficulty when deploying safe control laws in the real world. Safety Index Synthesis (SIS) relies heavily on the system dynamics, and once the dynamics change, the previously synthesized safety index becomes invalid. In this work, we show the real-time efficacy of Safety Index Adaptation (SIA) in varying dynamics. SIA enables real-time adaptation to the changing dynamics so that the adapted safe control law can still guarantee 1) forward invariance within a safe region and 2) finite-time convergence to that safe region. This work employs SIA on a package-carrying quadruped robot, where the payload weight changes in real time. SIA updates the safety index when the dynamics change, e.g., a change in payload weight, so that the quadruped can avoid obstacles while achieving its performance objectives. A numerical study provides theoretical guarantees for SIA, and a series of hardware experiments demonstrates the effectiveness of SIA in real-world deployment in avoiding obstacles under varying dynamics.
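To illustrate why a synthesized safety index can become invalid when the dynamics change, here is a minimal sketch on a toy one-dimensional model. It is not the paper's formulation: the index form phi = d_min^2 - d^2 - k * d_dot, the point-mass dynamics d_ddot = u / mass, and all names and default values (`d_min`, `k`, `eta`, `u_max`) are assumptions chosen for illustration.

```python
def safety_index(d, d_dot, d_min, k):
    """Toy safety index: phi >= 0 marks states that need a safety correction."""
    return d_min ** 2 - d ** 2 - k * d_dot


def safe_control(u_ref, d, d_dot, mass, d_min=1.0, k=1.0, eta=0.1, u_max=10.0):
    """Return (u, feasible) for the assumed dynamics d_ddot = u / mass.

    With phi_dot = -2*d*d_dot - (k / mass) * u, requiring phi_dot <= -eta
    gives the lower bound u >= mass * (eta - 2*d*d_dot) / k.
    """
    phi = safety_index(d, d_dot, d_min, k)
    if phi < 0:
        # State is inside the safe region: pass the (clipped) reference through.
        return max(-u_max, min(u_ref, u_max)), True
    u_lower = mass * (eta - 2.0 * d * d_dot) / k
    feasible = u_lower <= u_max  # can the actuator limit satisfy the constraint?
    u = max(u_ref, u_lower)
    return max(-u_max, min(u, u_max)), feasible


# Same state and same index parameter k, but a heavier payload.
light = safe_control(0.0, d=0.9, d_dot=-1.0, mass=2.0)
heavy = safe_control(0.0, d=0.9, d_dot=-1.0, mass=8.0)
# Adapting the index parameter k restores feasibility in this toy case.
heavy_adapted = safe_control(0.0, d=0.9, d_dot=-1.0, mass=8.0, k=2.0)
```

In this toy setting the heavier mass pushes the required control beyond `u_max`, and increasing `k` brings it back within the limit, which is the kind of parameter adaptation the abstract describes at a high level.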


Real-Time Safe Control of Neural Network Dynamic Models with Sound Approximation

Hu, Hanjiang, Lan, Jianglin, Liu, Changliu

arXiv.org Artificial Intelligence

Safe control of neural network dynamic models (NNDMs) is important to robotics and many other applications. However, it remains challenging to compute an optimal safe control in real time for NNDMs. To enable real-time computation, we propose to use a sound approximation of the NNDM in the control synthesis. In particular, we propose Bernstein over-approximated neural dynamics (BOND) based on the Bernstein polynomial over-approximation (BPO) of ReLU activation functions in the NNDM. To mitigate the errors introduced by the approximation and to ensure persistent feasibility of the safe control problems, we synthesize a worst-case safety index offline using the most unsafe approximated state within the BPO relaxation of the NNDM. For the online real-time optimization, we formulate the first-order Taylor approximation of the nonlinear worst-case safety constraint as an additional linear layer of the NNDM, with an ℓ2-bounded bias term for the higher-order remainder. Comprehensive experiments with different neural dynamics and safety constraints show that, with safety guaranteed, our NNDMs with sound approximation are 10-100 times faster than the safe control baseline that uses mixed integer programming (MIP), validating the effectiveness of the worst-case safety index and the scalability of the proposed BOND in real-time large-scale settings. The code is available at https://github.com/intelligent-control-lab/BOND.
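The idea of bounding ReLU from above with a Bernstein polynomial can be sketched in a few lines. This is not the BOND implementation: it only builds the classical degree-n Bernstein polynomial of ReLU on an assumed pre-activation interval [lower, upper], and relies on the known fact that the Bernstein polynomial of a convex function lies on or above that function. The function names and the chosen interval and degree are hypothetical.

```python
from math import comb


def relu(x):
    return max(0.0, x)


def bernstein_relu(x, lower, upper, degree):
    """Degree-`degree` Bernstein polynomial of ReLU on [lower, upper], evaluated at x."""
    t = (x - lower) / (upper - lower)  # map x to [0, 1]
    total = 0.0
    for k in range(degree + 1):
        node = lower + (upper - lower) * k / degree  # equally spaced sample point
        basis = comb(degree, k) * t ** k * (1.0 - t) ** (degree - k)
        total += relu(node) * basis
    return total


# Compare the polynomial with ReLU on a grid over the assumed interval [-1, 2].
grid = [-1.0 + 3.0 * i / 60 for i in range(61)]
gaps = [bernstein_relu(x, -1.0, 2.0, 4) - relu(x) for x in grid]
```

Every entry of `gaps` is non-negative up to rounding, and the gap vanishes at both interval endpoints, so the polynomial is a smooth upper bound that is tight at the ends of the interval.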


Robust Safe Control with Multi-Modal Uncertainty

Wei, Tianhao, Ma, Liqian, Pandya, Ravi, Liu, Changliu

arXiv.org Artificial Intelligence

Safety in dynamic systems with prevalent uncertainties is crucial. Current robust safe controllers, designed primarily for uni-modal uncertainties, may be either overly conservative or unsafe when handling multi-modal uncertainties. To address the problem, we introduce a novel framework for robust safe control, tailored to accommodate multi-modal Gaussian dynamics uncertainties and control limits. We first present an innovative method for deriving the least conservative robust safe control under additive multi-modal uncertainties. Next, we propose a strategy to identify a locally least-conservative robust safe control under multiplicative uncertainties. Following these, we introduce a unique safety index synthesis method. This provides the foundation for a robust safe controller that ensures a high probability of realizability under control limits and multi-modal uncertainties. Experiments on a simulated Segway validate our approach, showing consistent realizability and less conservatism than controllers designed using uni-modal uncertainty methods. The framework offers significant potential for enhancing safety and performance in robotic applications.


Zero-shot Transferable and Persistently Feasible Safe Control for High Dimensional Systems by Consistent Abstraction

Wei, Tianhao, Kang, Shucheng, Liu, Ruixuan, Liu, Changliu

arXiv.org Artificial Intelligence

Safety is critical in robotic tasks. Energy-function-based methods have been introduced to address the problem. To ensure safety in the presence of control limits, we need to design an energy function that results in persistently feasible safe control at all system states. However, designing such an energy function for high-dimensional nonlinear systems remains challenging. Considering that high-dimensional systems have redundant dynamics with respect to the safety specifications, this paper proposes a novel approach called abstract safe control. We propose a system abstraction method that enables the design of energy functions on a low-dimensional model. We can then synthesize the energy function with respect to the low-dimensional model to ensure persistent feasibility. The resulting safe controller can be directly transferred to other systems with the same abstraction, e.g., when a robot arm holds different tools. The proposed approach is demonstrated on a 7-DoF robot arm (14 states) both in simulation and in the real world. Our method always finds feasible control and achieves zero safety violations in 500 trials on 5 different systems.


Persistently Feasible Robust Safe Control by Safety Index Synthesis and Convex Semi-Infinite Programming

Wei, Tianhao, Kang, Shucheng, Zhao, Weiye, Liu, Changliu

arXiv.org Artificial Intelligence

Model mismatches prevail in real-world applications. Ensuring safety for systems with uncertain dynamic models is critical. However, existing robust safe controllers may not be realizable when control limits exist. Moreover, existing methods use loose over-approximations of uncertainties, leading to conservative safe controls. To address these challenges, we propose a control-limits-aware robust safe control framework for bounded state-dependent uncertainties. We propose safety index synthesis to find a robust safe controller guaranteed to be realizable under control limits. We then solve for robust safe control via Convex Semi-Infinite Programming, which is the tightest formulation for convex bounded uncertainties and leads to the least conservative control. In addition, we analyze when and how safety can be preserved under unmodeled uncertainties. Experimental results show that our robust safe controller is always realizable under control limits and is much less conservative than strong baselines.


Safe Interactive Industrial Robots using Jerk-based Safe Set Algorithm

Liu, Ruixuan, Chen, Rui, Liu, Changliu

arXiv.org Artificial Intelligence

The need to increase the flexibility of production lines is calling for robots to collaborate with human workers. However, existing interactive industrial robots only guarantee intrinsic safety (reduced collision impact), but not interactive safety (collision avoidance), which greatly limits their flexibility. The issue arises from two limitations in existing control software for industrial robots: 1) lack of support for real-time trajectory modification; 2) lack of intelligent safe control algorithms with guaranteed collision avoidance under robot dynamics constraints. To address the first limitation, a jerk-bounded position controller (JPC) was developed previously. This paper addresses the second limitation, on top of the JPC. Specifically, we introduce a jerk-based safe set algorithm (JSSA) to ensure collision avoidance while considering the robot dynamics constraints. The JSSA greatly extends the scope of the original safe set algorithm, which has only been applied to second-order systems with unbounded accelerations. The JSSA is implemented on the FANUC LR Mate 200iD/7L robot and validated with HRI tasks. Experiments show that the JSSA can consistently keep the robot at a safe distance from the human while executing the designated task.


Control Barrier Functions-based Semi-Definite Programs (CBF-SDPs): Robust Safe Control For Dynamic Systems with Relative Degree Two Safety Indices

Grover, Jaskaran Singh, Liu, Changliu, Sycara, Katia

arXiv.org Artificial Intelligence

In this draft article, we consider the problem of achieving safe control of a dynamic system for which the safety index (or, loosely, the control barrier function) has relative degree equal to two. We consider parameter-affine nonlinear dynamic systems and assume that the parametric uncertainty is uniform and either known a priori or updated online through an estimator/parameter adaptation law. Under this uncertainty, the usual CBF-QP safe control approach takes the form of a robust optimization problem. Both the right-hand side and the left-hand side of the inequality constraints depend on the unknown parameter. With the given representation of uncertainty, the CBF-QP safe control ends up being a convex semi-infinite problem. Using two different philosophies, one based on weak duality and another based on the lossless S-procedure, we arrive at identical SDP formulations of this robust CBF-QP problem. Thus we show that the problem of computing safe controls with known parametric uncertainty can be posed as a tractable convex problem and be solved online. (This is work in progress.)


Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk

Chen, Tianrui, Gangrade, Aditya, Saligrama, Venkatesh

arXiv.org Machine Learning

We investigate a natural but surprisingly unstudied approach to the multi-armed bandit problem under safety risk constraints. Each arm is associated with an unknown law on safety risks and rewards, and the learner's goal is to maximise reward whilst not playing unsafe arms, as determined by a given threshold on the mean risk. We formulate a pseudo-regret for this setting that enforces this safety constraint in a per-round way by softly penalising any violation, regardless of the gain in reward due to the same. This has practical relevance to scenarios such as clinical trials, where one must maintain safety for each round rather than in an aggregated sense. We describe doubly optimistic strategies for this scenario, which maintain optimistic indices for both safety risk and reward. We show that schema based on both frequentist and Bayesian indices satisfy tight gap-dependent logarithmic regret bounds, and further that these play unsafe arms only logarithmically many times in total. This theoretical analysis is complemented by simulation studies demonstrating the effectiveness of the proposed schema, and probing the domains in which their use is appropriate.
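A doubly optimistic selection rule of the kind described above can be sketched as follows. This is only an illustration under stated assumptions: the Hoeffding-style confidence bonus sqrt(2 ln t / n), the function name, and the fallback behaviour are hypothetical and are not taken from the paper, whose frequentist and Bayesian indices may differ.

```python
import math


def doubly_optimistic_arm(reward_means, risk_means, counts, t, threshold):
    """Pick the arm with the highest optimistic reward among plausibly safe arms.

    An arm is treated as plausibly safe if its optimistic (lower) risk index
    does not exceed `threshold`. Returns None if no arm qualifies.
    """
    best_arm, best_index = None, -math.inf
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # play every arm once before using the indices
        bonus = math.sqrt(2.0 * math.log(t) / n)  # assumed confidence width
        risk_index = risk_means[arm] - bonus      # optimistic about safety
        reward_index = reward_means[arm] + bonus  # optimistic about reward
        if risk_index <= threshold and reward_index > best_index:
            best_arm, best_index = arm, reward_index
    return best_arm


# Arm 0 pays more but is clearly risky once enough samples are collected.
early = doubly_optimistic_arm([0.9, 0.5], [0.9, 0.1], counts=[1, 1], t=2, threshold=0.3)
late = doubly_optimistic_arm([0.9, 0.5], [0.9, 0.1], counts=[100, 100], t=200, threshold=0.3)
```

With few samples the risky arm still looks plausibly safe and is tried; once its risk estimate tightens above the threshold, it is excluded and the safe arm is played instead.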


Joint Synthesis of Safety Certificate and Safe Control Policy using Constrained Reinforcement Learning

Ma, Haitong, Liu, Changliu, Li, Shengbo Eben, Zheng, Sifa, Chen, Jianyu

arXiv.org Artificial Intelligence

Safety is the major consideration in controlling complex dynamical systems using reinforcement learning (RL), where a safety certificate can provide a provable safety guarantee. A valid safety certificate is an energy function indicating that safe states have low energy, and for which there exists a corresponding safe control policy that allows the energy function to always dissipate. The safety certificate and the safe control policy are closely related to each other, and both are challenging to synthesize. Therefore, existing learning-based studies treat one of them as prior knowledge to learn the other, which limits their applicability to general unknown dynamics. This paper proposes a novel approach that simultaneously synthesizes the energy-function-based safety certificate and learns the safe control policy with constrained reinforcement learning (CRL). We do not rely on prior knowledge about either an available model-based controller or a perfect safety certificate. In particular, we formulate a loss function to optimize the safety certificate parameters by minimizing the occurrence of energy increases. By adding this optimization procedure as an outer loop to Lagrangian-based CRL, we jointly update the policy and safety certificate parameters and prove that they will converge to their respective local optima, the optimal safe policy and a valid safety certificate. We evaluate our algorithms on multiple safety-critical benchmark environments. The results show that the proposed algorithm learns provably safe policies with no constraint violation. The validity or feasibility of the synthesized safety certificate is also verified numerically.
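One way to turn "minimizing the occurrence of energy increases" into a loss is a hinge penalty on transitions whose energy fails to dissipate. The sketch below is an assumption made for illustration, not the paper's loss: the dissipation target max(phi - eta, 0), the margin `eta`, and the function name are hypothetical.

```python
def energy_increase_loss(phi_now, phi_next, eta=0.0):
    """Mean hinge penalty over transitions where the energy does not dissipate.

    A transition is penalized when phi_next exceeds the assumed target
    max(phi_now - eta, 0), i.e. the energy neither decreased by the margin
    eta nor stayed at or below zero.
    """
    if len(phi_now) != len(phi_next) or not phi_now:
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for current, following in zip(phi_now, phi_next):
        target = max(current - eta, 0.0)
        total += max(following - target, 0.0)
    return total / len(phi_now)


# Only the third transition violates the dissipation target (2.5 > 1.9).
loss = energy_increase_loss([1.0, -0.5, 2.0], [0.5, -0.2, 2.5], eta=0.1)
```

In a joint-synthesis loop, a quantity like this would be evaluated on sampled transitions and minimized with respect to the certificate parameters; the hinge form keeps transitions that already dissipate from contributing to the loss.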


Learn Zero-Constraint-Violation Policy in Model-Free Constrained Reinforcement Learning

Ma, Haitong, Liu, Changliu, Li, Shengbo Eben, Zheng, Sifa, Sun, Wenchao, Chen, Jianyu

arXiv.org Artificial Intelligence

In the trial-and-error mechanism of reinforcement learning (RL), a notorious contradiction arises when we expect to learn a safe policy: how can a safe policy be learned without enough data or a prior model of the dangerous region? Existing methods mostly use a posterior penalty for dangerous actions, which means that the agent is not penalized until it experiences danger. As a result, the agent cannot learn a zero-violation policy even after convergence; otherwise, it would not receive any penalty and would lose its knowledge about danger. In this paper, we propose the safe set actor-critic (SSAC) algorithm, which confines the policy update using safety-oriented energy functions, or safety indexes. The safety index is designed to increase rapidly for potentially dangerous actions, which allows us to locate the safe set on the action space, or the control safe set. Therefore, we can identify dangerous actions prior to taking them, and further obtain a zero-constraint-violation policy after convergence. We claim that we can learn the energy function in a model-free manner similar to learning a value function. By using the energy function transition as the constraint objective, we formulate a constrained RL problem. We prove that our Lagrangian-based solutions ensure that the learned policy converges to the constrained optimum under some assumptions. The proposed algorithm is evaluated on both complex simulation environments and a hardware-in-loop (HIL) experiment with a real controller from an autonomous vehicle. Experimental results suggest that the converged policies in all environments achieve zero constraint violation and performance comparable to model-based baselines. Reinforcement learning has drawn rapidly growing attention for its superhuman learning capabilities in many sequential decision-making problems like Go [1], Atari Games [2], and StarCraft [3].