 double integrator


Probabilistic Safety Guarantee for Stochastic Control Systems Using Average Reward MDPs

arXiv.org Artificial Intelligence

Safety in stochastic control systems, which are subject to random noise with a known probability distribution, aims to compute policies that satisfy predefined operational constraints with high confidence throughout the uncertain evolution of the state variables. The unpredictable evolution of state variables poses a significant challenge for meeting predefined constraints using various control methods. To address this, we present a new algorithm that computes safe policies and determines the safety level across a finite state set. This algorithm reduces the safety objective to the standard average-reward Markov Decision Process (MDP) objective. This reduction enables us to use standard techniques, such as linear programs, to compute and analyze safe policies. We validate the proposed method numerically on the Double Integrator and the Inverted Pendulum systems. Results indicate that the average-reward MDP solution is more comprehensive, converges faster, and offers higher quality compared to the minimum discounted-reward solution. Keywords: Safety Critical Systems, Robotics, Average Reward MDPs, Stochastic Control.
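
For orientation, the textbook linear program for a finite average-reward MDP can be written as below. This is a general sketch, not necessarily the formulation used in the paper: the symbols $x(s,a)$ (stationary state-action frequencies), $r(s,a)$ (reward), and $P(s' \mid s,a)$ (transition kernel) are assumed notation, and the program is stated under the usual unichain assumption.

$$
\begin{aligned}
\max_{x \ge 0}\quad & \sum_{s,a} r(s,a)\, x(s,a) \\
\text{s.t.}\quad & \sum_{a} x(s',a) = \sum_{s,a} P(s' \mid s,a)\, x(s,a) \quad \text{for all } s', \\
& \sum_{s,a} x(s,a) = 1 .
\end{aligned}
$$

Under that assumption the optimal value equals the optimal long-run average reward, and a policy can be read off as $\pi(a \mid s) \propto x(s,a)$ on states where $\sum_a x(s,a) > 0$.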


Verifiable Safety Q-Filters via Hamilton-Jacobi Reachability and Multiplicative Q-Networks

arXiv.org Artificial Intelligence

Recent learning-based safety filters have outperformed conventional methods, such as hand-crafted Control Barrier Functions (CBFs), by effectively adapting to complex constraints. However, these learning-based approaches lack formal safety guarantees. In this work, we introduce a verifiable model-free safety filter based on Hamilton-Jacobi reachability analysis. Our primary contributions include: 1) extending verifiable self-consistency properties for Q value functions, 2) proposing a multiplicative Q-network structure to mitigate zero-sublevel-set shrinkage issues, and 3) developing a verification pipeline capable of soundly verifying these self-consistency properties. Our proposed approach successfully synthesizes formally verified, model-free safety certificates across four standard safe-control benchmarks.


How to Train Your Neural Control Barrier Function: Learning Safety Filters for Complex Input-Constrained Systems

arXiv.org Artificial Intelligence

Control barrier functions (CBF) have become popular as a safety filter to guarantee the safety of nonlinear dynamical systems for arbitrary inputs. However, it is difficult to construct functions that satisfy the CBF constraints for high relative degree systems with input constraints. To address these challenges, recent work has explored learning CBFs using neural networks via neural CBF (NCBF). However, such methods face difficulties when scaling to higher dimensional systems under input constraints. In this work, we first identify challenges that NCBFs face during training. Next, to address these challenges, we propose policy neural CBF (PNCBF), a method of constructing CBFs by learning the value function of a nominal policy, and show that the value function of the maximum-over-time cost is a CBF. We demonstrate the effectiveness of our method in simulation on a variety of systems ranging from toy linear systems to an F-16 jet with a 16-dimensional state space. Finally, we validate our approach on a two-agent quadcopter system on hardware under tight input constraints.


Bang-Bang Boosting of RRTs

arXiv.org Artificial Intelligence

This paper presents methods for dramatically improving the performance of sampling-based kinodynamic planners. The key component is the first-known complete, exact steering method that produces a time-optimal trajectory between any states for a vector of synchronized double integrators. This method is applied in three ways: 1) to generate RRT edges that quickly solve the two-point boundary-value problems, 2) to produce a (quasi)metric for more accurate Voronoi bias in RRTs, and 3) to iteratively time-optimize a given collision-free trajectory. Experiments are performed for state spaces with up to 2000 dimensions, resulting in improved computed trajectories and orders of magnitude computation time improvements over using ordinary metrics and constant controls.
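
As a much simpler illustration of bang-bang steering than the paper's general method, the sketch below handles only the one-dimensional rest-to-rest case for $\ddot{x} = u$ with $|u| \le u_{\max}$, where the time-optimal control accelerates fully for half the time and brakes fully for the other half. The function names and the simulation check are hypothetical and are not taken from the paper.

```python
import math


def rest_to_rest_bang_bang(distance, u_max):
    """Time-optimal rest-to-rest move for x'' = u with |u| <= u_max.

    Returns (direction, t_switch, t_final): apply direction * u_max until
    t_switch, then -direction * u_max until t_final.
    """
    if u_max <= 0:
        raise ValueError("u_max must be positive")
    direction = math.copysign(1.0, distance) if distance != 0 else 0.0
    # Accelerating over half the distance takes sqrt(|d| / u_max).
    t_switch = math.sqrt(abs(distance) / u_max)
    return direction, t_switch, 2.0 * t_switch


def simulate(distance, u_max, steps=2000):
    """Integrate the bang-bang control exactly, starting at rest at x = 0."""
    direction, _, t_final = rest_to_rest_bang_bang(distance, u_max)
    dt = t_final / steps
    x, v = 0.0, 0.0
    for k in range(steps):
        # The switch falls exactly on a step boundary because steps is even.
        u = direction * u_max if k < steps // 2 else -direction * u_max
        # Exact update for a control that is constant over the step.
        x += v * dt + 0.5 * u * dt * dt
        v += u * dt
    return x, v
```

Calling `simulate(4.0, 1.0)` should end at position 4 with zero velocity, up to floating-point error. The paper's steering method covers arbitrary start and goal states for a vector of synchronized double integrators, which this sketch does not attempt.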


Trajectory tracking control of the second-order chained form system by using state transitions

arXiv.org Artificial Intelligence

This paper proposes a novel control approach for the second-order chained form system, composed of sinusoidal reference trajectories and a trajectory tracking controller. The system is well known as a canonical form for a class of second-order nonholonomic systems obtained by appropriate transformation of the generalized coordinates and control inputs. The system is decomposed into three subsystems: two of them are so-called double integrators, and the third is a nonlinear system that depends on one of the double integrators. The double integrators are linearly controllable, which makes it possible to transition the value of the position state in order to modify the nature of the nonlinear subsystem that depends on it. Transitioning the value to "one" turns the nonlinear subsystem into a double integrator; transitioning the value to "zero" turns it into an uncontrollable linear autonomous system. Exploiting this property, this paper proposes a feedforward control strategy. Furthermore, for practical usefulness, the control strategy is extended to trajectory tracking control by using proportional-derivative feedback. The effectiveness of the proposed method is demonstrated through several numerical experiments, including an application to an underactuated manipulator.
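
To make the combination of feedforward and proportional-derivative feedback concrete, here is a minimal sketch for a single double integrator $\ddot{x} = u$ tracking the reference $x_{\mathrm{ref}}(t) = \sin t$. It covers only one double-integrator subsystem, not the full chained form system, and the gains, step size, and function name are illustrative assumptions rather than values from the paper.

```python
import math


def track_sinusoid(kp=4.0, kd=4.0, dt=1e-3, t_end=10.0, x0=1.0, v0=0.0):
    """Track x_ref(t) = sin(t) with feedforward plus PD feedback on x'' = u.

    Returns the final position and velocity tracking errors.
    """
    x, v = x0, v0
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        x_ref, v_ref, a_ref = math.sin(t), math.cos(t), -math.sin(t)
        # Feedforward reference acceleration plus PD correction of the error.
        u = a_ref - kp * (x - x_ref) - kd * (v - v_ref)
        # Explicit Euler step of the double integrator.
        x += v * dt
        v += u * dt
    t = steps * dt
    return x - math.sin(t), v - math.cos(t)
```

With these gains the error obeys $\ddot{e} + k_d \dot{e} + k_p e = 0$, which is critically damped with both poles at $-2$, so the initial offset decays and the remaining error is dominated by the Euler discretization.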


Learning Density Distribution of Reachable States for Autonomous Systems

arXiv.org Artificial Intelligence

State density distribution, in contrast to worst-case reachability, can be leveraged for safety-related problems to better quantify the likelihood of risk in potentially hazardous situations. In this work, we propose a data-driven method to compute the density distribution of reachable states for nonlinear and even black-box systems. Our semi-supervised approach learns system dynamics and the state density jointly from trajectory data, guided by the fact that the state density evolution follows the Liouville partial differential equation. With the help of neural network reachability tools, our approach can estimate the set of all possible future states as well as their density. Moreover, we can perform online safety verification with probability ranges for the occurrence of unsafe behaviors. We use an extensive set of experiments to show that our learned solution produces a much more accurate estimate of the density distribution, and quantifies risks less conservatively and more flexibly than worst-case analysis.
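
For reference, the Liouville equation mentioned above takes the following standard form for dynamics $\dot{x} = f(x, t)$ with state density $\rho(x, t)$. The symbols $f$ and $\rho$ are assumed notation and may differ from the paper's.

$$
\frac{\partial \rho}{\partial t} + \nabla_x \cdot \big( \rho\, f \big) = 0,
\qquad
\frac{d}{dt}\, \rho\big(x(t), t\big) = -\rho\big(x(t), t\big)\, \nabla_x \cdot f\big(x(t), t\big).
$$

The second identity is the same equation evaluated along a trajectory: the density carried by a state changes at a rate set by the divergence of the vector field, which is what allows density to be propagated together with sampled trajectories.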


Nonmodular architectures of cognitive systems based on active inference

arXiv.org Artificial Intelligence

In psychology and neuroscience it is common to describe cognitive systems as input/output devices where perceptual and motor functions are implemented in a purely feedforward, open-loop fashion. On this view, perception and action are often seen as encapsulated modules with limited interaction between them. While embodied and enactive approaches to cognitive science have challenged the idealisation of the brain as an input/output device, we argue that even the more recent attempts to model systems using closed-loop architectures still heavily rely on a strong separation between motor and perceptual functions. Previously, we have suggested that the mainstream notion of modularity strongly resonates with the separation principle of control theory. In this work we present a minimal model of a sensorimotor loop implementing an architecture based on the separation principle. We link this to popular formulations of perception and action in the cognitive sciences, and show its limitations when, for instance, external forces are not modelled by an agent. These forces can be seen as variables that an agent cannot directly control, i.e., a perturbation from the environment or an interference caused by other agents. As an alternative approach inspired by embodied cognitive science, we then propose a nonmodular architecture based on the active inference framework. We demonstrate the robustness of this architecture to unknown external inputs and show that the mechanism with which this is achieved in linear models is equivalent to integral control.


Traveling Salesperson Problems for a double integrator

arXiv.org Artificial Intelligence

In this paper we propose some novel path planning strategies for a double integrator with bounded velocity and bounded control inputs. First, we study the following version of the Traveling Salesperson Problem (TSP): given a set of points in $\mathbb{R}^d$, find the fastest tour over the point set for a double integrator. We first give asymptotic bounds on the time taken to complete such a tour in the worst case. Then, we study a stochastic version of the TSP for the double integrator where the points are randomly sampled from a uniform distribution in a compact environment in $\mathbb{R}^2$ and $\mathbb{R}^3$. We propose novel algorithms that perform within a constant factor of the optimal strategy with high probability. Lastly, we study a dynamic TSP: given a stochastic process that generates targets, is there a policy which guarantees that the number of unvisited targets does not diverge over time? If such stable policies exist, what is the minimum wait for a target? We propose novel stabilizing receding-horizon algorithms whose performances are within a constant factor from the optimum with high probability, in $\mathbb{R}^2$ as well as $\mathbb{R}^3$. We also argue that these algorithms give identical performances for a particular nonholonomic vehicle, the Dubins vehicle.