 Advanced Geothermal System (AGS)


Reinforcement Learning for Mixed Open-loop and Closed-loop Control

Neural Information Processing Systems

Closed-loop control relies on sensory feedback that is usually assumed to be free. But if sensing incurs a cost, it may be cost-effective to take sequences of actions in open-loop mode. We describe a reinforcement learning algorithm that learns to combine open-loop and closed-loop control when sensing incurs a cost. Although we assume reliable sensors, use of open-loop control means that actions must sometimes be taken when the current state of the controlled system is uncertain. This is a special case of the hidden-state problem in reinforcement learning, and to cope, our algorithm relies on short-term memory.


Closed-Loop Koopman Operator Approximation

arXiv.org Artificial Intelligence

The Koopman operator allows a nonlinear system to be rewritten as an infinite-dimensional linear system by viewing it in terms of an infinite set of lifting functions instead of a state vector. The main feature of this representation is its linearity, making it compatible with existing linear systems theory. A finite-dimensional approximation of the Koopman operator can be identified from experimental data by choosing a finite subset of lifting functions, applying it to the data, and solving a least squares problem in the lifted space. Existing Koopman operator approximation methods are designed to identify open-loop systems. However, it is impractical or impossible to run experiments on some systems without a feedback controller. Unfortunately, the introduction of feedback control results in correlations between the system's input and output, making some plant dynamics difficult to identify if the controller is neglected. This paper addresses this limitation by introducing a method to identify a Koopman model of the closed-loop system, and then extract a Koopman model of the plant given knowledge of the controller. This is accomplished by leveraging the linearity of the Koopman representation of the system. The proposed approach widens the applicability of Koopman operator identification methods to a broader class of systems. The effectiveness of the proposed closed-loop Koopman operator approximation method is demonstrated experimentally using a Harmonic Drive gearbox exhibiting nonlinear vibrations.
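The least-squares step mentioned in this abstract can be illustrated with a minimal sketch. This is the standard open-loop identification on a toy system, not the closed-loop method the paper proposes. The lifting functions in `lift`, the toy dynamics in `step`, and all parameter values are assumptions chosen so that the lifted dynamics are exactly linear.

```python
import numpy as np

def lift(X):
    """Hypothetical lifting functions: [x1, x2, x1^2] for each row of X."""
    X = np.atleast_2d(X)
    return np.column_stack([X[:, 0], X[:, 1], X[:, 0] ** 2])

def step(X, a=0.9, b=0.5, c=0.3):
    """Toy nonlinear system (assumed): x1+ = a*x1, x2+ = b*x2 + c*x1^2."""
    X = np.atleast_2d(X)
    return np.column_stack([a * X[:, 0], b * X[:, 1] + c * X[:, 0] ** 2])

def fit_koopman(X, X_next):
    """Least-squares fit of K such that lift(X) @ K ≈ lift(X_next)."""
    K, *_ = np.linalg.lstsq(lift(X), lift(X_next), rcond=None)
    return K

# Sample states, advance them one step, and fit K in the lifted space.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
K = fit_koopman(X, step(X))
```

With snapshots stored as rows, `K` acts on lifted states from the right. For this toy system the chosen lifting functions span an invariant subspace, so the fit is exact up to numerical error; for a general system the result is only an approximation.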


Data-Driven Predictive Control Towards Multi-Agent Motion Planning With Non-Parametric Closed-Loop Behavior Learning

arXiv.org Artificial Intelligence

Accurate and effective system identification is a common challenge in model predictive control (MPC) formulations. When such accuracy is lacking, the performance of a traditional MPC algorithm can degrade significantly. This paper investigates a non-parametric closed-loop behavior learning method for multi-agent motion planning, which underpins a data-driven predictive control framework. Using closed-loop input/output measurements of the unknown system, the behavior of the system is learned from the collected dataset, and the resulting non-parametric predictive model is used to determine the optimal control actions. This non-parametric predictive control framework alleviates the heavy computational burden of the optimization procedures in alternative methodologies, which require open-loop input/output data collection and parametric system identification. The proposed data-driven approach is also shown to preserve good robustness properties. Finally, a multi-UAV system is used to demonstrate the effectiveness of the approach.


Offline Learning of Closed-Loop Deep Brain Stimulation Controllers for Parkinson Disease Treatment

arXiv.org Artificial Intelligence

Deep brain stimulation (DBS) has shown great promise toward treating motor symptoms caused by Parkinson's disease (PD), by delivering electrical pulses to the Basal Ganglia (BG) region of the brain. However, DBS devices approved by the U.S. Food and Drug Administration (FDA) can only deliver continuous DBS (cDBS) stimuli at a fixed amplitude; this energy inefficient operation reduces battery lifetime of the device, cannot adapt treatment dynamically for activity, and may cause significant side-effects (e.g., gait impairment). In this work, we introduce an offline reinforcement learning (RL) framework, allowing the use of past clinical data to train an RL policy to adjust the stimulation amplitude in real time, with the goal of reducing energy use while maintaining the same level of treatment (i.e., control) efficacy as cDBS. Moreover, clinical protocols require the safety and performance of such RL controllers to be demonstrated ahead of deployments in patients. Thus, we also introduce an offline policy evaluation (OPE) method to estimate the performance of RL policies using historical data, before deploying them on patients. We evaluated our framework on four PD patients equipped with the RC+S DBS system, employing the RL controllers during monthly clinical visits, with the overall control efficacy evaluated by severity of symptoms (i.e., bradykinesia and tremor), changes in PD biomarkers (i.e., local field potentials), and patient ratings. The results from clinical experiments show that our RL-based controller maintains the same level of control efficacy as cDBS, but with significantly reduced stimulation energy. Further, the OPE method is shown effective in accurately estimating and ranking the expected returns of RL controllers.


Soft Fluidic Closed-Loop Controller for Untethered Underwater Gliders

arXiv.org Artificial Intelligence

Soft underwater robots typically explore bioinspired designs at the expense of power efficiency when compared to traditional underwater robots, which limits their practical use in real-world applications. A soft hydrostatic pressure sensor is configured as a bang-bang controller actuating a swim bladder made from silicone balloons. Due to its simple design, low cost, and ease of fabrication using FDM printing and soft lithography, it serves as a starting point for the exploration of non-electronic underwater soft robots.

A. Traditional Underwater Gliders

Over the last several decades, underwater gliders have gained popularity among autonomous underwater vehicles (AUVs) [1], [2]. Compared to other AUVs, underwater gliders can achieve greater traveling distances, lower power consumption, and improved cost effectiveness.
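The bang-bang control of the swim bladder can be pictured with a small logic sketch. The device in the paper is fluidic and non-electronic, so this code only illustrates the switching rule. The two-threshold (hysteresis) form, the function name `bang_bang`, the pressure values, and the choice to inflate at depth are all assumptions.

```python
def bang_bang(pressure, low, high, inflated):
    """Return the next bladder state (True = inflated) for a pressure reading.

    Assumed rule: inflate once pressure reaches `high` (too deep, so ascend),
    deflate once it drops to `low` (too shallow, so descend), and otherwise
    keep the current state.
    """
    if pressure >= high:
        return True
    if pressure <= low:
        return False
    return inflated

# Hypothetical pressure trace in bar as the glider sinks and then rises.
state = False
history = []
for p in [1.0, 1.5, 1.9, 1.5, 1.1]:
    state = bang_bang(p, low=1.2, high=1.8, inflated=state)
    history.append(state)
```

The output only ever takes one of two values, which is what distinguishes bang-bang control from proportional control. Holding the state between the two thresholds avoids rapid switching near a single setpoint.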


FingerSLAM: Closed-loop Unknown Object Localization and Reconstruction from Visuo-tactile Feedback

arXiv.org Artificial Intelligence

In this paper, we address the problem of using visuo-tactile feedback for 6-DoF localization and 3D reconstruction of unknown in-hand objects. We propose FingerSLAM, a closed-loop factor graph-based pose estimator that combines local tactile sensing at finger-tip and global vision sensing from a wrist-mount camera. FingerSLAM is constructed with two constituent pose estimators: a multi-pass refined tactile-based pose estimator that captures movements from detailed local textures, and a single-pass vision-based pose estimator that predicts from a global view of the object. We also design a loop closure mechanism that actively matches current vision and tactile images to previously stored key-frames to reduce accumulated error. FingerSLAM incorporates the two sensing modalities of tactile and vision, as well as the loop closure mechanism with a factor graph-based optimization framework. Such a framework produces an optimized pose estimation solution that is more accurate than the standalone estimators. The estimated poses are then used to reconstruct the shape of the unknown object incrementally by stitching the local point clouds recovered from tactile images. We train our system on real-world data collected with 20 objects. We demonstrate reliable visuo-tactile pose estimation and shape reconstruction through quantitative and qualitative real-world evaluations on 6 objects that are unseen during training.


Discovering Closed-Loop Failures of Vision-Based Controllers via Reachability Analysis

arXiv.org Artificial Intelligence

Machine learning driven image-based controllers allow robotic systems to take intelligent actions based on the visual feedback from their environment. Understanding when these controllers might lead to system safety violations is important for their integration in safety-critical applications and engineering corrective safety measures for the system. Existing methods leverage simulation-based testing (or falsification) to find the failures of vision-based controllers, i.e., the visual inputs that lead to closed-loop safety violations. However, these techniques do not scale well to the scenarios involving high-dimensional and complex visual inputs, such as RGB images. In this work, we cast the problem of finding closed-loop vision failures as a Hamilton-Jacobi (HJ) reachability problem. Our approach blends simulation-based analysis with HJ reachability methods to compute an approximation of the backward reachable tube (BRT) of the system, i.e., the set of unsafe states for the system under vision-based controllers. Utilizing the BRT, we can tractably and systematically find the system states and corresponding visual inputs that lead to closed-loop failures. These visual inputs can be subsequently analyzed to find the input characteristics that might have caused the failure. Besides its scalability to high-dimensional visual inputs, an explicit computation of BRT allows the proposed approach to capture non-trivial system failures that are difficult to expose via random simulations. We demonstrate our framework on two case studies involving an RGB image-based neural network controller for (a) autonomous indoor navigation, and (b) autonomous aircraft taxiing.


Domain Randomization for Robust, Affordable and Effective Closed-loop Control of Soft Robots

arXiv.org Artificial Intelligence

Figure 1: From top to bottom: a) naïve RL with training directly on the real world; b) RL where the policy is trained in simulation and tested on the real world; c) Sim-to-Real transfer with domain randomization increases robustness to modelling errors and enables environmental constraints exploitation; d) posterior distributions over simulator parameters may be automatically inferred from real-world data for use with DR.

Soft robotics is a rapidly developing field that has the potential to revolutionize how robots interact with their environment [1]. Unlike their rigid counterparts, soft robots are made from materials that can deform and adapt to their surroundings, enabling them to perform novel and unprecedented tasks in fields such as healthcare [2] and exploration [3]. However, controlling the complex dynamics of continuous soft robots is a challenging task, as an accurate modelling requires infinite degrees of freedom (DoF) [4] and nonlinear dynamics parameters that are difficult to accurately [...]

Many attempts have been made to control soft devices through model-based techniques, also pushed by the advancement of modelling techniques [6].


Closed-loop Error Correction Learning Accelerates Experimental Discovery of Thermoelectric Materials

arXiv.org Artificial Intelligence

The exploration of thermoelectric materials is challenging considering the large materials space, combined with added exponential degrees of freedom coming from doping and the diversity of synthetic pathways. Here we seek to incorporate historical data and update and refine it using experimental feedback by employing error-correction learning (ECL). We thus learn from prior datasets and then adapt the model to differences in synthesis and characterization that are otherwise difficult to parameterize. We then apply this strategy to discovering thermoelectric materials where we prioritize synthesis at temperatures < 300 °C. We document a previously unreported chemical family of thermoelectric materials, PbSe:SnSb, finding that the best candidate in this chemical family, 2 wt% SnSb doped PbSe, exhibits a power factor more than 2x that of PbSe. Our investigations show that our closed-loop experimentation strategy reduces the required number of experiments to find an optimized material by as much as 3x compared to high-throughput searches powered by state-of-the-art machine learning models. We also observe that this improvement is dependent on the accuracy of prior in a manner that exhibits diminishing returns, and after a certain accuracy is reached, it is factors associated with experimental pathways that dictate the trends.


Closed-loop Analysis of Vision-based Autonomous Systems: A Case Study

arXiv.org Artificial Intelligence

Deep neural networks (DNNs) are increasingly used in safety-critical autonomous systems as perception components processing high-dimensional image data. Formal analysis of these systems is particularly challenging due to the complexity of the perception DNNs, the sensors (cameras), and the environment conditions. We present a case study applying formal probabilistic analysis techniques to an experimental autonomous system that guides airplanes on taxiways using a perception DNN. We address the above challenges by replacing the camera and the network with a compact probabilistic abstraction built from the confusion matrices computed for the DNN on a representative image data set. We also show how to leverage local, DNN-specific analyses as run-time guards to increase the safety of the overall system. Our findings are applicable to other autonomous systems that use complex DNNs for perception.
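One simple way to picture a probabilistic abstraction built from a confusion matrix is to row-normalize the counts into conditional probabilities. This is a minimal sketch under stated assumptions, not the paper's construction. It assumes rows index the true class and columns the predicted class; the function name `perception_abstraction` and the counts are hypothetical.

```python
import numpy as np

def perception_abstraction(confusion):
    """Turn confusion-matrix counts into P(predicted class | true class).

    Assumes rows are true classes, columns are predicted classes, and every
    true class appears at least once in the evaluation set.
    """
    confusion = np.asarray(confusion, dtype=float)
    return confusion / confusion.sum(axis=1, keepdims=True)

# Hypothetical counts for three discretized perception outputs.
counts = [[90, 10, 0],
          [5, 90, 5],
          [0, 20, 80]]
P = perception_abstraction(counts)
```

Each row of `P` is a distribution over what the perception component reports given the true situation, so a closed-loop model could sample from it in place of running the camera and the network.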