Russell, Rebecca
Deep Modeling of Non-Gaussian Aleatoric Uncertainty
Acharya, Aastha, Lee, Caleb, D'Alonzo, Marissa, Shamwell, Jared, Ahmed, Nisar R., Russell, Rebecca
Deep learning offers promising new ways to accurately model aleatoric uncertainty in robotic estimation systems, particularly when the uncertainty distributions do not conform to traditional assumptions of being fixed and Gaussian. In this study, we formulate and evaluate three fundamental deep learning approaches for conditional probability density modeling to quantify non-Gaussian aleatoric uncertainty: parametric, discretized, and generative modeling. We systematically compare the respective strengths and weaknesses of these three methods on simulated non-Gaussian densities as well as on real-world terrain-relative navigation data. Our results show that these deep learning methods can accurately capture complex uncertainty patterns, highlighting their potential for improving the reliability and robustness of estimation systems.
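As a rough illustration of the parametric approach named in this abstract, the sketch below computes the negative log-likelihood of targets under a per-sample Gaussian mixture, a common training loss when a network outputs mixture parameters. The mixture form, the function name `gaussian_mixture_nll`, and the array shapes are assumptions made for this example; the abstract does not state which parametric family the authors use.

```python
import numpy as np

def gaussian_mixture_nll(y, weights, means, stds):
    """Mean negative log-likelihood of scalar targets under Gaussian mixtures.

    y has shape (N,); weights, means, and stds have shape (N, K), i.e. one
    K-component mixture per sample (hypothetical layout for this sketch).
    Weights are assumed strictly positive and to sum to 1 along axis 1.
    """
    y = np.asarray(y, dtype=float)[:, None]
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)

    # Log of (weight * Gaussian density) for every component.
    log_comp = (
        np.log(weights)
        - np.log(stds)
        - 0.5 * np.log(2.0 * np.pi)
        - 0.5 * ((y - means) / stds) ** 2
    )

    # Numerically stable log-sum-exp over the component axis.
    peak = log_comp.max(axis=1, keepdims=True)
    log_prob = peak[:, 0] + np.log(np.exp(log_comp - peak).sum(axis=1))
    return -log_prob.mean()
```

With a single unit-variance component centered on the target, the loss reduces to the familiar constant 0.5 * log(2 * pi), which is a convenient sanity check.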
Surrogate Neural Networks for Efficient Simulation-based Trajectory Planning Optimization
Ruff, Evelyn, Russell, Rebecca, Stoeckle, Matthew, Miotto, Piero, How, Jonathan P.
This paper presents a novel methodology that uses surrogate models in the form of neural networks to reduce the computation time of simulation-based optimization of a reference trajectory. Simulation-based optimization is necessary when no analytical form of the system is accessible, only input-output data that can be used to create a surrogate model of the simulation. Like many high-fidelity simulations, this trajectory planning simulation is highly nonlinear and computationally expensive, making it challenging to optimize iteratively. Through gradient descent optimization, our approach finds the optimal reference trajectory for landing a hypersonic vehicle. In contrast to the large datasets used to create the surrogate models in prior literature, our methodology is specifically designed to minimize the number of simulation executions required by the gradient descent optimizer. We demonstrate that this methodology is more efficient than the standard practice of hand-tuning the inputs through trial and error or randomly sampling the input parameter space. Because the input values to the simulation are intelligently selected, our approach reaches better simulation outcomes more rapidly and with higher accuracy. Optimizing the hypersonic vehicle's reference trajectory is very challenging due to the simulation's extreme nonlinearity, but even so, this approach found a 74% better-performing reference trajectory compared to nominal, and the numerical results show a substantial reduction in computation time for designing future trajectories.
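The general idea of optimizing a cheap surrogate instead of the simulation itself can be sketched in a few lines. Everything here is an assumption made for illustration: `expensive_simulation` is a hypothetical stand-in for a costly simulator, and a quadratic polynomial fit is swapped in for the neural network surrogate described in the abstract. It is not the authors' method, only a toy instance of the pattern.

```python
import numpy as np

def expensive_simulation(x):
    # Hypothetical stand-in for a costly simulation run (not from the paper).
    return (x - 2.0) ** 2 + 1.0

# Run the "simulation" at only a handful of input values.
x_samples = np.linspace(-1.0, 5.0, 5)
y_samples = expensive_simulation(x_samples)

# Fit a cheap surrogate to the input-output data. A quadratic polynomial is
# used here in place of a neural network to keep the sketch dependency-free.
surrogate = np.poly1d(np.polyfit(x_samples, y_samples, deg=2))
surrogate_grad = surrogate.deriv()

# Gradient descent on the surrogate, so no further simulation runs are needed.
x_opt = 0.0
for _ in range(200):
    x_opt -= 0.1 * surrogate_grad(x_opt)
```

In a realistic setting the surrogate would be refit as new simulation runs are added near the current optimum, which is where the number of simulation executions becomes the quantity to minimize.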
Learning to Forecast Aleatoric and Epistemic Uncertainties over Long Horizon Trajectories
Acharya, Aastha, Russell, Rebecca, Ahmed, Nisar R.
Giving autonomous agents the ability to forecast their own outcomes and uncertainty will allow them to communicate their competencies and be used more safely. We accomplish this by using a learned world model of the agent system to forecast full agent trajectories over long time horizons. Real world systems involve significant sources of both aleatoric and epistemic uncertainty that compound and interact over time in the trajectory forecasts. We develop a deep generative world model that quantifies aleatoric uncertainty while incorporating the effects of epistemic uncertainty during the learning process. We show on two reinforcement learning problems that our uncertainty model produces calibrated outcome uncertainty estimates over the full trajectory horizon.
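One simple way to check whether forecast uncertainty is calibrated is to measure how often the true outcome falls inside a central predictive interval. The sketch below assumes the forecast is available as sampled outcomes per trajectory; the function name `interval_coverage` and the array layout are hypothetical, and the abstract does not say which calibration metric the authors report.

```python
import numpy as np

def interval_coverage(samples, outcomes, level=0.9):
    """Fraction of true outcomes inside the central `level` predictive interval.

    samples has shape (N, S): S forecast samples for each of N trajectories.
    outcomes has shape (N,): the observed outcome for each trajectory.
    """
    samples = np.asarray(samples, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)

    # Central interval bounds taken from the empirical sample quantiles.
    tail = (1.0 - level) / 2.0
    lower = np.quantile(samples, tail, axis=1)
    upper = np.quantile(samples, 1.0 - tail, axis=1)

    inside = (outcomes >= lower) & (outcomes <= upper)
    return inside.mean()
```

For a well-calibrated forecast, the returned coverage should be close to `level` when evaluated over many trajectories.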
Symmetry Detection in Trajectory Data for More Meaningful Reinforcement Learning Representations
D'Alonzo, Marissa, Russell, Rebecca
Knowledge of the symmetries of reinforcement learning (RL) systems can be used to create compressed and semantically meaningful representations of a low-level state space. We present a method of automatically detecting RL symmetries directly from raw trajectory data without requiring active control of the system. Our method generates candidate symmetries and trains a recurrent neural network (RNN) to discriminate between the original trajectories and the transformed trajectories for each candidate symmetry. The RNN discriminator's accuracy for each candidate reveals how symmetric the system is under that transformation. This information can be used to create high-level representations that are invariant to all symmetries on a dataset level and to communicate properties of the RL behavior to users. We show in experiments on two simulated RL use cases (a pusher robot and a UAV flying in wind) that our method can determine the symmetries underlying both the environment physics and the trained RL policy.
Learning and Understanding a Disentangled Feature Representation for Hidden Parameters in Reinforcement Learning
Reale, Christopher, Russell, Rebecca
Hidden parameters are latent variables in reinforcement learning (RL) environments that are constant over the course of a trajectory. Understanding what, if any, hidden parameters affect a particular environment can aid both the development and appropriate usage of RL systems. We present an unsupervised method to map RL trajectories into a feature space where distance represents the relative difference in system behavior due to hidden parameters. Our approach disentangles the effects of hidden parameters by leveraging a recurrent neural network (RNN) world model as used in model-based RL. First, we alter the standard world model training algorithm to isolate the hidden parameter information in the world model memory. Then, we use a metric learning approach to map the RNN memory into a space with a distance metric approximating a bisimulation metric with respect to the hidden parameters. The resulting disentangled feature space can be used to meaningfully relate trajectories to each other and analyze the hidden parameter. We demonstrate our approach on four hidden parameters across three RL environments. Finally, we present two methods to help identify and understand the effects of hidden parameters on systems.
Explaining Conditions for Reinforcement Learning Behaviors from Real and Imagined Data
Acharya, Aastha, Russell, Rebecca, Ahmed, Nisar R.
The deployment of reinforcement learning (RL) in the real world comes with challenges in calibrating user trust and expectations. As a step toward developing RL systems that are able to communicate their competencies, we present a method of generating human-interpretable abstract behavior models that identify the experiential conditions leading to different task execution strategies and outcomes. Our approach consists of extracting experiential features from state representations, abstracting strategy descriptors from trajectories, and training an interpretable decision tree that identifies the conditions most predictive of different RL behaviors. We demonstrate our method on trajectory data generated from interactions with the environment and on imagined trajectory data that comes from a trained probabilistic world model in a model-based RL setting.
Learning to Repair Software Vulnerabilities with Generative Adversarial Networks
Harer, Jacob, Ozdemir, Onur, Lazovich, Tomo, Reale, Christopher, Russell, Rebecca, Kim, Louis, Chin, Peter
Motivated by the problem of automated repair of software vulnerabilities, we propose an adversarial learning approach that maps from one discrete source domain to another target domain without requiring paired labeled examples or source and target domains to be bijections. We demonstrate that the proposed adversarial learning approach is an effective technique for repairing software vulnerabilities, performing close to seq2seq approaches that require labeled pairs. The proposed Generative Adversarial Network approach is application-agnostic in that it can be applied to other problems similar to code repair, such as grammar correction or sentiment translation.