Ingham, Michel D.
Risk-Averse Decision Making Under Uncertainty
Ahmadi, Mohamadreza, Rosolia, Ugo, Ingham, Michel D., Murray, Richard M., Ames, Aaron D.
A large class of decision-making problems under uncertainty can be described via Markov decision processes (MDPs) or partially observable MDPs (POMDPs), with applications in artificial intelligence and operations research, among others. Traditionally, policy synthesis techniques are designed so that a total expected cost or reward is minimized or maximized. However, optimality in the total expected sense is only reasonable if system behavior over a large number of runs is of interest, which has limited the use of such policies in practical mission-critical scenarios, wherein large deviations from the expected behavior may lead to mission failure. In this paper, we consider the problem of designing policies for MDPs and POMDPs with objectives and constraints in terms of dynamic coherent risk measures, which we refer to as the constrained risk-averse problem. For MDPs, we reformulate the problem into an inf-sup problem via the Lagrangian framework and propose an optimization-based method to synthesize Markovian policies; we demonstrate that the formulated optimization problems are in the form of difference convex programs (DCPs) and can be solved by the disciplined convex-concave programming (DCCP) framework. We show that these results generalize linear programs for constrained MDPs with total discounted expected costs and constraints. For POMDPs, we show that, if the coherent risk measures can be defined as a Markov risk transition mapping, an infinite-dimensional optimization can be used to design Markovian belief-based policies. For stochastic finite-state controllers (FSCs), we show that the latter optimization simplifies to a (finite-dimensional) DCP and can be solved by the DCCP framework. We incorporate these DCPs in a policy iteration algorithm to design risk-averse FSCs for POMDPs.
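The DCCP framework referenced in this abstract is available as the open-source dccp extension of CVXPY. As a minimal, hedged sketch of the solver interface (a toy difference-of-convex problem, not the paper's risk-averse MDP formulation), the pattern is to pose a non-convex objective built from convex pieces and invoke the convex-concave procedure:

```python
# Minimal sketch of solving a difference convex program with the DCCP
# extension of CVXPY (pip install cvxpy dccp). Toy problem only; it
# illustrates the solver interface, not the paper's MDP formulation.
import cvxpy as cvx
import dccp  # registers the "dccp" solve method with CVXPY

x = cvx.Variable(2)
y = cvx.Variable(2)

# Maximizing a convex function (a norm) over a box is non-convex, but
# it is a difference of convex functions, so DCCP can search for a
# local solution via the convex-concave procedure.
prob = cvx.Problem(
    cvx.Maximize(cvx.norm(x - y, 2)),
    [0 <= x, x <= 1, 0 <= y, y <= 1],
)
prob.solve(method="dccp")
print("objective:", prob.value)
print("x =", x.value, " y =", y.value)
```

DCCP only guarantees a locally optimal point of the non-convex problem, which is consistent with the lower-bounding role these programs play in the constrained risk-averse formulation.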
Constrained Risk-Averse Markov Decision Processes
Ahmadi, Mohamadreza, Rosolia, Ugo, Ingham, Michel D., Murray, Richard M., Ames, Aaron D.
We consider the problem of designing policies for Markov decision processes (MDPs) with dynamic coherent risk objectives and constraints. We begin by formulating the problem in a Lagrangian framework. Under the assumption that the risk objectives and constraints can be represented by a Markov risk transition mapping, we propose an optimization-based method to synthesize Markovian policies that lower-bound the constrained risk-averse problem. We demonstrate that the formulated optimization problems are in the form of difference convex programs (DCPs) and can be solved by the disciplined convex-concave programming (DCCP) framework. We show that these results generalize linear programs for constrained MDPs with total discounted expected costs and constraints. Finally, we illustrate the effectiveness of the proposed method with numerical experiments on a rover navigation problem involving conditional-value-at-risk (CVaR) and entropic-value-at-risk (EVaR) coherent risk measures.
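For readers unfamiliar with the two coherent risk measures named above, the hedged sketch below estimates CVaR from cost samples as the mean of the worst (1 - alpha) tail, and EVaR under one common convention based on the moment-generating function; the confidence level and sample data are illustrative assumptions, not values from the paper.

```python
# Hedged sketch: sample-based estimates of two coherent risk measures,
# CVaR and EVaR, for a cost random variable (higher = worse). Literature
# conventions vary; this uses the common one where CVaR_alpha is the
# expected cost in the worst (1 - alpha) tail.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
costs = rng.gamma(shape=2.0, scale=1.0, size=100_000)  # illustrative costs
alpha = 0.95  # confidence level (assumption for this example)

# CVaR: average of the costs at or above the alpha-quantile (VaR).
var = np.quantile(costs, alpha)
cvar = costs[costs >= var].mean()

# EVaR (one common convention): inf over z > 0 of
#   (1/z) * log( E[exp(z * X)] / (1 - alpha) ),
# computed here by a bounded scalar search.
def evar_bound(z):
    return (np.log(np.mean(np.exp(z * costs))) - np.log(1 - alpha)) / z

res = minimize_scalar(evar_bound, bounds=(1e-6, 2.0), method="bounded")
evar = res.fun

print(f"VaR_{alpha}  = {var:.3f}")
print(f"CVaR_{alpha} = {cvar:.3f}")
print(f"EVaR_{alpha} = {evar:.3f}  (EVaR upper-bounds CVaR)")
```

EVaR is the tightest upper bound on CVaR obtainable from the Chernoff inequality, which is why the printed EVaR estimate should exceed the CVaR estimate.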
Partially Observable Games for Secure Autonomy
Ahmadi, Mohamadreza, Viswanathan, Arun A., Ingham, Michel D., Tan, Kymie, Ames, Aaron D.
Technology development efforts in autonomy and cyber-defense have been evolving independently of each other, over the past decade. In this paper, we report our ongoing effort to integrate these two presently distinct areas into a single framework. To this end, we propose the two-player partially observable stochastic game formalism to capture both high-level autonomous mission planning under uncertainty and adversarial decision making subject to imperfect information. We show that synthesizing sub-optimal strategies for such games is possible under finite-memory assumptions for both the autonomous decision maker and the cyber-adversary. We then describe an experimental testbed to evaluate the efficacy of the proposed framework.
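As a hedged illustration of the formalism (field names below are generic, not the paper's notation), a two-player partially observable stochastic game is a tuple of states, per-player actions and observations, a joint transition kernel, per-player observation kernels, and a reward, which can be captured in a small container type:

```python
# Minimal sketch of a two-player partially observable stochastic game
# (POSG) as a plain data container. Names are generic illustrations,
# not the paper's notation; distributions are nested dicts of floats.
from dataclasses import dataclass
from typing import Dict, Tuple

State, Action, Obs = str, str, str

@dataclass
class TwoPlayerPOSG:
    states: frozenset
    actions_p1: frozenset            # autonomous decision maker's actions
    actions_p2: frozenset            # cyber-adversary's actions
    observations_p1: frozenset
    observations_p2: frozenset
    # transition[(s, a1, a2)][s2] = probability of moving to state s2
    transition: Dict[Tuple[State, Action, Action], Dict[State, float]]
    # obs_p1[(s2, a1)][o] = probability player 1 observes o (likewise p2)
    obs_p1: Dict[Tuple[State, Action], Dict[Obs, float]]
    obs_p2: Dict[Tuple[State, Action], Dict[Obs, float]]
    # reward to player 1; in a zero-sum game player 2 receives its negation
    reward: Dict[Tuple[State, Action, Action], float]
```

The finite-memory assumption in the abstract corresponds to restricting each player's strategy to a finite-state controller over this tuple, rather than an arbitrary function of the full observation history.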
Risk-Averse Planning Under Uncertainty
Ahmadi, Mohamadreza, Ono, Masahiro, Ingham, Michel D., Murray, Richard M., Ames, Aaron D.
We consider the problem of designing policies for partially observable Markov decision processes (POMDPs) with dynamic coherent risk objectives. Synthesizing risk-averse optimal policies for POMDPs requires infinite memory and is thus undecidable. To overcome this difficulty, we propose a method based on bounded policy iteration for designing stochastic but finite-state (memory) controllers, which takes advantage of standard convex optimization methods. Given a memory budget and optimality criterion, the proposed method modifies the stochastic finite-state controller, yielding sub-optimal solutions with lower coherent risk. With the rise of autonomous systems being deployed in real-world settings, the associated risk stemming from unknown and unforeseen circumstances is correspondingly on the rise. In particular, in safety-critical scenarios such as aerospace applications, decision making should account for risk. For example, spacecraft control technology relies heavily on a relatively large and highly skilled mission operations team that generates detailed time-ordered and event-driven sequences of commands. This approach will not be viable in the future, given the increasing number of missions and the desire to limit operations team and Deep Space Network (DSN) costs.
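A hedged sketch of the finite-state-controller machinery this method builds on: for a fixed stochastic FSC, the value of each (controller node, POMDP state) pair satisfies a linear system, which is the evaluation step that a bounded policy iteration then improves within the memory budget. The model arrays, sizes, and discounting below are illustrative assumptions, and this expected-cost evaluation deliberately omits the coherent-risk treatment that is the paper's contribution.

```python
# Hedged sketch: policy evaluation for a stochastic finite-state
# controller (FSC) on a POMDP under *expected* discounted cost. The
# paper targets dynamic coherent risk; this simpler linear evaluation
# only illustrates the underlying FSC machinery.
import numpy as np

rng = np.random.default_rng(1)
nS, nA, nO, nN = 4, 2, 3, 2      # states, actions, observations, FSC nodes
gamma = 0.95                      # discount factor (assumption)

def random_stochastic(*shape):
    m = rng.random(shape)
    return m / m.sum(axis=-1, keepdims=True)

T = random_stochastic(nS, nA, nS)        # T[s, a, s'] transition kernel
O = random_stochastic(nA, nS, nO)        # O[a, s', o] observation kernel
C = rng.random((nS, nA))                 # cost C[s, a]
psi = random_stochastic(nN, nA)          # FSC action choice psi[n, a]
eta = random_stochastic(nN, nA, nO, nN)  # FSC node update eta[n, a, o, n']

# The value V[n, s] solves the linear system
#   V(n,s) = sum_a psi(a|n) [ C(s,a) + gamma * sum_{s'} T(s'|s,a)
#            * sum_o O(o|a,s') * sum_{n'} eta(n'|n,a,o) V(n',s') ].
# Build the Markov chain over (node, state) pairs induced by the FSC,
# then solve (I - gamma * P) V = c.
P = np.einsum("na,sap,apo,naom->nsmp", psi, T, O, eta).reshape(nN * nS, nN * nS)
c = np.einsum("na,sa->ns", psi, C).reshape(nN * nS)
V = np.linalg.solve(np.eye(nN * nS) - gamma * P, c)
print("V[node, state] =\n", V.reshape(nN, nS))
```

Bounded policy iteration alternates this kind of evaluation with local modifications of psi and eta (and, when needed, the addition of a node up to the memory budget), accepting changes that lower the objective.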
Model-Based Programming of Fault-Aware Systems
Williams, Brian C., Ingham, Michel D., Chung, Seung, Elliott, Paul, Hofbaur, Michael, Sullivan, Gregory T.
A wide range of sensor-rich, networked embedded systems are being created that must operate robustly for years in the face of novel failures by managing complex autonomic processes. These systems are being composed, for example, into vast networks of space, air, ground, and underwater vehicles. Our objective is to revolutionize the way in which we control these new artifacts by creating reactive model-based programming languages that enable everyday systems to reason intelligently and enable machines to explore other worlds. A model-based program is state and fault aware; it elevates the programming task to specifying intended state evolutions of a system. The program's executive automatically coordinates system interactions to achieve these states, entertaining known and potential failures, using models of its constituents and environment. At the executive's core is a method, called CONFLICT-DIRECTED A*, which quickly prunes promising but infeasible solutions, using a form of one-shot learning. This approach has been demonstrated on a range of systems, including the National Aeronautics and Space Administration's Deep Space One probe. Model-based programming is being generalized to hybrid discrete-continuous systems and the coordination of networks of robotic vehicles.
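A hedged, heavily simplified sketch of the conflict-directed search idea named in this abstract: candidates (here, component mode assignments) are enumerated best-first by prior probability, and each failed consistency check is generalized into a conflict, a partial assignment that no later candidate is allowed to contain, so whole families of candidates are pruned without further model calls. The components, probabilities, and model check below are invented toy data, not the executive's actual models, and this omits the systematic candidate generation of the real CONFLICT-DIRECTED A* algorithm.

```python
# Hedged toy sketch in the spirit of conflict-directed best-first
# search: enumerate mode assignments by prior probability, record a
# conflict whenever the model rules a candidate out, and skip any
# candidate containing a known conflict. Not the paper's algorithm.
import math
from itertools import product

components = ["valve", "pump", "sensor"]          # hypothetical components
prior = {"ok": 0.95, "broken": 0.05}              # illustrative mode priors

def consistent(assign, conflicts):
    """A candidate is viable only if it contains no known conflict."""
    return not any(all(assign[k] == v for k, v in c.items()) for c in conflicts)

def check_with_model(assign):
    """Stand-in for consistency checking against the plant model.
    Returns a conflict (an infeasible partial assignment) or None.
    Here we pretend the observations rule out valve and pump both ok."""
    if assign["valve"] == "ok" and assign["pump"] == "ok":
        return {"valve": "ok", "pump": "ok"}      # extracted conflict
    return None

# Best-first: full assignments sorted by prior probability, descending.
candidates = sorted(
    (dict(zip(components, modes)) for modes in product(prior, repeat=len(components))),
    key=lambda a: -math.prod(prior[v] for v in a.values()),
)

conflicts = []
for cand in candidates:
    if not consistent(cand, conflicts):
        continue                       # pruned by a known conflict, no model call
    conflict = check_with_model(cand)
    if conflict is None:
        print("most likely diagnosis:", cand)
        break
    conflicts.append(conflict)         # one-shot: remember why it failed
```

The pruning step is where the "one-shot learning" of the abstract pays off: a single failed candidate eliminates every other candidate that shares the conflicting partial assignment.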