Doherty, Patrick


Deep Learning Quadcopter Control via Risk-Aware Active Learning

AAAI Conferences

Modern optimization-based approaches to control increasingly allow automatic generation of complex behavior from only a model and an objective. Recent years have seen growing interest in fast solvers to also allow real-time operation on robots, but the computational cost of such trajectory optimization remains prohibitive for many applications. In this paper we examine a novel deep neural network approximation and validate it on a safe navigation problem with a real nano-quadcopter. As the risk of costly failures is a major concern with real robots, we propose a risk-aware resampling technique. Contrary to prior work, this active learning approach is easy to use with existing solvers for trajectory optimization, as well as deep learning. We demonstrate the efficacy of the approach on a difficult collision avoidance problem with non-cooperative moving obstacles. Our findings indicate that the resulting neural network approximations are at least 50 times faster than the trajectory optimizer while still satisfying the safety requirements. We demonstrate the potential of the approach by implementing a synthesized deep neural network policy on the nano-quadcopter microcontroller.


Model-Based Reinforcement Learning in Continuous Environments Using Real-Time Constrained Optimization

AAAI Conferences

Reinforcement learning for robot control tasks in continuous environments is a challenging problem due to the dimensionality of the state and action spaces, time and resource costs for learning with a real robot as well as constraints imposed for its safe operation. In this paper we propose a model-based reinforcement learning approach for continuous environments with constraints. The approach combines model-based reinforcement learning with recent advances in approximate optimal control. This results in a bounded-rationality agent that makes decisions in real-time by efficiently solving a sequence of constrained optimization problems on learned sparse Gaussian process models. Such a combination has several advantages. No high-dimensional policy needs to be computed or stored while the learning problem often reduces to a set of lower-dimensional models of the dynamics. In addition, hard constraints can easily be included and objectives can also be changed in real-time to allow for multiple or dynamic tasks. The efficacy of the approach is demonstrated on both an extended cart pole domain and a challenging quadcopter navigation task using real data.


EfficientIDC: A Faster Incremental Dynamic Controllability Algorithm

AAAI Conferences

The exact duration of an action generally cannot be predicted in advance. Temporal planning therefore tends to use upper bounds on durations, with the explicit or implicit assumption that if an action happens to be executed more quickly, the plan will still succeed. However, this assumption is often false: If we finish cooking too early, the dinner will be cold before everyone is at home and can eat. Simple Temporal Problems with Uncertainty (STPUs) allow us to model such situations. An STPU-based planner must then verify that the networks it generates are executable, captured by the property of dynamic controllability. The FastIDC algorithm can do this incrementally during planning. In this paper we show that the FastIDC method can result in traversing part of a temporal network multiple times, with constraints slowly tightening towards their final values. We then present a new algorithm that uses additional analysis together with a different traversal strategy to avoid this behavior. The new algorithm has a guaranteed time complexity lower than that of FastIDC and is proven sound and complete.
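The cooking example above can be made concrete with a minimal sketch. All numbers and function names here are hypothetical and chosen only for illustration; this is not the FastIDC or EfficientIDC algorithm, just a brute-force check of one fixed schedule against every possible outcome of an uncontrollable duration.

```python
# Illustrative only: hypothetical numbers (minutes), not taken from the paper.

def dinner_ok(cook_start, cook_duration, guests_arrive, max_wait):
    """True if food is ready when guests arrive and has waited at most max_wait."""
    ready = cook_start + cook_duration
    return ready <= guests_arrive and guests_arrive - ready <= max_wait

def schedule_is_safe(cook_start, duration_bounds, guests_arrive, max_wait):
    """Check a fixed start time against every integer duration in the bounds."""
    lo, hi = duration_bounds
    return all(
        dinner_ok(cook_start, d, guests_arrive, max_wait)
        for d in range(lo, hi + 1)
    )

bounds = (20, 40)      # cooking takes between 20 and 40 minutes (uncontrollable)
guests_arrive = 60     # guests arrive at minute 60
max_wait = 10          # food is cold after waiting more than 10 minutes

# Plan using only the upper bound on the duration.
start = guests_arrive - bounds[1]

upper_bound_case = dinner_ok(start, bounds[1], guests_arrive, max_wait)
early_case = dinner_ok(start, bounds[0], guests_arrive, max_wait)

# No fixed start time works for all outcomes in this instance.
any_fixed_start_safe = any(
    schedule_is_safe(s, bounds, guests_arrive, max_wait)
    for s in range(0, guests_arrive + 1)
)

print(upper_bound_case, early_case, any_fixed_start_safe)
```

In this toy instance the plan succeeds when cooking takes the full 40 minutes but fails when it takes only 20, and no fixed start time is safe for every duration. A schedule that works here would have to react to the observed cooking time, which is the kind of execution strategy that dynamic controllability is concerned with.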


Exploiting Fully Observable and Deterministic Structures in Goal POMDPs

AAAI Conferences

When parts of the states in a goal POMDP are fully observable and some actions are deterministic, it is possible to take advantage of these properties to efficiently generate approximate solutions. Actions that deterministically affect the fully observable component of the world state can be abstracted away and combined into macro actions, permitting a planner to converge more quickly. This processing can be separated from the main search procedure, allowing us to leverage existing POMDP solvers. Theoretical results show how a POMDP can be analyzed to identify the exploitable properties, and formal guarantees are provided showing that the use of macro actions preserves solvability. The efficiency of the method is demonstrated with examples when used in combination with existing POMDP solvers.


Incremental Dynamic Controllability Revisited

AAAI Conferences

Simple Temporal Networks with Uncertainty (STNUs) allow the representation of temporal problems where some durations are determined by nature, as is often the case for actions in planning. As such networks are generated, it is essential to verify that they are dynamically controllable -- executable regardless of the outcomes of uncontrollable durations -- and to convert them to a dispatchable form. The previously published FastIDC algorithm achieves this incrementally and can therefore be used efficiently during plan construction. In this paper we show that FastIDC is not sound when new constraints are added, sometimes labeling networks as dynamically controllable when they are not. We analyze the algorithm, pinpoint the cause, and show how the algorithm can be modified to correctly detect uncontrollable networks.


Temporal Composite Actions with Constraints

AAAI Conferences

Complex mission or task specification languages play a fundamentally important role in human/robotic interaction. In realistic scenarios such as emergency response, specifying temporal, resource and other constraints on a mission is an essential component due to the dynamic and contingent nature of the operational environments. It is also desirable that in addition to having a formal semantics, the language should be sufficiently expressive, pragmatic and abstract. The main goal of this paper is to propose a mission specification language that meets these requirements. It is based on extending both the syntax and semantics of a well-established formalism for reasoning about action and change, Temporal Action Logic (TAL), in order to represent temporal composite actions with constraints. Fixpoints are required to specify loops and recursion in the extended language. The results include a sound and complete proof theory for this extension. To ensure that the composite language constructs are adequately grounded in the pragmatic operation of robotic systems, Task Specification Trees (TSTs) and their mapping to these constructs are proposed. The expressive and pragmatic adequacy of this approach is demonstrated using an emergency response scenario.


Reports of the AAAI 2011 Spring Symposia

AI Magazine

The Association for the Advancement of Artificial Intelligence, in cooperation with Stanford University’s Department of Computer Science, presented the 2011 Spring Symposium Series Monday through Wednesday, March 21–23, 2011 at Stanford University. The titles of the eight symposia were AI and Health Communication, Artificial Intelligence and Sustainable Design, AI for Business Agility, Computational Physiology, Help Me Help You: Bridging the Gaps in Human-Agent Collaboration, Logical Formalizations of Commonsense Reasoning, Multirobot Systems and Physical Data Structures, and Modeling Complex Adaptive Systems As If They Were Voting Processes. This report summarizes the eight symposia.



Choosing Path Replanning Strategies for Unmanned Aircraft Systems

AAAI Conferences

Unmanned aircraft systems use a variety of techniques to plan collision-free flight paths given a map of obstacles and no-fly zones. However, maps are not perfect and obstacles may change over time or be detected during flight, which may invalidate paths that the aircraft is already following. Thus, dynamic in-flight replanning is required. Numerous strategies can be used for replanning, where the time requirements and the plan quality associated with each strategy depend on the environment around the original flight path. In this paper, we investigate the use of machine learning techniques, in particular support vector machines, to choose the best possible replanning strategy depending on the amount of time available. The system has been implemented, integrated and tested in hardware-in-the-loop simulation with a Yamaha RMAX helicopter platform.