When controlling dynamic systems, such as mobile robots in uncertain environments, there is a trade-off between risk and reward. For example, a race car can turn a corner faster by taking a more challenging path. This paper proposes a new approach to planning a control sequence with a guaranteed risk bound. Given a stochastic dynamic model, the problem is to find a control sequence that optimizes a performance metric while satisfying chance constraints, i.e., constraints on the upper bound of the probability of failure. We propose a two-stage optimization approach, with the upper stage optimizing the risk allocation and the lower stage calculating the optimal control sequence that maximizes reward. In general, the upper stage is a non-convex optimization problem, which is hard to solve. We develop a new iterative algorithm for this stage that efficiently computes the risk allocation with a small penalty to optimality. The algorithm is implemented and tested on the autonomous underwater vehicle (AUV) depth planning problem, where it demonstrates a substantial improvement in computation cost and suboptimality compared to prior art.
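The two-stage structure can be illustrated with a minimal sketch. This is a toy stand-in, not the paper's algorithm: it assumes Gaussian position noise, a single upper bound `limit` on each step's position, and a reward that depends only on the final position. The lower stage tightens each chance constraint into a deterministic bound via the Gaussian inverse CDF; the upper stage iteratively shifts risk from constraints with slack to the binding one. All function names and the reallocation rule are illustrative assumptions.

```python
from statistics import NormalDist

def lower_stage(deltas, limit, sigmas):
    """Given a risk allocation delta_t, tighten each chance constraint
    P(x_t > limit) <= delta_t to the deterministic bound
    x_t <= limit - sigma_t * z_{1-delta_t}, then maximize the toy
    reward (the final position) in closed form."""
    z = NormalDist().inv_cdf
    bounds = [limit - s * z(1.0 - d) for s, d in zip(sigmas, deltas)]
    # Toy plan: push the final step to its bound (the reward), keep
    # earlier steps safely below theirs.
    x = [b - 1.0 for b in bounds]
    x[-1] = bounds[-1]
    return x, bounds

def iterative_risk_allocation(total_risk, limit, sigmas,
                              alpha=0.7, iters=20):
    """Upper stage: start from a uniform allocation, then repeatedly
    take risk away from inactive constraints (those with slack) and
    give it to the active ones -- the spirit of iterative risk
    allocation, with alpha as an illustrative interpolation rate."""
    T = len(sigmas)
    deltas = [total_risk / T] * T
    cdf = NormalDist().cdf
    for _ in range(iters):
        x, bounds = lower_stage(deltas, limit, sigmas)
        active = [abs(b - xi) < 1e-9 for xi, b in zip(x, bounds)]
        n_active = sum(active)
        if n_active == T:
            break
        residual = 0.0
        for t in range(T):
            if not active[t]:
                # Risk actually consumed by this slack constraint.
                used = 1.0 - cdf((limit - x[t]) / sigmas[t])
                new_d = alpha * deltas[t] + (1 - alpha) * used
                residual += deltas[t] - new_d
                deltas[t] = new_d
        for t in range(T):
            if active[t]:
                deltas[t] += residual / n_active
    return deltas, lower_stage(deltas, limit, sigmas)[0]
```

Because only the final constraint binds in this toy problem, the allocation converges toward giving it nearly the whole risk budget, loosening its bound and increasing the achievable reward relative to a uniform allocation.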
We present a novel method for information-theoretic exploration, leveraging recent work on mapping and localization. We describe exploration as the constrained optimization problem of computing a trajectory to minimize posterior map error, subject to the constraints of traveling through a set of sensing locations to ensure map coverage. This trajectory is found by reducing the map to a skeleton graph and searching for a minimum entropy tour through the graph. We describe how a specific factorization of the map covariance allows the reuse of EKF updates during the optimization, giving an efficient gradient ascent search for the maximum information gain tour through sensing locations, where each tour naturally incorporates revisiting well-known map regions. By generating incrementally larger tours as the exploration finds new regions of the environment, we demonstrate that our approach can perform autonomous exploration with improved accuracy.
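The tour search can be sketched with a toy greedy variant. This is an illustrative assumption, not the paper's method: it models the map as independent Gaussian landmark estimates, so an observation shrinks a landmark's variance in closed form, and these incremental updates (standing in for the reused EKF updates) let each candidate step's entropy reduction be scored cheaply. The graph, `observes` mapping, and greedy rule are all hypothetical.

```python
import math

def entropy(variances):
    """Differential entropy of independent Gaussian landmark estimates."""
    return 0.5 * sum(math.log(2 * math.pi * math.e * v) for v in variances)

def greedy_info_tour(adjacency, observes, variances, meas_var, start, steps):
    """Greedily extend a tour through the skeleton graph, at each step
    moving to the neighbor whose observations most reduce map entropy.
    Revisiting a well-observed node yields little gain, so the tour
    naturally favors uncertain regions."""
    v = list(variances)
    tour = [start]
    for _ in range(steps):
        best, best_gain, best_v = None, -1.0, None
        for nxt in adjacency[tour[-1]]:
            trial = list(v)
            for lm in observes[nxt]:
                # Scalar Kalman update: observing a landmark with
                # measurement variance meas_var shrinks its variance.
                trial[lm] = trial[lm] * meas_var / (trial[lm] + meas_var)
            gain = entropy(v) - entropy(trial)
            if gain > best_gain:
                best, best_gain, best_v = nxt, gain, trial
        tour.append(best)
        v = best_v
    return tour, entropy(v)
```

On a small graph where each node observes one landmark, the greedy tour covers all landmarks before revisiting any, mirroring how an information gain objective balances coverage against revisiting.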
With the aim of fluency and efficiency in human-robot teams, we have developed a cognitive architecture based on the neuropsychological principles of anticipation and perceptual simulation through top-down biasing. An instantiation of this architecture was implemented on a non-anthropomorphic robotic lamp, performing in a human-robot collaborative task. In a human-subject study, in which the robot works on a joint task with untrained subjects, we find our approach to be significantly more efficient and fluent than a comparable system without anticipatory perceptual simulation. We also show that the robot and the human increasingly contribute to the task at similar rates. Through self-report, we find significant differences between the two conditions in the sense of team fluency, the team's improvement over time, and the robot's contribution to the efficiency and fluency. We also find differences in verbal attitudes towards the robot: most notably, subjects working with the anticipatory robot attribute more positive and more human qualities to it, but display increased self-blame and self-deprecation.
In order to interact successfully in social situations, a robot must be able to observe others' actions and base its own behavior on its beliefs about their intentions. Many interactions take place in dynamic environments, and the outcomes of people's or the robot's actions may be time-dependent. In this paper, such interactions are modeled as a POMDP with a time index as part of the state, resulting in a fully Markov model with a potentially very large state space. The complexity of finding even an approximate solution often limits the practical applicability of POMDPs to large problems. This difficulty is addressed through the development of an algorithm for aggregating states in POMDPs with a time-indexed state space. States that represent the same physical configuration of the environment at different times are combined using reward-based metrics, preserving the structure of the original model while producing a smaller model that is faster to solve. We demonstrate that solving the aggregated model produces a policy with performance comparable to the policy from the original model. The example domains used are a simulated elevator-riding task and a simulated driving task based on data collected from human drivers.
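The aggregation idea can be sketched in miniature. This toy assumes only an immediate-reward criterion (the paper's reward-based metrics are richer): states sharing a physical configuration are merged across consecutive time indices as long as their rewards stay within a tolerance, so a horizon-length chain of near-identical states collapses to a few aggregate states. The function name and clustering rule are illustrative.

```python
def aggregate_time_indexed_states(configs, horizon, reward, eps=1e-3):
    """Group (config, t) states whose immediate rewards stay within eps
    over consecutive time steps. Each cluster (config, times) becomes
    one aggregate state in the smaller model."""
    clusters = []
    for c in configs:
        run = [0]
        for t in range(1, horizon):
            if abs(reward(c, t) - reward(c, run[0])) <= eps:
                run.append(t)
            else:
                # Reward changed: close the current cluster, start anew.
                clusters.append((c, tuple(run)))
                run = [t]
        clusters.append((c, tuple(run)))
    return clusters
```

For instance, a configuration whose reward is constant over the whole horizon collapses to a single aggregate state, while one whose reward changes once splits into two, shrinking the state space from `len(configs) * horizon` states to a handful.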
We describe a novel integration of Planning with Probabilistic State Estimation and Execution. The resulting system is a unified representational and computational framework based on declarative models and constraint-based temporal plans. The work is motivated by the need to explore the oceans more cost-effectively through the use of Autonomous Underwater Vehicles (AUVs), requiring them to be goal-directed, perceptive, adaptive and robust in the context of dynamic and uncertain conditions. The novelty of our approach is in integrating deliberation and reaction over different temporal and functional scopes within a single model, and in breaking new ground in oceanography by allowing for precise sampling within a feature of interest using an autonomous robot. The system is general-purpose and adaptable to other ocean-going and terrestrial platforms.
How can we facilitate human-robot teamwork? The teamwork literature has identified the need to know the capabilities of teammates. How can we integrate the knowledge of another agent's capabilities for a justifiably intelligent teammate? This paper describes extensions to the cognitive architecture, ACT-R, and the use of artificial intelligence (AI) and cognitive science approaches to produce a more cognitively-plausible, autonomous robotic system that "mentally" simulates the decision-making of its teammate. The extensions to ACT-R added capabilities to interact with the real world through the robot's sensors and effectors and to simulate the decision-making of its teammate. The AI applications provided visual sensor capabilities by methods clearly different from those used by humans. The integration of these approaches into intelligent team-based behavior is demonstrated on a mobile robot. Our "TeamBot" matches the descriptive work and theories on human teamwork. We illustrate our approach in a spatial, team-oriented task of a guard force responding appropriately to an alarm condition that requires the human and robot team to "man" two guard stations as soon as possible after the alarm.
Email client software is widely used for personal task management, a purpose for which it was not designed and is poorly suited. Past attempts to remedy the problem have focused on adding task management features to the client UI. RADAR uses an alternative approach modeled on a trusted human assistant who reads mail, identifies task-relevant message content, and helps manage and execute tasks. This paper describes the integration of diverse AI technologies and presents results from human evaluation studies comparing the performance of RADAR users to that of unaided COTS tool users and of users partnered with a human assistant. As machine learning plays a central role in many system components, we also compare versions of RADAR with and without learning. Our tests show a clear advantage for learning-enabled RADAR over all other test conditions.
We present a computational model, MoralDM, which integrates several AI techniques in order to model recent psychological findings on moral decision-making. Current theories of moral decision-making extend beyond pure utilitarian models by relying on contextual factors that vary with culture. MoralDM uses a natural language system to produce formal representations from psychological stimuli, to reduce tailorability. The impacts of secular versus sacred values are modeled via qualitative reasoning, using an order of magnitude representation. MoralDM uses a combination of first-principles reasoning and analogical reasoning to determine consequences and utilities when making moral judgments. We describe how MoralDM works and show that it can model psychological results and improve its performance by accumulating examples.
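The sacred-versus-secular distinction can be caricatured with a minimal decision rule. This is only a sketch of the qualitative, order-of-magnitude idea, not MoralDM's reasoning: it assumes an option record with a `violates_sacred` flag and a numeric `utility`, and treats any sacred-value violation as dominating any finite utility difference.

```python
def moral_choice(options):
    """Order-of-magnitude rule: an option that violates a sacred value
    is dominated by any option that does not, regardless of how large
    its utility is; utilities only break ties among permitted options."""
    permitted = [o for o in options if not o["violates_sacred"]]
    pool = permitted or options  # fall back if every option violates
    return max(pool, key=lambda o: o["utility"])
```

Under this rule a high-utility option that violates a sacred value loses to a low-utility permitted one, which is the qualitative pattern the order-of-magnitude representation captures.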
POIROT is an integration framework for combining machine learning mechanisms to learn hierarchical models of web service procedures from a single demonstration example or a very small set of examples. The system is organized around a shared representation language for communications with a central hypothesis blackboard. Component learning systems share semantic representations of their hypotheses (generalizations) and inferences about demonstration traces. To further the process, components may generate learning goals for other learning components. POIROT's learners, or hypothesis formers, develop workflows that include order dependencies, subgoals, and decision criteria for selecting or prioritizing subtasks and service parameters. Hypothesis evaluators, guided by POIROT's meta-control component, plan experiments to confirm or disconfirm hypotheses extracted from these learning products. Collectively, they create methods that POIROT can use to reproduce the demonstration and solve similar problems. After its first phase of development, POIROT has demonstrated it can learn some moderately complex hierarchical task models from semantic traces of user-generated service transaction sequences at a level that is approaching human performance on the same learning task.
Spatial scaffolding is a naturally occurring human teaching behavior, in which teachers use their bodies to spatially structure the learning environment to direct the attention of the learner. Robotic systems can take advantage of simple, highly reliable spatial scaffolding cues to learn from human teachers. We present an integrated robotic architecture that combines social attention and machine learning components to learn tasks effectively from natural spatial scaffolding interactions with human teachers. We evaluate the performance of this architecture in comparison to human learning data drawn from a novel study of the use of embodied cues in human task learning and teaching behavior. This evaluation provides quantitative evidence for the utility of spatial scaffolding to learning systems. In addition, this evaluation supported the construction of a novel, interactive demonstration of a humanoid robot taking advantage of spatial scaffolding cues to learn from natural human teaching behavior.