Learning Graphical Models


Fixing a Hole in Lexicalized Plan Recognition

AAAI Conferences

Previous work has suggested the use of lexicalized grammars for probabilistic plan recognition. Such grammars allow the domain builder to delay commitment to hypothesizing high-level goals in order to reduce computational costs. However, this delay has limitations. When observation traces are only partial, delaying commitment can prevent such algorithms from forming correct conclusions about some goals. This paper presents a heuristic metric to address this limitation. It advocates computing the maximum change in conditional probability across all the computed explanations, given the observations, when a goal of interest is explicitly considered.


Hierarchical Skills and Skill-based Representation

AAAI Conferences

Autonomous robots demand complex behavior to deal with unstructured environments. To meet these expectations, a robot needs to address a suite of problems associated with long-term knowledge acquisition, representation, and execution in the presence of partial information. In this paper, we address these issues through the acquisition of broad, domain-general skills using an intrinsically motivated reward function. We show how these skills can be represented compactly and used hierarchically to obtain complex manipulation skills. We further present a Bayesian model that uses the learned skills to model objects in the world in terms of the actions they afford. We argue that our knowledge representation allows a robot both to predict the dynamics of objects in the world and to recognize them.


InfoMax Control for Acoustic Exploration of Objects by a Mobile Robot

AAAI Conferences

Recently, information gain has been proposed as a candidate intrinsic motivation for lifelong learning agents that may not always have a specific task. In the InfoMax control framework, reinforcement learning is used to find a control policy for a POMDP in which movement and sensing actions are selected to reduce Shannon entropy as quickly as possible. In this study, we implement InfoMax control on a robot which can move between objects and perform sound-producing manipulations on them. We formulate a novel latent variable mixture model for acoustic similarities and learn InfoMax policies that allow the robot to rapidly reduce uncertainty about the categories of the objects in a room. We find that InfoMax with our improved acoustic model yields policies that achieve high classification accuracy. Interestingly, we also find that with an insufficient model, the InfoMax policy eventually learns to "bury its head in the sand" to avoid getting additional evidence that might increase uncertainty. We discuss the implications of this finding for InfoMax as a principle of intrinsic motivation in lifelong learning agents.
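The entropy-reduction objective described above can be illustrated with a small sketch. This is not the paper's method: the paper learns a policy with reinforcement learning over a POMDP, whereas the sketch below is a greedy one-step simplification that picks the action with the largest expected drop in Shannon entropy. The names `expected_info_gain` and `greedy_action`, the action names, and the likelihood tables are all hypothetical.

```python
import math

def entropy(belief):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in belief if p > 0)

def posterior(belief, likelihood, obs):
    """Bayes update, where likelihood[c][obs] = P(obs | category c)."""
    unnorm = [p * likelihood[c][obs] for c, p in enumerate(belief)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def expected_info_gain(belief, likelihood):
    """Expected one-step entropy reduction for one action's observation model."""
    gain = entropy(belief)
    for obs in range(len(likelihood[0])):
        # Marginal probability of seeing this observation under the current belief.
        p_obs = sum(p * likelihood[c][obs] for c, p in enumerate(belief))
        if p_obs > 0:
            gain -= p_obs * entropy(posterior(belief, likelihood, obs))
    return gain

def greedy_action(belief, actions):
    """Assumption: greedy one-step choice, not a learned InfoMax policy."""
    return max(actions, key=lambda name: expected_info_gain(belief, actions[name]))

# Hypothetical example: two object categories, two manipulations.
belief = [0.5, 0.5]
actions = {
    "shake": [[0.9, 0.1], [0.1, 0.9]],  # sound depends strongly on category
    "push": [[0.5, 0.5], [0.5, 0.5]],   # sound carries no information
}
best = greedy_action(belief, actions)
```

Here `best` is `"shake"`, because the "push" observation model is identical for both categories and so cannot reduce entropy.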


Clustering via Dirichlet Process Mixture Models for Portable Skill Discovery

AAAI Conferences

Skill discovery algorithms in reinforcement learning typically identify single states or regions in state space that correspond to potential task-specific subgoals. However, such methods do not directly address the question of how many distinct skills are appropriate for solving the tasks that the agent faces. This can be highly inefficient when many identified subgoals correspond to the same underlying skill, but are all used individually as skill goals. Furthermore, skills created in this manner are often only transferable to tasks that share identical state spaces, since corresponding subgoals across tasks are not merged into a single skill goal. We show that these problems can be overcome by clustering subgoal data defined in an agent-space and using the resulting clusters as templates for skill termination conditions. Clustering via a Dirichlet process mixture model is used to discover a minimal, sufficient collection of portable skills.


The Importance of Selective Knowledge Transfer for Lifelong Learning

AAAI Conferences

Versatile agents situated in rich, dynamic environments must be capable of continually learning and refining their knowledge through experience. These agents will face a variety of learning tasks, and can transfer knowledge between tasks to improve performance and accelerate learning. In this context, a learning task can be as simple as discovering the effects of an operator on the environment, or as complex as accomplishing a specific goal -- anything that can be learned can be considered a task. As the agent experiences and learns a model for each task, it gains access to new data and knowledge.

It is not necessarily possible to select the source knowledge to transfer to a new target task by examining only the surface similarities between the tasks. The selection must support the process of knowledge transfer by choosing source knowledge based on whether it will transfer well to the target task. In our previous work, we developed methods that identify the source knowledge to transfer based on this concept of transferability to the target task. Intuitively, transferability is the amount that the transferred information is ...


A Bayesian Concept Learning Approach to Crowdsourcing

AAAI Conferences

We develop a Bayesian approach to concept learning for crowdsourcing applications. A probabilistic belief over possible concept definitions is maintained and updated according to (noisy) observations from experts, whose behaviors are modeled using discrete types. We propose recommendation techniques, inference methods, and query selection strategies to assist a user charged with choosing a configuration that satisfies some (partially known) concept. Our model is able to simultaneously learn the concept definition and the types of the experts. We evaluate our model with simulations, showing that our Bayesian strategies are effective even in large concept spaces with many uninformative experts.
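As a rough illustration of learning the concept and the expert types at the same time, the sketch below keeps a joint belief over (concept, expert type) pairs and updates it from one expert's noisy yes/no answers. It is a minimal sketch under stated assumptions, not the paper's model: the two candidate concepts, the two expert types, and their accuracies are hypothetical, and only a single expert is modeled.

```python
from itertools import product

def update_joint(joint, concepts, accuracy, item, answer):
    """Bayes update of joint[(concept, type)] after an expert answers
    whether `item` belongs to the concept (answer is True or False)."""
    new = {}
    for (c, t), p in joint.items():
        truth = item in concepts[c]
        # Assumption: an expert of type t answers correctly with prob accuracy[t].
        like = accuracy[t] if answer == truth else 1.0 - accuracy[t]
        new[(c, t)] = p * like
    z = sum(new.values())
    return {k: v / z for k, v in new.items()}

def marginal(joint, axis):
    """Marginalize the joint onto concepts (axis=0) or expert types (axis=1)."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

# Hypothetical setup: two candidate concept definitions, two expert types.
concepts = {"A": {1, 2}, "B": {2, 3}}
accuracy = {"reliable": 0.9, "random": 0.5}
pairs = list(product(concepts, accuracy))
joint = {pair: 1.0 / len(pairs) for pair in pairs}  # uniform prior

joint = update_joint(joint, concepts, accuracy, item=1, answer=True)
after_one = marginal(joint, 0)    # belief over concepts after one answer
joint = update_joint(joint, concepts, accuracy, item=3, answer=False)
after_two = marginal(joint, 0)    # belief over concepts after two answers
type_belief = marginal(joint, 1)  # belief over the expert's type
```

After the first answer the belief in concept "A" rises to 0.7 while the expert's type stays undetermined. The second, consistent answer raises both the belief in "A" and the belief that the expert is "reliable", because a run of mutually consistent answers is more probable under an accurate expert.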


A General Perceptual Model for Eldercare Robots

AAAI Conferences

A general perceptual model is proposed for Eldercare Robot implementation that is composed of audition functionality interconnected with a feedback-driven perceptual reasoning agent. Using multistage signal analysis to feed temporally tiered learning/recognition modules, concurrent access to sound event localization, classification, and context is realized. Patterns leading to the quantification of patient emotion/well-being can be inferred using a perceptual reasoning agent. The system is prototyped using a Nao H-25 humanoid robot with an online processor running the Nao Qi SDK and the Max/MSP environment with the FTM and GF libraries.


Robust Active Learning Using Crowdsourced Annotations for Activity Recognition

AAAI Conferences

Recognizing human activities from wearable sensor data is an important problem, particularly for health and eldercare applications. However, collecting sufficient labeled training data is challenging, especially since interpreting IMU traces is difficult for human annotators. Recently, crowdsourcing through services such as Amazon's Mechanical Turk has emerged as a promising alternative for annotating such data, with active learning serving as a natural method for affordably selecting an appropriate subset of instances to label. Unfortunately, since most active learning strategies are greedy methods that select the most uncertain sample, they are very sensitive to annotation errors (which corrupt a significant fraction of crowdsourced labels). This paper proposes methods for robust active learning under these conditions. Specifically, we make three contributions: 1) we obtain better initial labels by asking labelers to solve a related task; 2) we propose a new principled method for selecting instances in active learning that is more robust to annotation noise; 3) we estimate confidence scores for labels acquired from MTurk and ask workers to relabel samples that receive low scores under this metric. The proposed method is shown to significantly outperform existing techniques both under controlled noise conditions and in real active learning scenarios. The resulting method trains classifiers that are close in accuracy to those trained using ground-truth data.
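The greedy baseline that the abstract criticizes, and the idea of relabeling low-confidence samples, can be sketched as follows. This is an illustrative sketch only: the paper's own selection criterion and confidence metric are not given in the abstract, so the vote-agreement score and the `threshold` value below are hypothetical stand-ins.

```python
import math
from collections import Counter

def predictive_entropy(probs):
    """Entropy (in nats) of a classifier's predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def most_uncertain(pool_probs):
    """Greedy uncertainty sampling: index of the highest-entropy sample."""
    return max(range(len(pool_probs)),
               key=lambda i: predictive_entropy(pool_probs[i]))

def agreement_confidence(votes):
    """Majority label and the fraction of votes that agree with it.
    Assumption: a stand-in score, not the paper's confidence metric."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)

def needs_relabel(votes, threshold=0.7):
    """Flag a sample for relabeling when annotator agreement is low."""
    return agreement_confidence(votes)[1] < threshold

# Hypothetical pool of predicted class distributions for three samples.
pool = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
query_index = most_uncertain(pool)
```

Here `query_index` is 1, the sample with the evenly split prediction. A single wrong label on such a sample can mislead a greedy learner, which is the sensitivity to annotation noise that motivates the relabeling check.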


Visual Search and Multirobot Collaboration Based on Hierarchical Planning

AAAI Conferences

Mobile robots are increasingly being used in the real world due to the availability of high-fidelity sensors and sophisticated information processing algorithms. A key challenge to the widespread deployment of robots is the ability to accurately sense the environment and collaborate towards a common objective. Probabilistic sequential decision-making methods can be used to address this challenge because they encapsulate the partial observability and non-determinism of robot domains. However, such formulations soon become intractable for domains with complex state spaces that require real-time operation. Our prior work enabled a mobile robot to use hierarchical partially observable Markov decision processes (POMDPs) to automatically tailor visual sensing and information processing to the task at hand. This paper introduces adaptive observation functions and policy re-weighting in a three-layered POMDP hierarchy to enable reliable and efficient visual processing in dynamic domains. In addition, each robot merges its beliefs with those communicated by teammates, to enable a team of robots to collaborate robustly. All algorithms are evaluated in simulated domains and on physical robots tasked with locating target objects in indoor environments.
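The abstract does not say how beliefs are merged, so the following is only one plausible sketch: a normalized elementwise product of two beliefs over the same grid of candidate target locations. It assumes the two robots' evidence is independent and that both started from a uniform prior. The function name `merge_beliefs` and the example numbers are hypothetical.

```python
def merge_beliefs(own, teammate):
    """Normalized product of two discrete beliefs over the same cells.
    Assumption: independent evidence and a shared uniform prior; this is
    not necessarily the merging rule used in the paper."""
    combined = [a * b for a, b in zip(own, teammate)]
    z = sum(combined)
    if z == 0:
        # The beliefs are mutually exclusive; fall back to the robot's own.
        return list(own)
    return [p / z for p in combined]

# Hypothetical beliefs over three candidate locations of the target object.
own = [0.6, 0.3, 0.1]
teammate = [0.2, 0.7, 0.1]
merged = merge_beliefs(own, teammate)
likely_cell = max(range(len(merged)), key=lambda i: merged[i])
```

In this example the merged belief favors the second cell (index 1), since the teammate's strong evidence outweighs the robot's own weaker preference for the first cell.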


When Did You Start Doing that Thing that You Do? Interactive Activity Recognition and Prompting

AAAI Conferences

We present a model of interactive activity recognition and prompting for use in an assistive system for persons with cognitive disabilities. The system can determine the user’s state by interpreting sensor data and/or by explicitly querying the user, and can prompt the user to begin or end tasks. The objective of the system is to help the user maintain a daily schedule of activities while minimizing interruptions from questions or prompts. The model is built upon an option-based hierarchical POMDP. Options can be programmed and customized to specify complex routines for prompting or questioning. Novel aspects of the model include (1) the introduction of adaptive options, which employ a lightweight user model and are able to provide near-optimal performance with little exploration; (2) a restricted-inquiry dual-control algorithm that can appeal for help from the user when sensor data is ambiguous; and (3) a combined filtering / most-likely-sequence algorithm for determining the beginning and ending time points of the user’s activities. Experiments show that each of these features contributes to the robustness of the model.