 Undirected Networks


A sticky HDP-HMM with application to speaker diarization

arXiv.org Machine Learning

We consider the problem of speaker diarization, the problem of segmenting an audio recording of a meeting into temporal segments corresponding to individual speakers. The problem is rendered particularly difficult by the fact that we are not allowed to assume knowledge of the number of people participating in the meeting. To address this problem, we take a Bayesian nonparametric approach to speaker diarization that builds on the hierarchical Dirichlet process hidden Markov model (HDP-HMM) of Teh et al. [J. Amer. Statist. Assoc. 101 (2006) 1566--1581]. Although the basic HDP-HMM tends to over-segment the audio data---creating redundant states and rapidly switching among them---we describe an augmented HDP-HMM that provides effective control over the switching rate. We also show that this augmentation makes it possible to treat emission distributions nonparametrically. To scale the resulting architecture to realistic diarization problems, we develop a sampling algorithm that employs a truncated approximation of the Dirichlet process to jointly resample the full state sequence, greatly improving mixing rates. Working with a benchmark NIST data set, we show that our Bayesian nonparametric architecture yields state-of-the-art speaker diarization results.
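The abstract does not say how the augmented model controls the switching rate. One common formulation of a "sticky" prior, assumed here, adds extra mass `kappa` to the self-transition entry of each transition row in a truncated Dirichlet approximation. The sketch below computes only the prior mean of one row under that assumption. The names `beta`, `alpha`, and `kappa` are illustrative and do not come from the abstract.

```python
def sticky_transition_mean(beta, alpha, kappa, j):
    """Prior mean of transition row j under an assumed truncated sticky prior.

    Assumption: row j ~ Dirichlet(alpha * beta + kappa * e_j), where e_j is
    the j-th unit vector and beta is a shared weight vector summing to 1.
    """
    if abs(sum(beta) - 1.0) > 1e-9:
        raise ValueError("beta must sum to 1")
    if alpha <= 0 or kappa < 0:
        raise ValueError("alpha must be positive and kappa non-negative")
    total = alpha + kappa  # sum of all Dirichlet parameters in the row
    return [(alpha * b + (kappa if k == j else 0.0)) / total
            for k, b in enumerate(beta)]


beta = [0.25, 0.25, 0.25, 0.25]
plain = sticky_transition_mean(beta, alpha=1.0, kappa=0.0, j=0)
sticky = sticky_transition_mean(beta, alpha=1.0, kappa=9.0, j=0)
print(plain[0], sticky[0])  # expected self-transition probability, before and after the bias
```

With `kappa = 0` the row mean equals `beta`, so no state is favored. Raising `kappa` pushes the expected self-transition probability toward 1, which is one way to discourage rapid switching among redundant states.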


Interactive First-Order Probabilistic Logic

AAAI Conferences

Being able to compactly represent large state spaces is crucial in solving a vast majority of practical stochastic planning problems. This requirement is even more stringent in the context of multi-agent systems, in which the world to be modeled also includes the mental state of other agents. This leads to a hierarchy of beliefs that results in a continuous, unbounded set of possible interactive states, as in the case of Interactive POMDPs. In this paper, we describe a novel representation for interactive belief hierarchies that combines first-order logic and probability. The semantics of this new formalism is based on recursively partitioning the belief space at each level of the hierarchy; in particular, the partitions of the belief simplex at one level constitute the vertices of the simplex at the next higher level. Since in general a set of probabilistic statements only partially specifies a probability distribution over the space of interest, we adopt the maximum entropy principle in order to convert it to a full specification.
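As a toy illustration of the maximum entropy step, and not the paper's formalism, suppose the only probabilistic statement available is the probability of a single event over a finite outcome space. The maximum entropy completion then spreads mass uniformly inside the event and uniformly outside it. The function names and the example numbers below are assumptions made for this sketch.

```python
import math


def max_entropy_fill(n_outcomes, event, p_event):
    """Maximum entropy distribution over range(n_outcomes) given P(event) = p_event."""
    event = set(event)
    if not event or len(event) >= n_outcomes:
        raise ValueError("event must be a proper, non-empty subset")
    if any(i < 0 or i >= n_outcomes for i in event):
        raise ValueError("event contains an outcome outside the space")
    if not 0.0 <= p_event <= 1.0:
        raise ValueError("p_event must lie in [0, 1]")
    inside = p_event / len(event)                        # uniform within the event
    outside = (1.0 - p_event) / (n_outcomes - len(event))  # uniform elsewhere
    return [inside if i in event else outside for i in range(n_outcomes)]


def entropy_bits(dist):
    """Shannon entropy in bits, with 0 * log(0) treated as 0."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


maxent = max_entropy_fill(4, {0, 1}, 0.6)
skewed = [0.6, 0.0, 0.2, 0.2]  # also satisfies P({0, 1}) = 0.6
print(entropy_bits(maxent) > entropy_bits(skewed))
```

Both distributions satisfy the single constraint, but the uniform-within-cells one has higher entropy, which is why it is the one selected when the statements leave the distribution only partially specified.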


Markov Games of Incomplete Information for Multi-Agent Reinforcement Learning

AAAI Conferences

Partially observable stochastic games (POSGs) are an attractive model for many multi-agent domains, but are computationally extremely difficult to solve. We present a new model, Markov games of incomplete information (MGII), which imposes a mild restriction on POSGs while overcoming their primary computational bottleneck. Finally, we show how to convert an MGII into a continuous but bounded fully observable stochastic game. MGIIs represent the most general tractable model for multi-agent reinforcement learning to date.


Human Intelligence Needs Artificial Intelligence

AAAI Conferences

Crowdsourcing platforms, such as Amazon Mechanical Turk, have enabled the construction of scalable applications for tasks ranging from product categorization and photo tagging to audio transcription and translation. These vertical applications are typically realized with complex, self-managing workflows that guarantee quality results. But constructing such workflows is challenging, with a huge number of alternative decisions for the designer to consider. We argue the thesis that "Artificial intelligence methods can greatly simplify the process of creating and managing complex crowdsourced workflows." We present the design of CLOWDER, which uses machine learning to continually refine models of worker performance and task difficulty. Using these models, CLOWDER uses decision-theoretic optimization to 1) choose between alternative workflows, 2) optimize parameters for a workflow, 3) create personalized interfaces for individual workers, and 4) dynamically control the workflow. Preliminary experience suggests that these optimized workflows are significantly more economical (and return higher quality output) than those generated by humans.


Human Activity Detection from RGBD Images

AAAI Conferences

Being able to detect and recognize human activities is important for making personal assistant robots useful in performing assistive tasks. The challenge is to develop a system that is low-cost, reliable in unstructured home settings, and also straightforward to use. In this paper, we use an RGBD sensor (Microsoft Kinect) as the input sensor, and present learning algorithms to infer the activities. Our algorithm is based on a hierarchical maximum entropy Markov model (MEMM). It considers a person's activity as composed of a set of sub-activities, and infers the two-layered graph structure using a dynamic programming approach. We test our algorithm on detecting and recognizing twelve different activities performed by four people in different environments, such as a kitchen, a living room, an office, etc., and achieve an average performance of 84.3% when the person was seen before in the training set (and 64.2% when the person was not seen before).


InfoMax Control for Acoustic Exploration of Objects by a Mobile Robot

AAAI Conferences

Recently, information gain has been proposed as a candidate intrinsic motivation for lifelong learning agents that may not always have a specific task. In the InfoMax control framework, reinforcement learning is used to find a control policy for a POMDP in which movement and sensing actions are selected to reduce Shannon entropy as quickly as possible. In this study, we implement InfoMax control on a robot which can move between objects and perform sound-producing manipulations on them. We formulate a novel latent variable mixture model for acoustic similarities and learn InfoMax policies that allow the robot to rapidly reduce uncertainty about the categories of the objects in a room. We find that InfoMax with our improved acoustic model yields policies that achieve high classification accuracy. Interestingly, we also find that with an insufficient model, the InfoMax policy eventually learns to "bury its head in the sand" to avoid getting additional evidence that might increase uncertainty. We discuss the implications of this finding for InfoMax as a principle of intrinsic motivation in lifelong learning agents.
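The entropy-reduction objective can be made concrete with a small sketch. The code below is a one-step greedy stand-in, not the reinforcement learning procedure the abstract describes: it scores each action by the expected Shannon entropy of the belief after one observation. The two actions and their observation models are invented for illustration.

```python
import math


def entropy_bits(belief):
    """Shannon entropy of a discrete belief, in bits."""
    return -sum(p * math.log2(p) for p in belief if p > 0)


def bayes_update(belief, likelihood):
    """Posterior over categories given per-category observation likelihoods."""
    unnorm = [b * l for b, l in zip(belief, likelihood)]
    z = sum(unnorm)
    if z == 0:
        raise ValueError("observation has zero probability under the belief")
    return [u / z for u in unnorm]


def expected_posterior_entropy(belief, obs_model):
    """obs_model[o][c] = P(observation o | category c) for one action."""
    total = 0.0
    for row in obs_model:
        p_obs = sum(b * l for b, l in zip(belief, row))
        if p_obs > 0:
            total += p_obs * entropy_bits(bayes_update(belief, row))
    return total


belief = [0.5, 0.5]  # two object categories, initially equally likely
actions = {  # hypothetical observation models, one per action
    "informative": [[0.9, 0.1], [0.1, 0.9]],
    "uninformative": [[0.5, 0.5], [0.5, 0.5]],
}
scores = {name: expected_posterior_entropy(belief, model)
          for name, model in actions.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

An action whose observations do not depend on the category leaves the expected entropy unchanged, so the greedy rule prefers the action whose observations discriminate between categories.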


Visual Search and Multirobot Collaboration Based on Hierarchical Planning

AAAI Conferences

Mobile robots are increasingly being used in the real world due to the availability of high-fidelity sensors and sophisticated information processing algorithms. A key challenge to the widespread deployment of robots is the ability to accurately sense the environment and collaborate towards a common objective. Probabilistic sequential decision-making methods can be used to address this challenge because they encapsulate the partial observability and non-determinism of robot domains. However, such formulations soon become intractable for domains with complex state spaces that require real-time operation. Our prior work enabled a mobile robot to use hierarchical partially observable Markov decision processes (POMDPs) to automatically tailor visual sensing and information processing to the task at hand. This paper introduces adaptive observation functions and policy re-weighting in a three-layered POMDP hierarchy to enable reliable and efficient visual processing in dynamic domains. In addition, each robot merges its beliefs with those communicated by teammates, to enable a team of robots to collaborate robustly. All algorithms are evaluated in simulated domains and on physical robots tasked with locating target objects in indoor environments.


When Did You Start Doing that Thing that You Do? Interactive Activity Recognition and Prompting

AAAI Conferences

We present a model of interactive activity recognition and prompting for use in an assistive system for persons with cognitive disabilities. The system can determine the user's state by interpreting sensor data and/or by explicitly querying the user, and can prompt the user to begin or end tasks. The objective of the system is to help the user maintain a daily schedule of activities while minimizing interruptions from questions or prompts. The model is built upon an option-based hierarchical POMDP. Options can be programmed and customized to specify complex routines for prompting or questioning. Novel aspects of the model include (1) the introduction of adaptive options, which employ a lightweight user model and are able to provide near-optimal performance with little exploration; (2) a restricted-inquiry dual-control algorithm that can appeal for help from the user when sensor data is ambiguous; and (3) a combined filtering / most-likely-sequence algorithm for determining the beginning and ending time points of the user's activities. Experiments show that each of these features contributes to the robustness of the model.


Mobile, Collaborative, Context-Aware Systems

AAAI Conferences

We describe work on representing and using a rich notion of context that goes beyond current networking applications focusing mostly on location. Our context model includes location and surroundings, the presence of people and devices, inferred activities and the roles people fill in them. A key element of our work is the use of collaborative information sharing where devices share and integrate knowledge about their context. This introduces a requirement that users can set appropriate levels of privacy to protect the personal information being collected and the inferences that can be drawn from it. We use Semantic Web technologies to model context and to specify high-level, declarative policies specifying information sharing constraints. The policies involve attributes of the subject (i.e., information recipient), target (i.e., the information) and their dynamic context (e.g., are the parties co-present). We discuss our ongoing work on context representation and inference and present a model for protecting and controlling the sharing of private data in context-aware mobile applications.


Policy Gradient Planning for Environmental Decision Making with Existing Simulators

AAAI Conferences

In environmental and natural resource planning domains, actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action spaces, spatial correlation between actions, uncertainty and complex utility models. We present an approach for modeling these planning problems as factored Markov decision processes. The reward model can contain local and global components as well as spatial constraints between locations. The transition dynamics can be provided by existing simulators developed by domain experts. We propose a landscape policy defined as the equilibrium distribution of a Markov chain built from many locally-parameterized policies. This policy is optimized using a policy gradient algorithm. Experiments using a forestry simulator demonstrate the algorithm's ability to devise policies for sustainable harvest planning of a forest.
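For readers unfamiliar with the term, the equilibrium (stationary) distribution of a Markov chain is the distribution that one step of the chain leaves unchanged. The sketch below approximates it by power iteration on a small, made-up transition matrix. It only illustrates the concept and does not reproduce the paper's chain, which is built from locally-parameterized policies.

```python
def stationary_distribution(P, iters=1000, tol=1e-12):
    """Approximate the stationary distribution of row-stochastic matrix P.

    Repeatedly applies pi <- pi P starting from the uniform distribution.
    Assumes the chain is ergodic, so the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical two-state chain: rows are current states, columns next states.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
print(pi)
```

For this chain the balance condition `0.1 * pi[0] = 0.5 * pi[1]` gives the exact answer `(5/6, 1/6)`, which the iteration approaches.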