Learning Graphical Models
Commitment Semantics for Sequential Decision Making Under Reward Uncertainty
Durfee, Edmund H. (University of Michigan) | Singh, Satinder (University of Michigan)
A commitment represents an agent's intention to attempt to bring about some state of the world that is desired by some agent (possibly itself) in the future. Thus, by making a commitment, an agent is agreeing to make sequential decisions that it believes can cause the desired state to arise. In general, though, an agent's actions will have uncertain outcomes, and thus reaching the desired state cannot be guaranteed. For such sequential decision settings with uncertainty, therefore, commitments can only be probabilistic. We argue that standard notions of commitment are insufficient for probabilistic commitments, and propose a new semantics that judges commitment fulfillment not in terms of whether the agent achieved the desired state, but rather in terms of whether the agent made sequential decisions that in expectation would have achieved the desired state with (at least) the promised probability. We have devised various algorithms that operationalize our semantics, to capture problem contexts with probabilistic commitments arising because action outcomes are uncertain, as well as arising because an agent might realize over time that it does not want to fulfill the commitment.
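The proposed semantics can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the agent's policy has already been reduced to a Markov chain over states, and the names `reach_probability`, `commitment_respected`, and `toy_chain` are hypothetical. The check asks whether the policy reaches the desired state within a horizon with at least the promised probability, regardless of how any single execution turns out.

```python
def reach_probability(transitions, start, goal, horizon):
    """Probability of having reached `goal` within `horizon` steps.

    `transitions[s]` maps each successor state to its probability under
    the agent's fixed policy (an assumed, hypothetical representation).
    The goal is treated as absorbing.
    """
    dist = {start: 1.0}
    for _ in range(horizon):
        nxt = {}
        for state, p in dist.items():
            if state == goal:
                # Probability mass that already reached the goal stays there.
                nxt[goal] = nxt.get(goal, 0.0) + p
                continue
            for succ, q in transitions[state].items():
                nxt[succ] = nxt.get(succ, 0.0) + p * q
        dist = nxt
    return dist.get(goal, 0.0)


def commitment_respected(transitions, start, goal, horizon, promised):
    """True if the policy achieves the goal with at least the promised probability."""
    return reach_probability(transitions, start, goal, horizon) >= promised


# Toy chain: each step reaches the goal with probability 0.5, else stays put.
toy_chain = {"start": {"goal": 0.5, "start": 0.5}}
print(reach_probability(toy_chain, "start", "goal", 3))  # 0.875
```

Under this reading, a commitment promising probability 0.8 within three steps counts as respected by the toy policy even on runs where the goal is never reached, while a promise of 0.9 does not.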
Probabilistic Planning for Decentralized Multi-Robot Systems
Amato, Christopher (University of New Hampshire) | Konidaris, George (Duke University) | Omidshafiei, Shayegan (Massachusetts Institute of Technology) | Agha-mohammadi, Ali-akbar (Qualcomm Research) | How, Jonathan P. (Massachusetts Institute of Technology) | Kaelbling, Leslie P. (Massachusetts Institute of Technology)
Multi-robot systems are an exciting application domain for AI research in general and for Dec-POMDPs in particular. MacDec-POMDP methods can produce high-quality general solutions for realistic heterogeneous multi-robot coordination problems by automatically generating control and communication policies, given a model. In contrast to most existing multi-robot methods that are specialized to a particular problem class, our approach can synthesize policies that exploit any opportunities for coordination that are present in the problem, while balancing uncertainty, sensor information, and information about other agents.
Complexity of Self-Preserving, Team-Based Competition in Partially Observable Stochastic Games
Allen, Marty (University of Wisconsin-La Crosse)
Partially observable stochastic games (POSGs) are a robust and precise model for decentralized decision making under conditions of imperfect information, and extend popular Markov decision problem models. Complexity results for a wide range of such problems are known when agents work cooperatively to pursue common interests. When agents compete, things are less well understood. We show that under one understanding of rational competition, such problems are complete for the class NEXP^NP. This result holds for any such problem composed of two competing teams of agents, where teams may be of any size whatsoever.
Planning Under Uncertainty with Weighted State Scenarios
Walraven, Erwin (Delft University of Technology) | Spaan, Matthijs T. J. (Delft University of Technology)
External factors are hard to model using a Markovian state in several real-world planning domains. Although planning can be difficult in such domains, it may be possible to exploit long-term dependencies between states of the environment during planning. We introduce weighted state scenarios to model long-term sequences of states, and we use a model based on a Partially Observable Markov Decision Process to reason about scenarios during planning. Experiments show that our model outperforms other methods for decision making in two real-world domains.
Robotic Social Feedback for Object Specification
Wu, Emily (Brown University) | Han, Yuxin (Rhode Island School of Design) | Whitney, David (Brown University) | Oberlin, John (Brown University) | MacGlashan, James (Brown University) | Tellex, Stefanie (Brown University)
Issuing and following instructions is a common task in many forms of both human-human and human-robot collaboration. With two human participants, the accuracy of instruction following increases if the collaborators can monitor the state of their partners and respond to them through conversation (Clark and Krych 2004), a process we call social feedback. Despite this benefit in human-human interaction, current human-robot collaboration systems process instructions in non-incremental batches, which can achieve good accuracy but does not allow for reactive feedback (Tellex et al. 2011; Matuszek et al. 2012; Tellex et al. 2012; Misra et al. 2014). In this paper, we show that giving a robot the ability to ask the user questions results in responsive conversations and allows the robot to quickly determine the object that the user desires. This social feedback loop between person and robot allows a person to create an internal model of the robot’s mental state and adapt their own behavior to better inform the robot. To close the human-robot feedback loop, we employ a Partially Observable Markov Decision Process (POMDP) to produce a policy that leads to the determination of the object in the shortest amount of time. To test our approach, we perform user studies to measure our robot’s ability to deliver common household items requested by the participant. We compare delivery speed and accuracy both with and without social feedback.
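The belief tracking that underlies such a POMDP can be sketched as a Bayesian update over candidate objects. This is only an illustration under stated assumptions, not the paper's model: the function `update_belief`, the yes/no question format, and the `accuracy` parameter (the assumed probability that the user answers correctly) are all hypothetical.

```python
def update_belief(belief, asked, answer_yes, accuracy=0.9):
    """Bayes update of a belief over objects after a yes/no question.

    `belief` maps each candidate object to its prior probability.
    The robot asked "do you want `asked`?" and heard `answer_yes`.
    `accuracy` is an assumed probability that the answer is correct.
    """
    posterior = {}
    for obj, prior in belief.items():
        truthful_yes = (obj == asked)
        # Likelihood of the observed answer if `obj` were the desired object.
        likelihood = accuracy if answer_yes == truthful_yes else 1.0 - accuracy
        posterior[obj] = prior * likelihood
    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return {obj: p / total for obj, p in posterior.items()}


prior = {"cup": 1 / 3, "spoon": 1 / 3, "bowl": 1 / 3}
posterior = update_belief(prior, asked="cup", answer_yes=True)
```

A full POMDP policy would additionally choose which question to ask, or when to hand over an object, so as to minimize expected time; the update above only shows how each answer sharpens the robot's estimate.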
Modeling Situated Conversations for a Child-Care Robot Using Wearable Devices
On, Kyoung-Woon (Seoul National University) | Kim, Eun-Sol (Seoul National University) | Zhang, Byoung-Tak (Seoul National University)
How can robots communicate fluently with humans and hold context-preserving conversations? This is a central problem in robotics research, especially for service robots such as child-care robots. Here, we aim to develop a situated conversation system for child-care robots. The conversation system considers the current context between the robot and the child as well as the situation the child is in. The system consists of two parts. The first part tries to understand the context: it uses the robot's embedded sensors to understand the context and the child's wearable sensors to obtain information about the situation the child is in. The second part generates the situated conversation. In terms of the model, we designed a hierarchical Bayesian network for the first part and use a hypernetwork model for the second part. We illustrate the application of communicating with a child in a child-care service robot scenario. For this application, we collect wearable sensor data from the child and mother-child conversation data in daily life. Finally, we discuss our results and future work.
Temporal and Object Relations in Unsupervised Plan and Activity Recognition
Freedman, Richard G. (University of Massachusetts Amherst) | Jung, Hee-Tae (University of Massachusetts Amherst) | Zilberstein, Shlomo (University of Massachusetts Amherst)
We consider ways to improve the performance of unsupervised plan and activity recognition techniques by considering temporal and object relations in addition to postural data. Temporal relationships can help recognize activities with cyclic structure and are often implicit because plans have degrees of ordering actions. Relations with objects can help disambiguate observed activities that otherwise share a user's posture and position. We develop and investigate graphical models that extend the popular latent Dirichlet allocation approach with temporal and object relations, examine the relative performance and runtime trade-offs using a standard dataset, and consider the cost/benefit trade-offs these extensions offer in the context of human-robot and human-computer interaction.
Minecraft as an Experimental World for AI in Robotics
Aluru, Krishna Chaitanya (Brown University) | Tellex, Stefanie (Brown University) | Oberlin, John (Brown University) | MacGlashan, James (Brown University)
Performing experimental research on robotic platforms involves numerous practical complications, while studying collaborative interactions and efficiently collecting data from humans benefit from real-time response. Roboticists can circumvent some complications by using simulators like Gazebo to test algorithms and building games like the Mars Escape game to collect data. Making use of existing resources for simulation and game creation requires the development of assets and algorithms along with the recruitment and training of users. We have created a Minecraft mod called BurlapCraft, which enables the use of the reinforcement learning and planning library BURLAP to model and solve different tasks within Minecraft. BurlapCraft makes AI-HRI development easier in three core ways: the underlying Minecraft environment makes the construction of experiments simple for the developer and so allows the rapid prototyping of experimental setups; BURLAP contributes a wide variety of extensible algorithms for learning and planning, allowing easy iteration and development of task models and algorithms; and the familiarity and ubiquity of Minecraft trivializes the recruitment and training of users. To validate BurlapCraft as a platform for AI development, we demonstrate the execution of A*, BFS, RMax, language understanding, and learning language groundings from user demonstrations in five Minecraft "dungeons."
MCMCTS PCG 4 SMB: Monte Carlo Tree Search to Guide Platformer Level Generation
Summerville, Adam James (University of California, Santa Cruz) | Philip, Shweta (University of California, Santa Cruz) | Mateas, Michael (University of California, Santa Cruz)
Markov chains are an enticing option for machine-learned generation of platformer levels, but offer poor control for designers and are likely to produce unplayable levels. In this paper we present a method for guiding Markov chain generation using Monte Carlo Tree Search that we call Markov Chain Monte Carlo Tree Search (MCMCTS). We demonstrate an example use of this technique by creating levels trained on a corpus of levels from Super Mario Bros. We then present a player modeling study that was run in the hope of using the data to better inform the generation of levels in future work.