From an Agent Logic to an Agent Programming Language for Partially Observable Stochastic Domains
Rens, Gavin Brian (CSIR Meraka Institute)
Broadly speaking, my research concerns combining logic of action and POMDP theory in a coherent, theoretically sound language for agent programming. PODTGolog [Rens, 2010] is a Golog dialect attempting to deal with partially observable MDP (POMDP) environments. PODTGolog has not been given a mathematical ...
Towards Scalable MDP Algorithms
Kolobov, Andrey (University of Washington, Seattle)
The scalability of algorithms for solving Markov Decision Processes (MDPs) has been a limiting factor for MDPs as a modeling tool. This dissertation develops theoretical and empirical techniques for solving larger MDPs than was possible before, and aims to demonstrate the achieved progress by applying these new algorithms to a real-world problem.
Control of Robotic Systems for Safe Interaction with Human Operators
Ding, Hao (University of Kassel)
Human-robot interaction (HRI) is an active field that integrates and embeds different techniques from artificial intelligence. This paper describes my research topic: control of robotic systems for safe interaction with human operators. It consists of three parts: online motion generation for robotic manipulators interacting with dynamic obstacles and humans using a moving horizon scheme; modeling and long-term prediction of human motion using probabilistic models and reachability analysis; and development of an HRI demonstration platform.
Behaviour Recognition in Smart Homes
Chua, Sook-Ling (Massey University) | Marsland, Stephen (Massey University) | Guesgen, Hans W. (Massey University)
Behaviour recognition aims to infer the particular behaviours of the inhabitant of a smart home from a series of sensor readings from around the house. There are many reasons to recognise human behaviours; one is to monitor the elderly or cognitively impaired and detect potentially dangerous behaviours. We view the behaviour recognition problem as the task of mapping the sensory outputs to a sequence of recognised activities. This research focuses on the development of machine learning methods to find an approximation to the mapping between sensor outputs and behaviours. However, learning the mapping raises an important issue: the training data is not necessarily annotated with exemplar behaviours of the inhabitant. This doctoral study takes several steps towards addressing the problem of finding an approximation to this mapping, beginning with separate investigations of current methods proposed in the literature, identifying useful sensory outputs for behaviour recognition, and concluding by proposing two directions: one using supervised learning on annotated sensory streams and one using unsupervised learning on unannotated ones.
A Flat Histogram Method for Computing the Density of States of Combinatorial Problems
Ermon, Stefano (Cornell University) | Gomes, Carla (Cornell University) | Selman, Bart (Cornell University)
Consider a combinatorial state space S, such as the set of all truth assignments to N Boolean variables. Given a partition of S, we consider the problem of estimating the size of each subset into which S is divided. This problem, also known as computing the density of states, is quite general and has many applications. For instance, if we consider a Boolean formula in CNF and partition the assignments according to the number of violated constraints, computing the density of states generalizes SAT, MAXSAT, and model counting. We propose a novel Markov Chain Monte Carlo algorithm to compute the density of states of Boolean formulas that is based on a flat histogram approach. Our method represents a new approach to a variety of inference, learning, and counting problems. We demonstrate its practical effectiveness by showing that the method converges quickly to an accurate solution on a range of synthetic and real-world instances.
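To make the definition concrete, the following is a minimal sketch that computes the density of states of a tiny CNF formula by exhaustive enumeration. It is not the flat histogram MCMC method of the abstract, which targets instances far too large to enumerate. The clause encoding (signed integers in the DIMACS style), the function names, and the toy formula are all assumptions introduced here for illustration.

```python
from itertools import product

# Assumed encoding: a clause is a list of nonzero ints; literal k refers to
# variable |k|, which must be True if k > 0 and False if k < 0.
def violated(clauses, assignment):
    """Count clauses with no satisfied literal under `assignment` (tuple of bools)."""
    count = 0
    for clause in clauses:
        if not any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause):
            count += 1
    return count

def density_of_states(clauses, n_vars):
    """Exact histogram: g[e] = number of assignments violating exactly e clauses."""
    g = [0] * (len(clauses) + 1)
    for assignment in product([False, True], repeat=n_vars):
        g[violated(clauses, assignment)] += 1
    return g

# Hypothetical toy formula: (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
clauses = [[1, 2], [-1, 3], [-2, -3]]
g = density_of_states(clauses, n_vars=3)

model_count = g[0]                      # model counting: assignments with 0 violations
satisfiable = g[0] > 0                  # SAT: does any such assignment exist?
min_violations = next(e for e, c in enumerate(g) if c > 0)  # MAXSAT optimum
print(g, model_count, satisfiable, min_violations)
```

Reading the three quantities off the same histogram is what the abstract means by the density of states generalizing SAT, MAXSAT, and model counting. Enumeration costs 2^N evaluations, which is why a sampling-based estimator is needed for realistic N.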
Exploiting Probabilistic Knowledge under Uncertain Sensing for Efficient Robot Behaviour
Hanheide, Marc (University of Birmingham) | Gretton, Charles (University of Birmingham) | Dearden, Richard W (University of Birmingham) | Hawes, Nick A (University of Birmingham) | Wyatt, Jeremy L (University of Birmingham) | Pronobis, Andrzej (KTH Stockholm) | Aydemir, Alper (KTH Stockholm) | Göbelbecker, Moritz (University of Freiburg) | Zender, Hendrik (DFKI Saarbrücken GmbH)
Robots must perform tasks efficiently and reliably while acting under uncertainty. One way to achieve efficiency is to give the robot common-sense knowledge about the structure of the world. Reliable robot behaviour can be achieved by modelling the uncertainty in the world probabilistically. We present a robot system that combines these two approaches and demonstrate the improvements in efficiency and reliability that result. Our first contribution is a probabilistic relational model integrating common-sense knowledge about the world in general, with observations of a particular environment. Our second contribution is a continual planning system which is able to plan in the large problems posed by that model, by automatically switching between decision-theoretic and classical procedures. We evaluate our system on object search tasks in two different real-world indoor environments. By reasoning about the trade-offs between possible courses of action with different informational effects, and exploiting the cues and general structures of those environments, our robot is able to consistently demonstrate efficient and reliable goal-directed behaviour.
Enhancing Search Results with Semantic Annotation Using Augmented Browsing
Dai, Hong-Jie (Academia Sinica and National Tsing Hua University) | Tsai, Wei-Chi (Yuan Ze University) | Tsai, Richard Tzong-Han (Yuan Ze University) | Hsu, Wen-Lian (Academia Sinica and National Tsing Hua University)
In this paper, we describe how we integrated an artificial intelligence (AI) system into the PubMed search website using augmented browsing technology. Our system dynamically enriches the PubMed search results displayed in a user’s browser with semantic annotation provided by several natural language processing (NLP) subsystems, including a sentence splitter, a part-of-speech tagger, a named entity recognizer, a section categorizer and a gene normalizer (GN). After our system is installed, the PubMed search results page is modified on the fly to categorize sections and provide additional information on the genes and gene products identified by our NLP subsystems. GN involves three main steps, candidate ID matching, false positive filtering and disambiguation, which are highly dependent on each other. We propose a joint model using a Markov logic network (MLN) to model the dependencies found in GN. The experimental results show that our joint model outperforms a baseline system that executes the three steps separately. The developed system is available at https://sites.google.com/site/pubmedannotationtool4ijcai/home.
Robust Online Optimization of Reward-Uncertain MDPs
Regan, Kevin (University of Toronto) | Boutilier, Craig (University of Toronto)
Imprecise-reward Markov decision processes (IRMDPs) are MDPs in which the reward function is only partially specified (e.g., by some elicitation process). Recent work using minimax regret to solve IRMDPs has shown, despite their theoretical intractability, how the set of policies that are nondominated w.r.t. reward uncertainty can be exploited to accelerate regret computation. However, the number of nondominated policies is generally so large as to undermine this leverage. In this paper, we show how the quality of the approximation can be improved online by pruning/adding nondominated policies during reward elicitation, while maintaining computational tractability. Drawing insights from the POMDP literature, we also develop a new anytime algorithm for constructing the set of nondominated policies with provable (anytime) error bounds. These bounds can be exploited to great effect in our online approximation scheme.
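As a rough illustration of the minimax regret criterion mentioned above, the sketch below assumes a simplified setting: a finite list of candidate policies and a finite list of reward scenarios, with the value of each policy under each scenario already computed. The paper works with partially specified reward functions and a set of nondominated policies, so the value table, the numbers in it, and the function names here are hypothetical.

```python
def max_regret(values, i):
    """Worst-case loss of policy i relative to the best policy in each reward scenario.

    values[p][s] is the (assumed precomputed) value of policy p under scenario s.
    """
    n_scenarios = len(values[0])
    return max(
        max(row[s] for row in values) - values[i][s]
        for s in range(n_scenarios)
    )

def minimax_regret(values):
    """Return (index of the policy with the smallest max regret, that regret)."""
    regrets = [max_regret(values, i) for i in range(len(values))]
    best = min(range(len(values)), key=regrets.__getitem__)
    return best, regrets[best]

# Hypothetical value table: rows are policies, columns are reward scenarios.
values = [
    [10.0, 2.0, 5.0],  # policy 0: strong in scenario 0, weak in scenario 1
    [6.0, 6.0, 6.0],   # policy 1: moderate everywhere
    [3.0, 9.0, 4.0],   # policy 2: strong in scenario 1, weak in scenario 0
]
best_policy, regret = minimax_regret(values)
print(best_policy, regret)
```

In this toy table the robust choice is the policy that is never best but never far from best. Restricting the inner maximization to a smaller set of candidate policies gives a cheaper but approximate regret value, which is the kind of trade-off the abstract describes when it prunes and adds nondominated policies online.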
Eliciting Additive Reward Functions for Markov Decision Processes
Regan, Kevin (University of Toronto) | Boutilier, Craig (University of Toronto)
Specifying the reward function of a Markov decision process (MDP) can be demanding, requiring human assessment of the precise quality of, and tradeoffs among, various states and actions. However, reward functions often possess considerable structure which can be leveraged to streamline their specification. We develop new, decision-theoretically sound heuristics for eliciting rewards for factored MDPs whose reward functions exhibit additive independence. Since we can often find good policies without complete reward specification, we also develop new (exact and approximate) algorithms for robust optimization of imprecise-reward MDPs with such additive reward. Our methods are evaluated in two domains: autonomic computing and assistive technology.
Scalable Multiagent Planning Using Probabilistic Inference
Kumar, Akshat (University of Massachusetts Amherst) | Zilberstein, Shlomo (University of Massachusetts Amherst) | Toussaint, Marc (FU Berlin)
Multiagent planning has seen much progress with the development of formal models such as Dec-POMDPs. However, the complexity of these models—NEXP-Complete even for two agents—has limited scalability. We identify certain mild conditions that are sufficient to make multiagent planning amenable to a scalable approximation w.r.t. the number of agents. This is achieved by constructing a graphical model in which likelihood maximization is equivalent to plan optimization. Using the Expectation-Maximization framework for likelihood maximization, we show that the necessary inference can be decomposed into processes that often involve a small subset of agents, thereby facilitating scalability. We derive a global update rule that combines these local inferences to monotonically increase the overall solution quality. Experiments on a large multiagent planning benchmark confirm the benefits of the new approach in terms of runtime and scalability.