Undirected Networks
Structured Parameter Elicitation
Ko, Li Ling (National University of Singapore) | Hsu, David (National University of Singapore) | Lee, Wee Sun (National University of Singapore) | Ong, Sylvie C. W. (National University of Singapore)
The behavior of a complex system often depends on parameters whose values are unknown in advance. To operate effectively, an autonomous agent must actively gather information on the parameter values while progressing towards its goal. We call this problem parameter elicitation. Partially observable Markov decision processes (POMDPs) provide a principled framework for such uncertainty planning tasks, but they suffer from high computational complexity. However, POMDPs for parameter elicitation often possess special structural properties, specifically, factorization and symmetry. This work identifies these properties and exploits them for efficient solution through a factored belief representation. The experimental results show that our new POMDP solvers outperform SARSOP and MOMDP, two of the fastest general-purpose POMDP solvers available, and can handle significantly larger problems.
PUMA: Planning Under Uncertainty with Macro-Actions
He, Ruijie (Massachusetts Institute of Technology) | Brunskill, Emma (University of California, Berkeley) | Roy, Nicholas (Massachusetts Institute of Technology)
Planning in large, partially observable domains is challenging, especially when a long-horizon lookahead is necessary to obtain a good policy. Traditional POMDP planners that plan a different potential action for each future observation can be prohibitively expensive when planning many steps ahead. An efficient solution for planning far into the future in fully observable domains is to use temporally-extended sequences of actions, or "macro-actions." In this paper, we present a POMDP algorithm for planning under uncertainty with macro-actions (PUMA) that automatically constructs and evaluates open-loop macro-actions within forward-search planning, where the planner branches on observations only at the end of each macro-action. Additionally, we show how to incrementally refine the plan over time, resulting in an anytime algorithm that provably converges to an epsilon-optimal policy. In experiments on several large POMDP problems which require a long-horizon lookahead, PUMA outperforms existing state-of-the-art solvers.
An Analytic Characterization of Model Minimization in Factored Markov Decision Processes
Guo, Wenyuan (National University of Singapore) | Leong, Tze-Yun (National University of Singapore)
Model minimization in Factored Markov Decision Processes (FMDPs) is concerned with finding the most compact partition of the state space such that all states in the same block are action-equivalent. This is an important problem because it can potentially transform a large FMDP into an equivalent but much smaller one, whose solution can be readily used to solve the original model. Previous model minimization algorithms are iterative in nature, making opaque the relationship between the input model and the output partition. We demonstrate that given a set of well-defined concepts and operations on partitions, we can express the model minimization problem in an analytic fashion. The theoretical results developed can be readily applied to solving problems such as estimating the size of the minimum partition, refining existing algorithms, and so on.
Using Bisimulation for Policy Transfer in MDPs
Castro, Pablo Samuel (McGill University) | Precup, Doina (McGill University)
Knowledge transfer has been suggested as a useful approach for solving large Markov Decision Processes. The main idea is to compute a decision-making policy in one environment and use it in a different environment, provided the two are “close enough”. In this paper, we use bisimulation-style metrics (Ferns et al., 2004) to guide knowledge transfer. We propose algorithms that decide what actions to transfer from the policy computed on a small MDP task to a large task, given the bisimulation distance between states in the two tasks. We demonstrate the inherent “pessimism” of bisimulation metrics and present variants of this metric aimed to overcome this pessimism, leading to improved action transfer. We also show that using this approach for transferring temporally extended actions (Sutton et al., 1999) is more successful than using it exclusively with primitive actions. We present theoretical guarantees on the quality of the transferred policy, as well as promising empirical results.
Finite-State Controllers Based on Mealy Machines for Centralized and Decentralized POMDPs
Amato, Christopher (University of Massachusetts, Amherst) | Bonet, Blai (Universidad Simón Bolívar) | Zilberstein, Shlomo (University of Massachusetts, Amherst)
Existing controller-based approaches for centralized and decentralized POMDPs are based on automata with output known as Moore machines. In this paper, we show that several advantages can be gained by utilizing another type of automata, the Mealy machine. Mealy machines are more powerful than Moore machines, provide a richer structure that can be exploited by solution methods, and can be easily incorporated into current controller-based approaches. To demonstrate this, we adapted some existing controller-based algorithms to use Mealy machines and obtained results on a set of benchmark domains. The Mealy-based approach always outperformed the Moore-based approach and often outperformed the state-of-the-art algorithms for both centralized and decentralized POMDPs. These findings provide fresh and general insights for the improvement of existing algorithms and the development of new ones.
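The difference between the two automaton types can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the dictionary encoding, and the toy observations and actions are all assumptions made for illustration. A Moore controller ties each action to a node, while a Mealy controller ties each action to a (node, observation) pair, so a purely reactive policy fits in a single Mealy node.

```python
class MooreController:
    """Action depends only on the current node."""

    def __init__(self, action_of, transition, start):
        self.action_of = action_of    # node -> action
        self.transition = transition  # (node, observation) -> node
        self.node = start

    def step(self, observation):
        # Move to the next node first, then emit that node's action.
        self.node = self.transition[(self.node, observation)]
        return self.action_of[self.node]


class MealyController:
    """Action depends on the current node and the latest observation."""

    def __init__(self, action_of, transition, start):
        self.action_of = action_of    # (node, observation) -> action
        self.transition = transition  # (node, observation) -> node
        self.node = start

    def step(self, observation):
        action = self.action_of[(self.node, observation)]
        self.node = self.transition[(self.node, observation)]
        return action


# Hypothetical reactive policy: open the door opposite to the sound heard.
observations = ["hear_left", "hear_right", "hear_right", "hear_left"]

# One Mealy node is enough, because the observation selects the action.
mealy = MealyController(
    action_of={("q0", "hear_left"): "open_right",
               ("q0", "hear_right"): "open_left"},
    transition={("q0", "hear_left"): "q0", ("q0", "hear_right"): "q0"},
    start="q0",
)

# The Moore version needs one node per distinct action.
moore = MooreController(
    action_of={"a": "open_right", "b": "open_left"},
    transition={(n, "hear_left"): "a" for n in "ab"}
               | {(n, "hear_right"): "b" for n in "ab"},
    start="a",
)

mealy_actions = [mealy.step(o) for o in observations]
moore_actions = [moore.step(o) for o in observations]
```

In this toy case both controllers produce the same action sequence, but the Mealy controller does so with one node instead of two. The sketch ignores how the first action is chosen before any observation arrives, which a real controller would have to handle.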
Bidirectional Integration of Pipeline Models
Yu, Xiaofeng (The Chinese University of Hong Kong) | Lam, Wai (The Chinese University of Hong Kong)
Traditional information extraction systems adopt pipeline strategies, which are highly ineffective and suffer from several problems such as error propagation. Typically, pipeline models fail to produce highly-accurate final output. On the other hand, there has been growing interest in integrated or joint models which explore mutual benefits and perform multiple subtasks simultaneously to avoid problems caused by pipeline models. However, building such systems usually increases computational complexity and requires considerable engineering. This paper presents a general, strongly-coupled, and bidirectional architecture based on discriminatively trained factor graphs for information extraction. First we introduce joint factors connecting variables of relevant subtasks to capture dependencies and interactions between them. We then propose a strong bidirectional MCMC sampling inference algorithm which allows information to flow in both directions to find the approximate MAP solution for all subtasks. Extensive experiments on entity identification and relation extraction using real-world data illustrate the promise of our approach.
Constrained Coclustering for Textual Documents
Song, Yangqiu (IBM Research - China) | Pan, Shimei (IBM T. J. Watson Research Center) | Liu, Shixia (IBM Research - China) | Wei, Furu (IBM Research - China) | Zhou, Michelle X. (IBM Research - Almaden Center) | Qian, Weihong (IBM Research - China)
In this paper, we present a constrained co-clustering approach for clustering textual documents. Our approach combines the benefits of information-theoretic co-clustering and constrained clustering. We use a two-sided hidden Markov random field (HMRF) to model both the document and word constraints. We also develop an alternating expectation maximization (EM) algorithm to optimize the constrained co-clustering model. We have conducted two sets of experiments on a benchmark data set: (1) using human-provided category labels to derive document and word constraints for semi-supervised document clustering, and (2) using automatically extracted named entities to derive document constraints for unsupervised document clustering. Compared to several representative constrained clustering and co-clustering approaches, our approach is shown to be more effective for high-dimensional, sparse text data.
Structure Learning for Markov Logic Networks with Many Descriptive Attributes
Khosravi, Hassan (Simon Fraser University) | Schulte, Oliver (Simon Fraser University) | Man, Tong (Simon Fraser University) | Xu, Xiaoyuan (Simon Fraser University) | Bina, Bahareh (Simon Fraser University)
Many machine learning applications that involve relational databases incorporate first-order logic and probability. Markov Logic Networks (MLNs) are a prominent statistical relational model that consists of weighted first-order clauses. Many of the current state-of-the-art algorithms for learning MLNs have focused on relatively small datasets with few descriptive attributes, where predicates are mostly binary and the main task is usually prediction of links between entities. This paper addresses what is in a sense a complementary problem: learning the structure of an MLN that models the distribution of discrete descriptive attributes on medium to large datasets, given the links between entities in a relational database. Descriptive attributes are usually nonbinary and can be very informative, but they increase the search space of possible candidate clauses. We present an efficient new algorithm for learning a directed relational model (parametrized Bayes net), which produces an MLN structure via a standard moralization procedure for converting directed models to undirected models. Learning MLN structure in this way is 200-1000 times faster and scores substantially higher in predictive accuracy than benchmark algorithms on three relational databases.
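The moralization step mentioned above can be sketched on a plain directed graph. This is a minimal sketch of standard graph moralization only: it is not the authors' relational algorithm, and the function name, input encoding, and toy network are assumptions. Moralization connects every pair of parents that share a child and then drops edge directions.

```python
from itertools import combinations


def moralize(parents):
    """Return the undirected edge set of the moral graph.

    `parents` maps each node to a list of its parents in the directed model.
    Each undirected edge is represented as a frozenset of two node names.
    """
    edges = set()
    for child, node_parents in parents.items():
        # Keep every directed edge, dropping its direction.
        for parent in node_parents:
            edges.add(frozenset((parent, child)))
        # "Marry" the parents: link every pair that shares this child.
        for a, b in combinations(node_parents, 2):
            edges.add(frozenset((a, b)))
    return edges


# Hypothetical toy network: two attributes jointly influence a third.
toy_parents = {
    "intelligence": [],
    "difficulty": [],
    "grade": ["intelligence", "difficulty"],
}

moral_edges = moralize(toy_parents)
```

Here the two parents of "grade" become linked to each other, so the moral graph has three undirected edges where the directed model had two.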
Latent Variable Model for Learning in Pairwise Markov Networks
Amizadeh, Saeed (University of Pittsburgh) | Hauskrecht, Milos (University of Pittsburgh)
Pairwise Markov Networks (PMNs) are an important class of Markov networks which, due to their simplicity, are widely used in applications such as image analysis, bioinformatics, and sensor networks. However, learning Markov networks from data is a challenging task: there are many possible structures to consider, and each of these structures comes with its own parameters, making it easy to overfit the model with limited data. To deal with this problem, recent learning methods build upon L1 regularization to express a bias towards sparse network structures. In this paper, we propose a new and more flexible framework that lets us bias the structure and that can, for example, encode a preference for networks with certain local substructures which as a whole exhibit some special global structure. We experiment with and show the benefit of our framework on two types of problems: learning of modular networks and learning of traffic network models.