AITopics

Knowledge transfer has been suggested as a useful approach for solving large Markov Decision Processes. The main idea is to compute a decision-making policy in one environment and use it in a different environment, provided the two are "close enough". In this paper, we use bisimulation-style metrics (Ferns et al., 2004) to guide knowledge transfer. We propose algorithms that decide what actions to transfer from the policy computed on a small MDP task to a large task, given the bisimulation distance between states in the two tasks. We demonstrate the inherent "pessimism" of bisimulation metrics and present variants of this metric aimed at overcoming this pessimism, leading to improved action transfer. We also show that using this approach for transferring temporally extended actions (Sutton et al., 1999) is more successful than using it exclusively with primitive actions. We present theoretical guarantees on the quality of the transferred policy, as well as promising empirical results.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country: North America > Canada > Quebec > Montreal (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)

Amato, Christopher (University of Massachusetts, Amherst) | Bonet, Blai (Universidad Simón Bolívar) | Zilberstein, Shlomo (University of Massachusetts, Amherst)

Finite-State Controllers Based on Mealy Machines for Centralized and Decentralized POMDPs

Existing controller-based approaches for centralized and decentralized POMDPs are based on automata with output known as Moore machines. In this paper, we show that several advantages can be gained by utilizing another type of automaton, the Mealy machine. Mealy machines are more powerful than Moore machines, provide a richer structure that can be exploited by solution methods, and can be easily incorporated into current controller-based approaches. To demonstrate this, we adapted some existing controller-based algorithms to use Mealy machines and obtained results on a set of benchmark domains. The Mealy-based approach always outperformed the Moore-based approach and often outperformed the state-of-the-art algorithms for both centralized and decentralized POMDPs. These findings provide fresh and general insights for the improvement of existing algorithms and the development of new ones.
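The structural difference between the two machine types can be illustrated with a minimal sketch. It uses textbook automaton semantics, not the paper's POMDP controller formulation, and the node names, observations ("hot", "cold"), and actions ("open", "listen") are hypothetical. A Moore machine emits an action per node; a Mealy machine emits an action per (node, observation) pair, so it can express the same behavior with fewer nodes.

```python
from typing import Dict, List, Tuple

# Moore machine: the action depends only on the node that was reached.
# Two nodes are needed so the output can reflect the latest observation.
MOORE_NEXT: Dict[Tuple[str, str], str] = {
    ("n_hot", "hot"): "n_hot", ("n_hot", "cold"): "n_cold",
    ("n_cold", "hot"): "n_hot", ("n_cold", "cold"): "n_cold",
}
MOORE_ACTION: Dict[str, str] = {"n_hot": "open", "n_cold": "listen"}

# Mealy machine: the action depends on the node AND the observation.
# A single node is enough to produce the same behavior.
MEALY_NEXT: Dict[Tuple[str, str], str] = {("q", "hot"): "q", ("q", "cold"): "q"}
MEALY_ACTION: Dict[Tuple[str, str], str] = {
    ("q", "hot"): "open", ("q", "cold"): "listen",
}


def run_moore(observations: List[str], start: str = "n_cold") -> List[str]:
    """Follow each transition, then emit the action of the node reached."""
    node, actions = start, []
    for obs in observations:
        node = MOORE_NEXT[(node, obs)]
        actions.append(MOORE_ACTION[node])
    return actions


def run_mealy(observations: List[str], start: str = "q") -> List[str]:
    """Emit the action attached to the (node, observation) pair, then move."""
    node, actions = start, []
    for obs in observations:
        actions.append(MEALY_ACTION[(node, obs)])
        node = MEALY_NEXT[(node, obs)]
    return actions
```

On any observation sequence the two controllers above produce the same actions, but the Mealy version stores one node where the Moore version stores two.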

artificial intelligence, machine learning, mealy machine, (18 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
(2 more...)

Industry: Government > Regional Government > North America Government > United States Government (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)

Yu, Xiaofeng (The Chinese University of Hong Kong) | Lam, Wai (The Chinese University of Hong Kong)

Bidirectional Integration of Pipeline Models

Traditional information extraction systems adopt pipeline strategies, which are highly ineffective and suffer from several problems such as error propagation. Typically, pipeline models fail to produce highly accurate final output. On the other hand, there has been growing interest in integrated or joint models, which explore mutual benefits and perform multiple subtasks simultaneously to avoid problems caused by pipeline models. However, building such systems usually increases computational complexity and requires considerable engineering. This paper presents a general, strongly coupled, and bidirectional architecture based on discriminatively trained factor graphs for information extraction. First, we introduce joint factors connecting variables of relevant subtasks to capture dependencies and interactions between them. We then propose a strong bidirectional MCMC sampling inference algorithm which allows information to flow in both directions to find the approximate MAP solution for all subtasks. Extensive experiments on entity identification and relation extraction using real-world data illustrate the promise of our approach.

artificial intelligence, machine learning, natural language, (20 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Elkind, Edith (Nanyang Technological University) | Faliszewski, Piotr (AGH University of Science and Technology) | Slinko, Arkadii (University of Auckland)

Good Rationalizations of Voting Rules

We explore the relationship between two approaches to rationalizing voting rules: the maximum likelihood estimation (MLE) framework originally suggested by Condorcet and recently studied by Conitzer, Rognlie, and Xia, and the distance rationalizability (DR) framework of Elkind, Faliszewski, and Slinko. The former views voting as an attempt to reconstruct the correct ordering of the candidates given noisy estimates (i.e., votes), while the latter explains voting as a search for the nearest consensus outcome. We provide conditions under which an MLE interpretation of a voting rule coincides with its DR interpretation, and classify a number of classic voting rules, such as Kemeny, Plurality, Borda and Single Transferable Vote (STV), according to how well they fit each of these frameworks. The classification we obtain is more precise than the ones that result from using MLE or DR alone: indeed, we show that the MLE approach can be used to guide our search for a more refined notion of distance rationalizability and vice versa.
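As a rough illustration of the "nearest outcome" idea, the Kemeny rule is commonly defined as choosing the ranking that minimizes the total number of pairwise disagreements (Kendall tau distance) with the votes. The sketch below is a brute-force version of that standard definition, not the paper's framework or proofs; the function names and the three-voter profile are made up for the example, and the enumeration is only practical for a handful of candidates.

```python
from itertools import combinations, permutations
from typing import Sequence, Tuple

Ranking = Tuple[str, ...]


def kendall_tau(r1: Ranking, r2: Ranking) -> int:
    """Count candidate pairs that the two rankings order differently."""
    pos1 = {c: i for i, c in enumerate(r1)}
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum(
        1
        for a, b in combinations(r1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )


def kemeny_ranking(votes: Sequence[Ranking]) -> Ranking:
    """Return a ranking with minimum total Kendall tau distance to the votes.

    Ties are broken by enumeration order over the sorted candidate names.
    """
    candidates = sorted(votes[0])
    return min(
        permutations(candidates),
        key=lambda r: sum(kendall_tau(r, v) for v in votes),
    )


# Hypothetical profile: two voters rank a > b > c, one ranks b > c > a.
VOTES = [("a", "b", "c"), ("a", "b", "c"), ("b", "c", "a")]
print(kemeny_ranking(VOTES))  # ('a', 'b', 'c'), total distance 2
```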

artificial intelligence, bayesian inference, machine learning, (17 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Europe > Poland > Lesser Poland Province > Kraków (0.04)
Asia > Singapore (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Zhong, Guoqiang (Chinese Academy of Sciences) | Li, Wu-Jun (The Hong Kong University of Science and Technology) | Yeung, Dit-Yan (The Hong Kong University of Science and Technology) | Hou, Xinwen (Chinese Academy of Sciences) | Liu, Cheng-Lin (Chinese Academy of Sciences)

Gaussian Process Latent Random Field

In this paper, we propose a novel supervised extension of GPLVM, called Gaussian process latent random field (GPLRF), by enforcing the latent variables to be a Gaussian Markov random field with respect to a graph constructed from the supervisory information.

artificial intelligence, gplrf, machine learning, (18 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.47)

Bayesian Policy Search for Multi-Agent Role Discovery

Wilson, Aaron (Oregon State University) | Fern, Alan (Oregon State University) | Tadepalli, Prasad (Oregon State University)

Bayesian inference is an appealing approach for leveraging prior knowledge in reinforcement learning (RL). In this paper we describe an algorithm for discovering different classes of roles for agents via Bayesian inference. In particular, we develop a Bayesian policy search approach for Multi-Agent RL (MARL), which is model-free and allows for priors on policy parameters. We present a novel optimization algorithm based on hybrid MCMC, which leverages both the prior and gradient information estimated from trajectories. Our experiments in a complex real-time strategy game demonstrate the effective discovery of roles from supervised trajectories, the use of discovered roles for successful transfer to similar tasks, and the discovery of roles through reinforcement learning.

artificial intelligence, bayesian inference, machine learning, (16 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Oregon (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Integrating Sample-Based Planning and Model-Based Reinforcement Learning

Walsh, Thomas J. (Rutgers University) | Goschin, Sergiu (Rutgers University) | Littman, Michael L. (Rutgers University)

Recent advancements in model-based reinforcement learning have shown that the dynamics of many structured domains (e.g., DBNs) can be learned with tractable sample complexity, despite their exponentially large state spaces. Unfortunately, these algorithms all require access to a planner that computes a near-optimal policy, and while many traditional MDP algorithms make this guarantee, their computation time grows with the number of states. We show how to replace these over-matched planners with a class of sample-based planners, whose computation time is independent of the number of states, without sacrificing the sample-efficiency guarantees of the overall learning algorithms. To do so, we define sufficient criteria for a sample-based planner to be used in such a learning system and analyze two popular sample-based approaches from the literature. We also introduce our own sample-based planner, which combines the strategies from these algorithms and still meets the criteria for integration into our learning system. In doing so, we define the first complete RL solution for compactly represented (exponentially sized) state spaces with efficiently learnable dynamics that is sample efficient and whose computation time does not grow rapidly with the number of states.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.94)

Veness, Joel (University of New South Wales and NICTA) | Ng, Kee Siong (Medicare Australia and Australian National University) | Hutter, Marcus (Australian National University and NICTA) | Silver, David (University College London)

Reinforcement Learning via AIXI Approximation

This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. This approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a Monte Carlo Tree Search algorithm along with an agent-specific extension of the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a number of stochastic, unknown, and partially observable domains.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country: Oceania > Australia (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

Constrained Coclustering for Textual Documents

In this paper, we present a constrained co-clustering approach for clustering textual documents. Our approach combines the benefits of information-theoretic co-clustering and constrained clustering. We use a two-sided hidden Markov random field (HMRF) to model both the document and word constraints. We also develop an alternating expectation maximization (EM) algorithm to optimize the constrained co-clustering model. We have conducted two sets of experiments on a benchmark data set: (1) using human-provided category labels to derive document and word constraints for semi-supervised document clustering, and (2) using automatically extracted named entities to derive document constraints for unsupervised document clustering. Compared to several representative constrained clustering and co-clustering approaches, our approach is shown to be more effective for high-dimensional, sparse text data.

constraint, machine learning, natural language, (20 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)

Bayesian Matrix Factorization with Side Information and Dirichlet Process Mixtures

Porteous, Ian (University of California Irvine) | Asuncion, Arthur (University of California Irvine) | Welling, Max (University of California Irvine)

Matrix factorization is a fundamental technique in machine learning that is applicable to collaborative filtering, information retrieval and many other areas. In collaborative filtering and many other tasks, the objective is to fill in missing elements of a sparse data matrix. One of the biggest challenges in this case is filling in a column or row of the matrix with very few observations. In this paper we introduce a Bayesian matrix factorization model that performs regression against side information known about the data in addition to the observations. The side information helps by adding observed entries to the factored matrices. We also introduce a nonparametric mixture model for the prior of the rows and columns of the factored matrices that gives a different regularization for each latent class. Besides providing a richer prior, the posterior distribution of mixture assignments reveals the latent classes. Using Gibbs sampling for inference, we apply our model to the Netflix Prize problem of predicting movie ratings given an incomplete user-movie ratings matrix. Incorporating rating information with gathered metadata information, our Bayesian approach outperforms other matrix factorization techniques even when using fewer dimensions.
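For readers unfamiliar with the underlying task, the sketch below shows plain regularized matrix factorization trained by stochastic gradient descent on the observed entries only. It is deliberately not the Bayesian model described above: it has no side information, no Dirichlet process mixture, and no Gibbs sampling. The ratings, function names, and hyperparameters are all assumptions chosen for illustration.

```python
import random
from typing import Dict, List, Tuple

Ratings = Dict[Tuple[int, int], float]  # (row, column) -> observed value

# Hypothetical 3-user x 3-item matrix with three entries missing.
RATINGS: Ratings = {
    (0, 0): 5.0, (0, 1): 4.0,
    (1, 0): 4.0, (1, 2): 1.0,
    (2, 1): 2.0, (2, 2): 5.0,
}


def predict(U: List[List[float]], V: List[List[float]], i: int, j: int) -> float:
    """Predicted entry (i, j) is the dot product of row and column factors."""
    return sum(u * v for u, v in zip(U[i], V[j]))


def factorize(ratings: Ratings, n_rows: int, n_cols: int, rank: int = 2,
              lr: float = 0.01, reg: float = 0.02, epochs: int = 500,
              seed: int = 0) -> Tuple[List[List[float]], List[List[float]]]:
    """Fit low-rank factors U, V to the observed entries with SGD."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for (i, j), r in ratings.items():
            err = r - predict(U, V, i, j)
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                # Gradient step on squared error with L2 regularization.
                U[i][k] += lr * (err * v - reg * u)
                V[j][k] += lr * (err * u - reg * v)
    return U, V


def squared_error(ratings: Ratings, U: List[List[float]],
                  V: List[List[float]]) -> float:
    """Sum of squared errors over the observed entries."""
    return sum((r - predict(U, V, i, j)) ** 2 for (i, j), r in ratings.items())
```

After fitting, `predict(U, V, i, j)` can be called on an unobserved pair such as `(0, 2)` to fill in a missing entry. With so few observations per row such estimates are weakly constrained, which is the difficulty the abstract's side information is meant to address.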

artificial intelligence, information, machine learning, (16 more...)

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country:

North America > United States > California > Orange County > Irvine (0.14)
South America > Paraguay > Asunción > Asunción (0.05)
North America > United States > New York > New York County > New York City (0.05)
(2 more...)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Media > Television (0.91)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)