AITopics

Computer Go presents a challenging problem for machine learning agents. With the number of possible board states estimated to be larger than the number of hydrogen atoms in the universe, learning effective policies or board evaluation functions is extremely difficult. In this paper we describe Cortigo, a system that efficiently and autonomously learns useful generalizations for large state-space classification problems such as Go. Cortigo uses a hierarchical generative model loosely related to the human visual cortex to recognize Go board positions well enough to suggest promising next moves. We begin by briefly describing and providing motivation for research in the computer Go domain. We describe Cortigo’s ability to learn predictive models based on large subsets of the Go board and demonstrate how using Cortigo’s learned models as additive knowledge in a state-of-the-art computer Go player (Fuego) significantly improves its playing strength.

artificial intelligence, board position, machine learning, (17 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Oklahoma (0.04)
North America > Canada > Alberta (0.04)

Industry: Leisure & Entertainment > Games > Go (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Games > Go (1.00)

Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs

Zhang, Chongjie (University of Massachusetts Amherst) | Lesser, Victor (University of Massachusetts Amherst)

In many multi-agent applications such as distributed sensor nets, a network of agents acts collaboratively under uncertainty and with local interactions. The Networked Distributed POMDP (ND-POMDP) provides a framework to model such cooperative multi-agent decision making. Existing work on ND-POMDPs has focused on offline techniques that require accurate models, which are usually costly to obtain in practice. This paper presents a model-free, scalable learning approach that synthesizes multi-agent reinforcement learning (MARL) and distributed constraint optimization (DCOP). By exploiting structured interaction in ND-POMDPs, our approach distributes the learning of the joint policy and employs DCOP techniques to coordinate distributed learning to ensure the global learning performance. Our approach can learn a globally optimal policy for ND-POMDPs with a property called groupwise observability. Experimental results show that, with communication during learning and execution, our approach significantly outperforms the nearly optimal non-communication policies computed offline.

machine learning, nd-pomdp, reinforcement learning, (17 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Fast Newton-CG Method for Batch Learning of Conditional Random Fields

Tsuboi, Yuta (IBM Research - Tokyo) | Unno, Yuya (Preferred Infrastructure, Inc.) | Kashima, Hisashi (The University of Tokyo) | Okazaki, Naoaki (Tohoku University)

We propose a fast batch learning method for linear-chain Conditional Random Fields (CRFs) based on Newton-CG methods. Newton-CG methods are a variant of Newton's method for high-dimensional problems. They require only Hessian-vector products instead of the full Hessian matrices. To speed up Newton-CG methods for CRF learning, we derive a novel dynamic programming procedure for the Hessian-vector products of the CRF objective function. The proposed procedure can reuse the byproducts of the time-consuming gradient computation for the Hessian-vector products to drastically reduce the total computation time of the Newton-CG methods. In experiments with tasks in natural language processing, the proposed method outperforms a conventional quasi-Newton method. Remarkably, the proposed method is competitive with online learning algorithms that are fast but unstable.
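As a rough illustration of the Hessian-vector idea in this abstract, the following Python sketch runs Newton-CG on a toy L2-regularized logistic regression rather than a CRF. It is not the authors' dynamic programming procedure; the function names, the toy data, and the regularization weight `lam` are assumptions made for illustration. The point it shows is that the conjugate gradient solver only ever calls a function that returns H v, so the full Hessian is never built.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient(w, X, y, lam):
    # Gradient of the L2-regularized logistic loss; labels y are 0 or 1.
    g = [lam * wj for wj in w]
    for xi, yi in zip(X, y):
        r = sigmoid(dot(w, xi)) - yi
        for j, xij in enumerate(xi):
            g[j] += r * xij
    return g

def hessian_vector(w, X, lam, v):
    # H v = X^T D X v + lam * v with D = diag(p * (1 - p)); H is never formed.
    hv = [lam * vj for vj in v]
    for xi in X:
        p = sigmoid(dot(w, xi))
        s = p * (1.0 - p) * dot(xi, v)
        for j, xij in enumerate(xi):
            hv[j] += s * xij
    return hv

def conjugate_gradient(hvp, b, tol=1e-10, max_iter=50):
    # Solve H d = b using only Hessian-vector products hvp(v).
    d = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs < tol:
            break
        hp = hvp(p)
        alpha = rs / dot(p, hp)
        d = [di + alpha * pi for di, pi in zip(d, p)]
        r = [ri - alpha * hpi for ri, hpi in zip(r, hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return d

def newton_cg(X, y, lam=1.0, steps=10):
    # Plain Newton steps (no line search), each solved approximately by CG.
    w = [0.0] * len(X[0])
    for _ in range(steps):
        g = gradient(w, X, y, lam)
        d = conjugate_gradient(lambda v: hessian_vector(w, X, lam, v),
                               [-gj for gj in g])
        w = [wj + dj for wj, dj in zip(w, d)]
    return w

# Hypothetical toy data: a bias feature plus one real-valued feature.
X = [[1.0, 0.5], [1.0, -1.5], [1.0, 2.0], [1.0, -0.3]]
y = [1, 0, 1, 0]
w = newton_cg(X, y, lam=1.0)
g_final = gradient(w, X, y, 1.0)
grad_norm = math.sqrt(dot(g_final, g_final))
```

On this small, strongly regularized problem the plain Newton step is enough; a practical implementation would normally add a line search or trust region.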

artificial intelligence, machine learning, natural language, (19 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)
South America > Argentina (0.04)
Asia > Japan > Honshū > Tōhoku > Miyagi Prefecture > Sendai (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Markov Logic Sets: Towards Lifted Information Retrieval Using PageRank and Label Propagation

Neumann, Marion (Fraunhofer IAIS) | Ahmadi, Babak (Fraunhofer IAIS) | Kersting, Kristian (Fraunhofer IAIS)

Inspired by “Google™ Sets” and Bayesian sets, we consider the problem of retrieving complex objects and relations among them, i.e., ground atoms from a logical concept, given a query consisting of a few atoms from that concept. We formulate this as a within-network relational learning problem using only a few labels and describe an algorithm that ranks atoms using a score based on random walks with restart (RWR): the probability that a random surfer hits an atom starting from the query atoms. Specifically, we compute an initial ranking using personalized PageRank. Then, we find paths of atoms that are connected via their arguments and variablize the ground atoms in each path in order to create features for the query. These features are used to re-personalize the original RWR and to finally compute the set completion, based on Label Propagation. Moreover, we exploit that RWR techniques can naturally be lifted and show that lifted inference for label propagation is possible. We evaluate our algorithm on a real-world relational dataset by finding completions of sets of objects describing the Roman city of Pompeii. We compare to Bayesian sets and show that our approach gives very reasonable set completions.
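The random-walk-with-restart score mentioned in this abstract can be sketched with a short power iteration. This is a generic personalized PageRank on a small hypothetical graph, not the authors' lifted algorithm; the node names, the `restart` probability, and the handling of dangling nodes are assumptions made for illustration.

```python
def random_walk_with_restart(adj, query, restart=0.15, iters=100):
    # adj maps each node to a list of its neighbours; query is a set of
    # nodes that the walker jumps back to with probability `restart`.
    nodes = sorted(adj)
    restart_dist = {n: (1.0 / len(query) if n in query else 0.0) for n in nodes}
    score = dict(restart_dist)
    for _ in range(iters):
        nxt = {n: restart * restart_dist[n] for n in nodes}
        for n in nodes:
            out = adj[n]
            if not out:
                # Dangling node: send its mass back to the query nodes.
                for q in query:
                    nxt[q] += (1.0 - restart) * score[n] / len(query)
                continue
            share = (1.0 - restart) * score[n] / len(out)
            for m in out:
                nxt[m] += share
        score = nxt
    return score

# Hypothetical undirected graph of five atoms, written as adjacency lists.
adj = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
scores = random_walk_with_restart(adj, query={"a"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Atoms close to the query accumulate more probability mass than distant ones, which is what makes the score usable as an initial ranking.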

information retrieval, logic & formal reasoning, machine learning, (19 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Europe > Italy (0.04)
Europe > Germany (0.04)

Industry: Education (0.34)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)

Sparse Group Restricted Boltzmann Machines

Luo, Heng (Shanghai Jiao Tong University) | Shen, Ruimin (Shanghai Jiao Tong University) | Niu, Changyong (Zhengzhou University) | Ullrich, Carsten (Shanghai Jiao Tong University)

Since learning in Boltzmann machines is typically quite slow, there is a need to restrict connections within hidden layers. However, the resulting states of hidden units exhibit statistical dependencies. Based on this observation, we propose using l1/l2 regularization upon the activation probabilities of hidden units in restricted Boltzmann machines to capture the local dependencies among hidden units. This regularization not only encourages hidden units of many groups to be inactive given observed data but also makes hidden units within a group compete with each other for modeling observed data. Thus, the l1/l2 regularization on RBMs yields sparsity at both the group and the hidden unit levels. We call RBMs trained with the regularizer sparse group RBMs (SGRBMs). The proposed SGRBMs are applied to model patches of natural images, handwritten digits and OCR English letters. Then, to emphasize that SGRBMs can learn more discriminative features, we applied SGRBMs to pretrain deep networks for classification tasks. Furthermore, we illustrate that the regularizer can also be applied to deep Boltzmann machines, which leads to sparse group deep Boltzmann machines. When adapted to the MNIST data set, a two-layer sparse group Boltzmann machine achieves an error rate of 0.84%, which is, to our knowledge, the best published result on the permutation-invariant version of the MNIST task.
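A minimal sketch of an l1/l2 (group lasso style) penalty on hidden-unit activation probabilities is given below. It assumes the common form of this regularizer, namely the sum over groups of the Euclidean norm of the activations in each group; the grouping, the probability values, and the function name are hypothetical and are not taken from the paper.

```python
import math

def group_l1_l2_penalty(probs, groups):
    # l1 norm across groups of the l2 norm within each group.
    # probs: activation probabilities of hidden units, indexed by unit.
    # groups: list of lists of unit indices (assumed non-overlapping).
    return sum(math.sqrt(sum(probs[i] ** 2 for i in g)) for g in groups)

groups = [[0, 1], [2, 3]]

# The same two activation values, placed in one group or spread over two.
concentrated = group_l1_l2_penalty([0.6, 0.8, 0.0, 0.0], groups)
spread = group_l1_l2_penalty([0.6, 0.0, 0.8, 0.0], groups)
```

With these values the penalty is about 1.0 when both active units fall in one group and about 1.4 when they fall in different groups, so minimizing it favors leaving whole groups inactive.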

artificial intelligence, machine learning, sgrbm, (16 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.04)
Asia > China > Henan Province > Zhengzhou (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Mean Field Inference in Dependency Networks: An Empirical Study

Lowd, Daniel (University of Oregon) | Shamaei, Arash (University of Oregon)

Dependency networks are a compelling alternative to Bayesian networks for learning joint probability distributions from data and using them to compute probabilities. A dependency network consists of a set of conditional probability distributions, each representing the probability of a single variable given its Markov blanket. Running Gibbs sampling with these conditional distributions produces a joint distribution that can be used to answer queries, but suffers from the traditional slowness of sampling-based inference. In this paper, we observe that the mean field update equation can be applied to dependency networks, even though the conditional probability distributions may be inconsistent with each other. In experiments with learning and inference on 12 datasets, we demonstrate that mean field inference in dependency networks offers similar accuracy to Gibbs sampling but with orders of magnitude improvements in speed. Compared to Bayesian networks learned on the same data, dependency networks offer higher accuracy at greater amounts of evidence. Furthermore, mean field inference is consistently more accurate in dependency networks than in Bayesian networks learned on the same data.

artificial intelligence, dependency network, machine learning, (20 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Oregon > Lane County > Eugene (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

OASIS: Online Active Semi-Supervised Learning

Goldberg, Andrew B. (Arcode Corporation) | Zhu, Xiaojin (University of Wisconsin-Madison) | Furger, Alex (University of Wisconsin-Madison) | Xu, Jun-Ming (University of Wisconsin-Madison)

We consider a learning setting of importance to large scale machine learning: potentially unlimited data arrives sequentially, but only a small fraction of it is labeled. The learner cannot store the data; it should learn from both labeled and unlabeled data, and it may also request labels for some of the unlabeled items. This setting is frequently encountered in real-world applications and has the characteristics of online, semi-supervised, and active learning. Yet previous learning models fail to consider these characteristics jointly. We present OASIS, a Bayesian model for this learning setting. The main contributions of the model include the novel integration of a semi-supervised likelihood function, a sequential Monte Carlo scheme for efficient online Bayesian updating, and a posterior-reduction criterion for active learning. Encouraging results on both synthetic and real-world optical character recognition data demonstrate the synergy of these characteristics in OASIS.

artificial intelligence, machine learning, unlabeled data, (18 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > Maryland > Montgomery County > Bethesda (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Instructional Material (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)

Unsupervised Learning of Human Behaviours

Chua, Sook-Ling (Massey University) | Marsland, Stephen (Massey University) | Guesgen, Hans W. (Massey University)

Behaviour recognition is the process of inferring the behaviour of an individual from a series of observations acquired from sensors such as in a smart home. The majority of existing behaviour recognition systems are based on supervised learning algorithms, which means that training them requires a preprocessed, annotated dataset. Unfortunately, annotating a dataset is a rather tedious process and one that is prone to error. In this paper we suggest a way to identify structure in the data based on text compression and the edit distance between words, without any prior labelling. We demonstrate that by using this method we can automatically identify patterns and segment the data into patterns that correspond to human behaviours. To evaluate the effectiveness of our proposed method, we use a dataset from a smart home and compare the labels produced by our approach with the labels assigned by a human to the activities in the dataset. We find that the results are promising and show significant improvement in the recognition accuracy over Self-Organising Maps (SOMs).
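The edit distance this abstract relies on can be computed with the standard Levenshtein dynamic program. The sketch below is a generic implementation, not the authors' code, and the sensor event names in the example are hypothetical. It works on strings as well as on lists of tokens, which is convenient when each "word" is a sequence of sensor events.

```python
def edit_distance(a, b):
    # Levenshtein distance: minimum number of insertions, deletions and
    # substitutions needed to turn sequence a into sequence b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

# Character-level example.
d_words = edit_distance("kitten", "sitting")

# Token-level example with hypothetical smart-home sensor events.
d_events = edit_distance(["kettle", "cup", "fridge"], ["kettle", "fridge"])
```

Two event sequences that differ by a single missed or extra sensor firing end up at distance 1, so they can be treated as variants of the same pattern.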

artificial intelligence, edit distance, machine learning, (18 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Oceania > New Zealand > North Island > Manawatū-Whanganui > Palmerston North (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > Germany > Berlin (0.04)

Industry: Information Technology > Smart Houses & Appliances (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.65)

A Nonparametric Bayesian Model of Multi-Level Category Learning

Canini, Kevin Robert (University of California, Berkeley) | Griffiths, Thomas L. (University of California, Berkeley)

Categories are often organized into hierarchical taxonomies, that is, tree structures where each node represents a labeled category, and a node's parent and children are, respectively, the category's supertype and subtypes. A natural question is whether it is possible to reconstruct category taxonomies in cases where we are not given explicit information about how categories are related to each other, but only a sample of observations of the members of each category. In this paper, we introduce a nonparametric Bayesian model of multi-level category learning, an extension of the hierarchical Dirichlet process (HDP) that we call the tree-HDP. We demonstrate the ability of the tree-HDP to reconstruct simulated datasets of artificial taxonomies, and show that it produces similar performance to human learners on a taxonomy inference task.

artificial intelligence, category, machine learning, (17 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.85)

An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems

Boots, Byron (Carnegie Mellon University) | Gordon, Geoffrey J. (Carnegie Mellon University)

Recently, a number of researchers have proposed spectral algorithms for learning models of dynamical systems — for example, Hidden Markov Models (HMMs), Partially Observable Markov Decision Processes (POMDPs), and Transformed Predictive State Representations (TPSRs). These algorithms are attractive since they are statistically consistent and not subject to local optima. However, they are batch methods: they need to store their entire training data set in memory at once and operate on it as a large matrix, and so they cannot scale to extremely large data sets (either many examples or many features per example). In turn, this restriction limits their ability to learn accurate models of complex systems. To overcome these limitations, we propose a new online spectral algorithm, which uses tricks such as incremental Singular Value Decomposition (SVD) and random projections to scale to much larger data sets and more complex systems than previous methods. We demonstrate the new method on an inertial measurement prediction task and a high-bandwidth video mapping task and we illustrate desirable behaviors such as "closing the loop," where the latent state representation changes suddenly as the learner recognizes that it has returned to a previously known place.

algorithm, artificial intelligence, machine learning, (16 more...)

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)