AITopics | Undirected Networks

Undirected Networks

A Monte-Carlo AIXI Approximation

Veness, J., Ng, K.S., Hutter, M., Uther, W., Silver, D.

Journal of Artificial Intelligence Research, Jan-24-2011

This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. Our approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a new Monte-Carlo Tree Search algorithm along with an agent-specific extension to the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a variety of stochastic and partially observable domains. We conclude by proposing a number of directions for future research.

artificial intelligence, machine learning, reinforcement learning, (22 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3125

AI Access Foundation

10685

Country:

Oceania > Australia (0.28)
Europe (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report (0.45)

Industry:

Leisure & Entertainment > Games (1.00)
Energy > Oil & Gas > Upstream (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
(3 more...)

Inference of global clusters from locally distributed data

Nguyen, XuanLong

arXiv.org Machine Learning, Jan-21-2011

We consider the problem of analyzing the heterogeneity of clustering distributions for multiple groups of observed data, each of which is indexed by a covariate value, and inferring global clusters arising from observations aggregated over the covariate domain. We propose a novel Bayesian nonparametric method reposing on the formalism of spatial modeling and a nested hierarchy of Dirichlet processes. We provide an analysis of the model properties, relating and contrasting the notions of local and global clusters. We also provide an efficient inference algorithm, and demonstrate the utility of our method in several data examples, including the problem of object tracking and a global clustering analysis of functional data where the functional identity information is not available.

artificial intelligence, local cluster, machine learning, (19 more...)

arXiv.org Machine Learning

1001.0597

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Michigan (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Reports of the AAAI 2010 Conference Workshops

Aha, David W. (Naval Research Laboratory) | Boddy, Mark (Adventium Labs) | Bulitko, Vadim (University of Alberta) | Garcez, Artur S. d'Avila (City University London) | Doshi, Prashant (University of Georgia) | Edelkamp, Stefan (TZI, Bremen University) | Geib, Christopher (University of Edinburgh) | Gmytrasiewicz, Piotr (University of Illinois, Chicago) | Goldman, Robert P. (Smart Information Flow Technologies) | Hitzler, Pascal (Wright State University) | Isbell, Charles (Georgia Institute of Technology) | Josyula, Darsana (University of Maryland, College Park) | Kaelbling, Leslie Pack (Massachusetts Institute of Technology) | Kersting, Kristian (University of Bonn) | Kunda, Maithilee (Georgia Institute of Technology) | Lamb, Luis C. (Universidade Federal do Rio Grande do Sul (UFRGS)) | Marthi, Bhaskara (Willow Garage) | McGreggor, Keith (Georgia Institute of Technology) | Nastase, Vivi (EML Research gGmbH) | Provan, Gregory (University College Cork) | Raja, Anita (University of North Carolina, Charlotte) | Ram, Ashwin (Georgia Institute of Technology) | Riedl, Mark (Georgia Institute of Technology) | Russell, Stuart (University of California, Berkeley) | Sabharwal, Ashish (Cornell University) | Smaus, Jan-Georg (University of Freiburg) | Sukthankar, Gita (University of Central Florida) | Tuyls, Karl (Maastricht University) | Meyden, Ron van der (University of New South Wales) | Halevy, Alon (Google, Inc.) | Mihalkova, Lilyana (University of Maryland) | Natarajan, Sriraam (University of Wisconsin)

AI Magazine, Jan-13-2011

The AAAI-10 Workshop program was held Sunday and Monday, July 11–12, 2010 at the Westin Peachtree Plaza in Atlanta, Georgia. The AAAI-10 workshop program included 13 workshops covering a wide range of topics in artificial intelligence. The titles of the workshops were AI and Fun, Bridging the Gap between Task and Motion Planning, Collaboratively-Built Knowledge Sources and Artificial Intelligence, Goal-Directed Autonomy, Intelligent Security, Interactive Decision Theory and Game Theory, Metacognition for Robust Social Systems, Model Checking and Artificial Intelligence, Neural-Symbolic Learning and Reasoning, Plan, Activity, and Intent Recognition, Statistical Relational AI, Visual Representations and Reasoning, and Abstraction, Reformulation, and Approximation. This article presents short summaries of those events.

artificial intelligence, logic & formal reasoning, machine learning, (18 more...)

AI Magazine

Country:

Europe > Germany (0.93)
North America > United States > California (0.68)
North America > United States > Georgia > Fulton County > Atlanta (0.25)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Information Technology > Security & Privacy (0.93)
Leisure & Entertainment > Games > Computer Games (0.68)
Education > Educational Setting > Higher Education (0.46)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(2 more...)

Learning Hidden Markov Models using Non-Negative Matrix Factorization

Cybenko, George, Crespi, Valentino

arXiv.org Artificial Intelligence, Jan-7-2011

The Baum-Welch algorithm together with its derivatives and variations has been the main technique for learning Hidden Markov Models (HMM) from observational data. We present an HMM learning algorithm based on the non-negative matrix factorization (NMF) of higher order Markovian statistics that is structurally different from the Baum-Welch and its associated approaches. The described algorithm supports estimation of the number of recurrent states of an HMM and iterates the non-negative matrix factorization (NMF) algorithm to improve the learned HMM parameters. Numerical examples are provided as well.

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

0809.4086

Country:

Europe (0.93)
North America > United States > California (0.28)
North America > United States > Massachusetts (0.28)

Genre:

Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Non-Deterministic Policies in Markovian Decision Processes

Milani Fard, M., Pineau, J.

Journal of Artificial Intelligence Research, Jan-5-2011

Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision-making problems in such environments. In recent years, attempts were made to apply methods from reinforcement learning to construct decision support systems for action selection in Markovian environments. Although conventional methods in reinforcement learning have proved to be useful in problems concerning sequential decision-making, they cannot be applied in their current form to decision support systems, such as those in medical domains, as they suggest policies that are often highly prescriptive and leave little room for the user's input. Without the ability to provide flexible guidelines, it is unlikely that these methods can gain ground with users of such systems. This paper introduces the new concept of non-deterministic policies to allow more flexibility in the user's decision-making process, while constraining decisions to remain near optimal solutions. We provide two algorithms to compute non-deterministic policies in discrete domains. We study the output and running time of these methods on a set of synthetic and real-world problems. In an experiment with human subjects, we show that humans assisted by hints based on non-deterministic policies outperform both human-only and computer-only agents in a web navigation task.

algorithm, decision support system, non-deterministic policy, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3175

AI Access Foundation

10682

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Michigan (0.04)
(2 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Asymptotic Synchronization for Finite-State Sources

Travers, Nicholas F., Crutchfield, James P.

arXiv.org Machine Learning, Jan-3-2011

We extend a recent synchronization analysis of exact finite-state sources to nonexact sources for which synchronization occurs only asymptotically. Although the proof methods are quite different, the primary results remain the same. We find that an observer's average uncertainty in the source state vanishes exponentially fast and, as a consequence, an observer's average uncertainty in predicting future output converges exponentially fast to the source entropy rate.

artificial intelligence, machine learning, observer, (16 more...)

arXiv.org Machine Learning

doi: 10.1007/s10955-011-0349-x

1011.1581

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry:

Government > Military (0.46)
Government > Regional Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Exact Synchronization for Finite-State Sources

Travers, Nicholas F., Crutchfield, James P.

arXiv.org Machine Learning, Jan-3-2011

We analyze how an observer synchronizes to the internal state of a finite-state information source, using the epsilon-machine causal representation. Here, we treat the case of exact synchronization, when it is possible for the observer to synchronize completely after a finite number of observations. The more difficult case of strictly asymptotic synchronization is treated in a sequel. In both cases, we find that an observer, on average, will synchronize to the source state exponentially fast and that, as a result, the average accuracy in an observer's predictions of the source output approaches its optimal level exponentially fast as well. Additionally, we show here how to analytically calculate the synchronization rate for exact epsilon-machines and provide an efficient polynomial-time algorithm to test epsilon-machines for exactness.

artificial intelligence, machine learning, observer, (18 more...)

arXiv.org Machine Learning

doi: 10.1007/s10955-011-0342-4

1008.4182

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)

Predictive State Temporal Difference Learning

Boots, Byron, Gordon, Geoffrey J.

Neural Information Processing Systems, Dec-31-2010

We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications, reinforcement learning (RL) is complicated by the fact that state is either high-dimensional or partially observable. Therefore, RL methods are designed to work with features of state rather than state itself, and the success or failure of learning is often determined by the suitability of the selected features. By comparison, subspace identification (SSID) methods are designed to select a feature set which preserves as much information as possible about state. In this paper we connect the two approaches, looking at the problem of reinforcement learning with a large set of features, each of which may only be marginally useful for value function approximation. We introduce a new algorithm for this situation, called Predictive State Temporal Difference (PSTD) learning. As in SSID for predictive state representations, PSTD finds a linear compression operator that projects a large set of features down to a small set that preserves the maximum amount of predictive information. As in RL, PSTD then uses a Bellman recursion to estimate a value function. We discuss the connection between PSTD and prior approaches in RL and SSID. We prove that PSTD is statistically consistent, perform several experiments that illustrate its properties, and demonstrate its potential on a difficult optimal stopping problem.

banking & finance, upstream oil & gas, value function, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Industry:

Banking & Finance > Trading (0.68)
Energy > Oil & Gas > Upstream (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Construction of Dependent Dirichlet Processes based on Poisson Processes

Lin, Dahua, Grimson, Eric, Fisher, John W.

Neural Information Processing Systems, Dec-31-2010

We present a novel method for constructing dependent Dirichlet processes. The approach exploits the intrinsic relationship between Dirichlet and Poisson processes in order to create a Markov chain of Dirichlet processes suitable for use as a prior over evolving mixture models. The method allows for the creation, removal, and location variation of component models over time while maintaining the property that the random measures are marginally DP distributed. Additionally, we derive a Gibbs sampling algorithm for model inference and test it on both synthetic and real data. Empirical results demonstrate that the approach is effective in estimating dynamically varying mixture models.

dirichlet process, particle, poisson process, (15 more...)

Neural Information Processing Systems

Country: