AITopics

1109.1314

Country: Europe (0.28)

Genre: Research Report (0.40)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Leisure & Entertainment > Games > Chess (1.00)
Information Technology > Software (0.93)
Education > Assessment & Standards > Measuring Intelligence (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Azar, Mohammad Gheshlaghi, Gomez, Vicenc, Kappen, Hilbert J.

Dynamic Policy Programming

arXiv.org Artificial Intelligence, Sep-6-2011

In this paper, we propose a novel policy iteration method, called dynamic policy programming (DPP), to estimate the optimal policy in infinite-horizon Markov decision processes. We prove finite-iteration and asymptotic $\ell_\infty$-norm performance-loss bounds for DPP in the presence of approximation/estimation error. The bounds are expressed in terms of the $\ell_\infty$-norm of the average accumulated error, as opposed to the $\ell_\infty$-norm of the error in the case of standard approximate value iteration (AVI) and approximate policy iteration (API). This suggests that DPP can achieve better performance than AVI and API, since it averages out the simulation noise caused by Monte-Carlo sampling throughout the learning process. We examine these theoretical results numerically by comparing the performance of the approximate variants of DPP with existing reinforcement learning (RL) methods on different problem domains. Our results show that, in all cases, DPP-based algorithms outperform other RL methods by a wide margin.
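
The difference between the two error norms mentioned in the abstract can be illustrated numerically. The following is a minimal sketch, not the DPP algorithm itself: it assumes independent zero-mean Gaussian errors for each state and iteration (an assumption made here purely for illustration), and the function name `error_norms` is hypothetical.

```python
import random

def error_norms(num_iterations, num_states, noise_scale=1.0, seed=0):
    """Compare the largest per-iteration error with the averaged accumulated error.

    Assumption (not from the paper): errors are i.i.d. zero-mean Gaussian.
    Returns (max over iterations of the sup-norm of the error,
             sup-norm of the average accumulated error).
    """
    rng = random.Random(seed)
    accumulated = [0.0] * num_states
    worst_single = 0.0
    for _ in range(num_iterations):
        # One simulated error vector, one entry per state.
        errors = [rng.gauss(0.0, noise_scale) for _ in range(num_states)]
        worst_single = max(worst_single, max(abs(e) for e in errors))
        accumulated = [a + e for a, e in zip(accumulated, errors)]
    # Sup-norm of the error averaged over all iterations.
    averaged = max(abs(a) / num_iterations for a in accumulated)
    return worst_single, averaged

if __name__ == "__main__":
    worst, averaged = error_norms(num_iterations=200, num_states=10)
    print(f"largest single-iteration error: {worst:.3f}")
    print(f"averaged accumulated error:     {averaged:.3f}")
```

Under this noise assumption the averaged quantity can never exceed the largest single-iteration error, and it tends to shrink as the number of iterations grows, which is the intuition behind bounds stated in terms of the average accumulated error.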

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1004.2027

Country: Europe (0.28)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Kormushev, Petar, Nomoto, Kohei, Dong, Fangyan, Hirota, Kaoru

Time Hopping technique for faster reinforcement learning in simulations

arXiv.org Artificial Intelligence, Sep-6-2011

This preprint has been withdrawn by the author for revision.

artificial intelligence, machine learning, time hopping technique, (3 more...)

0904.0545

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

arXiv.org Machine Learning, Sep-5-2011

Self-configuration from a Machine-Learning Perspective

Konen, Wolfgang

The goal of machine learning is to provide solutions which are trained by data or by experience coming from the environment. Many training algorithms exist and some brilliant successes have been achieved. But even in structured environments for machine learning (e.g. data mining or board games), most applications beyond the level of toy problems need careful hand-tuning or human ingenuity (i.e. detection of interesting patterns) or both. We discuss several aspects of how self-configuration can help to alleviate these problems. One aspect is self-configuration by tuning of algorithms, where recent advances have been made in the area of SPO (Sequential Parameter Optimization). Another aspect is self-configuration by pattern detection or feature construction. Forming multiple features (e.g. random boolean functions) and using algorithms (e.g. random forests) which easily digest many features can largely increase learning speed. However, a full-fledged theory of feature construction is not yet available, and this forms a current barrier in machine learning. We discuss several ideas for systematic inclusion of feature construction. This may lead to partly self-configuring machine learning solutions which show robustness, flexibility, and fast learning in potentially changing environments.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

1105.1951

Country:

North America > United States > New York (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Nguyen, Phuong, Sunehag, Peter, Hutter, Marcus

Feature Reinforcement Learning In Practice

arXiv.org Artificial Intelligence, Aug-17-2011

Following a recent surge in using history-based methods for resolving perceptual aliasing in reinforcement learning, we introduce an algorithm based on the feature reinforcement learning framework called PhiMDP. To create a practical algorithm we devise a stochastic search procedure for a class of context trees based on parallel tempering and a specialized proposal distribution. We provide the first empirical evaluation for PhiMDP. Our proposed algorithm achieves superior performance to the classical U-tree algorithm and the recent active-LZ algorithm, and is competitive with MC-AIXI-CTW, which maintains a Bayesian mixture over all context trees up to a chosen depth. We are encouraged by our ability to compete with this sophisticated method using an algorithm that simply picks one single model and uses Q-learning on the corresponding MDP. Our PhiMDP algorithm is much simpler, yet consumes less time and memory. These results show promise for our future work on attacking more complex and larger problems.
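
The last step described in the abstract, running Q-learning on the MDP induced by a single chosen model, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature map `phi` (here simply the last `depth` observations, a fixed-depth special case of a context tree), the function `q_update`, and all parameter values are hypothetical.

```python
from collections import defaultdict

def phi(history, depth=2):
    """Hypothetical feature map: the state is the last `depth` observations."""
    return tuple(history[-depth:])

def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on the states produced by phi."""
    # Value of the greedy action in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Move Q(state, action) towards the one-step bootstrapped target.
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]

if __name__ == "__main__":
    actions = ["left", "right"]
    q = defaultdict(float)            # unseen (state, action) pairs start at 0
    history = [0, 1, 1]
    state = phi(history)              # state derived from the history so far
    next_state = phi(history + [0])   # state after one more observation
    q_update(q, state, "left", 1.0, next_state, actions)
    print(dict(q))
```

Because the state is a function of the history rather than of the latest observation alone, two histories that end in the same observation can still map to different states, which is how history-based methods address perceptual aliasing.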

machine learning, node, reinforcement learning, (15 more...)

1108.3614

Country: North America (0.28)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Between Frustration and Elation: Sense of Control Regulates the Intrinsic Motivation for Motor Learning

Grzyb, Beata J. (Jaume I University and Osaka University) | Boedecker, Joschka (Osaka University) | Asada, Minoru (Osaka University) | Pobil, Angel P. del (Jaume I University) | Smith, Linda B. (Indiana University)

Frustration has generally been viewed in a negative light and its potential role in learning neglected. We propose a new approach to intrinsically motivated learning where frustration is a key factor that allows exploration and exploitation to be balanced dynamically. Moreover, based on the result obtained from our experiment with older infants, we propose that a temporary decrease in learning from negative feedback can also be beneficial in fine-tuning a newly learned behavior. We suggest that this temporary indifference to the outcome of an action may be related to the sense of control, and results from the state of elation, that is, the experience of overcoming a very difficult task after prolonged frustration. Our preliminary simulation results serve as a proof-of-concept for our approach.

educational setting, frustration, upstream oil & gas, (18 more...)

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States (0.46)
Europe > Spain (0.14)
Asia > Japan (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Education (0.68)
Energy > Oil & Gas > Upstream (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Robots (0.72)

Deep Belief Nets as Function Approximators for Reinforcement Learning

Abtahi, Farnaz (University of Arizona) | Fasel, Ian (University of Arizona)

We describe a continuous state/action reinforcement learning method which uses deep belief networks (DBNs) in conjunction with a value function-based reinforcement learning algorithm to learn effective control policies. Our approach is to first learn a model of the state-action space from data in an unsupervised pre-training phase, and then use neural-fitted Q-iteration (NFQ) to learn an accurate value function approximator (analogous to a "fine-tuning" phase when training DBNs for classification). Our experiments suggest that this approach has the potential to significantly increase the efficiency of the learning process in NFQ, provided care is taken to ensure the initial data covers interesting areas of the state-action space, and may be particularly useful in transfer learning settings.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Country: North America > United States > Arizona > Pima County > Tucson (0.14)

Genre: Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Markov Games of Incomplete Information for Multi-Agent Reinforcement Learning

MacDermed, Liam (Georgia Institute of Technology) | Isbell, Charles (Georgia Institute of Technology) | Weiss, Lora (Georgia Institute of Technology)

Partially observable stochastic games (POSGs) are an attractive model for many multi-agent domains, but are computationally extremely difficult to solve. We present a new model, Markov games of incomplete information (MGII), which imposes a mild restriction on POSGs while overcoming their primary computational bottleneck. Finally, we show how to convert an MGII into a continuous but bounded fully observable stochastic game. MGIIs represent the most general tractable model for multi-agent reinforcement learning to date.

bayesian game, information, posg, (15 more...)

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Industry: Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.95)

Choe, Yoonsuck (Texas A&M University)

Action-Based Autonomous Grounding

When a new-born animal (agent) opens its eyes, what it sees is a patchwork of light and dark patterns, the natural scene. What is perceived by the agent at this moment is based on the pattern of neural spikes in its brain. Life-long learning begins with such a flood of spikes in the brain. All knowledge and skills learned by the agent are mediated by such spikes, thus it is critical to understand what information these spikes convey and how they can be used to generate meaningful behavior. Here, we consider how agents can autonomously understand the meaning of these spikes without direct reference to the stimulus. We find that this problem, the problem of grounding, is unsolvable if the agent is passively perceiving, and that it can be solved only through self-initiated action. Furthermore, we show that a simple criterion, combined with standard reinforcement learning, can help solve this problem. We will present simulation results and discuss the implications of these results on life-long learning.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Country: North America > United States > Texas > Brazos County > College Station (0.05)

Industry: Education > Educational Setting > Continuing Education (0.57)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.37)

Torrey, Lisa (St. Lawrence University)

Lightweight Adaptation in Model-Based Reinforcement Learning

Reinforcement learning algorithms can train an agent to operate successfully in a stationary environment. Most real-world environments, however, are subject to change over time. Research in the areas of transfer learning and lifelong learning addresses this problem by developing new algorithms that allow agents to adapt to environment change. Current trends in this area include model-free learning and data-driven adaptation methods. This paper explores the opposite direction. Arguing that model-based algorithms may be better suited to the problem, it looks at adaptation in the context of model-based learning. Noting that standard algorithms themselves have some built-in capability for adaptation, it analyzes when and why a standard algorithm struggles to adapt to environment change. Then it experiments with lightweight and straightforward methods for adapting effectively.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Genre: Instructional Material (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)