Reinforcement Learning


Multi-time Models for Temporally Abstract Planning

Neural Information Processing Systems

Planning and learning at multiple levels of temporal abstraction is a key problem for artificial intelligence. In this paper we summarize an approach to this problem based on the mathematical framework of Markov decision processes and reinforcement learning. Current model-based reinforcement learning is based on one-step models that cannot represent commonsense higher-level actions, such as going to lunch, grasping an object, or flying to Denver. This paper generalizes prior work on temporally abstract models [Sutton, 1995] and extends it from the prediction setting to include actions, control, and planning. We introduce a more general form of temporally abstract model, the multi-time model, and establish its suitability for planning and learning by virtue of its relationship to the Bellman equations. This paper summarizes the theoretical framework of multi-time models and illustrates their potential advantages in a grid world planning task.
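
The connection to the Bellman equations can be illustrated with a small sketch. The Python below is only an illustration under assumed shapes and names (it is not the paper's code): it composes a temporally abstract model of an extended action out of one-step models and uses it in a Bellman-style backup alongside the ordinary one-step backup.

import numpy as np

# Illustrative sketch (assumed shapes and names, not the paper's code).
# One-step model: transitions P[a, s, s'] and rewards R[a, s].
# A multi-time model of an extended action predicts the discounted
# terminal-state operator beta and the accumulated discounted reward r_abs.

n_states, n_actions, gamma = 8, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # one-step transitions
R = rng.random((n_actions, n_states))                              # one-step rewards

def one_step_backup(V, s, a):
    """Standard Bellman backup through a primitive one-step model."""
    return R[a, s] + gamma * P[a, s] @ V

def compose_abstract_model(a, k):
    """Multi-time model of 'repeat action a for k steps', built from one-step models."""
    r_abs = np.zeros(n_states)
    beta = np.eye(n_states)            # discount-weighted terminal-state operator
    for _ in range(k):
        r_abs = r_abs + beta @ R[a]
        beta = gamma * beta @ P[a]
    return r_abs, beta

def abstract_backup(V, s, r_abs, beta):
    """Bellman-like backup through the temporally abstract model."""
    return r_abs[s] + beta[s] @ V

V = np.zeros(n_states)
r_abs, beta = compose_abstract_model(a=0, k=5)
print(one_step_backup(V, s=0, a=0), abstract_backup(V, s=0, r_abs=r_abs, beta=beta))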


Nonparametric Model-Based Reinforcement Learning

Neural Information Processing Systems

This paper describes some of the interactions of model learning algorithms and planning algorithms we have found in exploring model-based reinforcement learning. The paper focuses on how local trajectory optimizers can be used effectively with learned nonparametric models. We find that trajectory planners that are fully consistent with the learned model often have difficulty finding reasonable plans in the early stages of learning. Trajectory planners that balance obeying the learned model with minimizing cost (or maximizing reward) often do better, even if the plan is not fully consistent with the learned model.
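
One way to read "balance obeying the learned model with minimizing cost" is as a soft penalty on dynamics violations inside the trajectory objective. The sketch below makes that concrete under an assumed toy model, cost weights, and function names; it is not the planner used in the paper.

import numpy as np

def learned_model(x, u):
    # Stand-in for a learned nonparametric model (e.g., locally weighted regression).
    return 0.9 * x + 0.5 * u

def soft_consistency_objective(z, x0, goal, horizon, lam):
    # z packs the actions u[0..H-1] and the intermediate states x[1..H].
    u, x = z[:horizon], np.concatenate(([x0], z[horizon:]))
    cost = np.sum(u**2) + 10.0 * (x[-1] - goal) ** 2        # task cost
    model_err = x[1:] - learned_model(x[:-1], u)             # dynamics residual
    return cost + lam * np.sum(model_err**2)                 # soft, not hard, consistency

def plan(x0, goal, horizon=10, lam=1.0, iters=2000, lr=1e-2):
    z = np.zeros(2 * horizon)
    for _ in range(iters):                                   # crude finite-difference descent
        base = soft_consistency_objective(z, x0, goal, horizon, lam)
        grad = np.zeros_like(z)
        for i in range(len(z)):
            zp = z.copy(); zp[i] += 1e-5
            grad[i] = (soft_consistency_objective(zp, x0, goal, horizon, lam) - base) / 1e-5
        z -= lr * grad
    return z[:horizon]

print(plan(x0=0.0, goal=1.0)[:3])  # first few planned actions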


Reinforcement Learning for Call Admission Control and Routing in Integrated Service Networks

Neural Information Processing Systems

We provide a model of the standard watermaze task, and of a more challenging task involving novel platform locations, in which rats exhibit one-trial learning after a few days of training. The model uses hippocampal place cells to support reinforcement learning, and also, in an integrated manner, to build and use allocentric coordinates.
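
As a rough illustration of place cells supporting reinforcement learning, the sketch below uses Gaussian place-cell activations as features for a linear TD critic. The cell count, field widths, and learning rule are assumptions added for illustration, not the paper's model.

import numpy as np

rng = np.random.default_rng(1)
centers = rng.uniform(0, 1, size=(50, 2))       # place-field centres in a unit arena
sigma, gamma, alpha = 0.1, 0.95, 0.1
w = np.zeros(len(centers))                      # critic weights

def place_cells(pos):
    """Gaussian place-cell activations for a 2-D position."""
    return np.exp(-np.sum((centers - pos) ** 2, axis=1) / (2 * sigma**2))

def td_update(pos, reward, next_pos, terminal=False):
    """One temporal-difference update of the value estimate w . phi(pos)."""
    global w
    phi, phi_next = place_cells(pos), place_cells(next_pos)
    target = reward + (0.0 if terminal else gamma * (w @ phi_next))
    w += alpha * (target - w @ phi) * phi

td_update(np.array([0.2, 0.3]), reward=0.0, next_pos=np.array([0.25, 0.3]))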


The Asymptotic Convergence-Rate of Q-learning

Neural Information Processing Systems

Q-learning is a popular reinforcement learning (RL) algorithm whose convergence is well demonstrated in the literature (Jaakkola et al., 1994; Tsitsiklis, 1994; Littman and Szepesvari, 1996; Szepesvari and Littman, 1996). Our aim in this paper is to provide an upper bound for the convergence rate of (lookup-table based) Q-learning algorithms. Although this upper bound is not strict, computer experiments (to be presented elsewhere) and the form of the lemma underlying the proof indicate that the obtained upper bound can be made strict by a slightly more complicated definition for R. Our results extend to learning on aggregated states (see (Singh et al., 1995)) and other related algorithms which admit a certain form of asynchronous stochastic approximation (see (Szepesvari and Littman, 1996)).
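
For reference, the lookup-table updates whose convergence rate is being bounded have the familiar form sketched below; the toy environment, exploration scheme, and step-size schedule are assumptions added purely for illustration.

import numpy as np

def q_learning(env_step, n_states, n_actions, episodes=500, gamma=0.9, eps=0.1):
    """Lookup-table Q-learning with a 1/n(s,a) step-size schedule (an assumption)."""
    Q = np.zeros((n_states, n_actions))
    visits = np.zeros((n_states, n_actions))
    rng = np.random.default_rng(0)
    for _ in range(episodes):
        s = 0
        for _ in range(100):
            a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
            s_next, r, done = env_step(s, a)
            visits[s, a] += 1
            alpha = 1.0 / visits[s, a]
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next
            if done:
                break
    return Q

# Tiny chain MDP used only to exercise the sketch.
def chain_step(s, a):
    s_next = min(s + 1, 4) if a == 1 else max(s - 1, 0)
    return s_next, float(s_next == 4), s_next == 4

print(q_learning(chain_step, n_states=5, n_actions=2).round(2))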


Enhancing Q-Learning for Optimal Asset Allocation

Neural Information Processing Systems

This paper enhances the Q-learning algorithm for optimal asset allocation proposed in (Neuneier, 1996). The new formulation simplifies the approach by using only one value function for many assets and allows model-free policy iteration. After testing the new algorithm on real data, the possibility of risk management within the framework of Markov decision problems is analyzed. The proposed method allows the construction of a multi-period portfolio management system which takes into account transaction costs, the risk preferences of the investor, and several constraints on the allocation.
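
A much-simplified sketch of the idea, with an assumed state discretization, cost model, and synthetic returns (not Neuneier's system): Q-learning over a discretized allocation choice, where the reward is portfolio return net of proportional transaction costs.

import numpy as np

rng = np.random.default_rng(0)
allocations = np.linspace(0.0, 1.0, 5)        # fraction of wealth in the risky asset
n_states = len(allocations)                   # state = index of the current holding
Q = np.zeros((n_states, len(allocations)))
gamma, alpha, eps, tc = 0.95, 0.05, 0.1, 0.002

state = 0
for t in range(20000):
    a = rng.integers(len(allocations)) if rng.random() < eps else int(Q[state].argmax())
    risky_return = rng.normal(0.001, 0.02)    # stand-in for observed market data
    turnover = abs(allocations[a] - allocations[state])
    reward = allocations[a] * risky_return - tc * turnover   # return net of transaction cost
    next_state = a                            # the chosen allocation becomes the new holding
    Q[state, a] += alpha * (reward + gamma * Q[next_state].max() - Q[state, a])
    state = next_state

print(allocations[Q.argmax(axis=1)])          # learned allocation per current holding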


Reinforcement Learning with Hierarchies of Machines

Neural Information Processing Systems

We present a new approach to reinforcement learning in which the policies considered by the learning process are constrained by hierarchies of partially specified machines. This allows for the use of prior knowledge to reduce the search space and provides a framework in which knowledge can be transferred across problems and in which component solutions can be recombined to solve larger and more complicated problems. Our approach can be seen as providing a link between reinforcement learning and "behavior-based" or "teleo-reactive" approaches to control. We present provably convergent algorithms for problem-solving and learning with hierarchical machines and demonstrate their effectiveness on a problem with several thousand states.
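
The constraint imposed by a partially specified machine can be pictured as in the sketch below: most of the behaviour is hard-coded, and learning happens only at explicit choice points, with SMDP-style updates between them. The machine structure, state names, and update rule here are assumptions for illustration, not the paper's implementation.

import numpy as np

rng = np.random.default_rng(0)
Q = {}   # Q-values indexed by (env_state, machine_state, choice)

def traverse_corridor(env_state):
    """Fully specified part of the machine: no learning, just a fixed rule."""
    return "right"

def act(env_state, machine_state, eps=0.1):
    """Return the action to take and the next machine state."""
    if machine_state == "corridor":
        return traverse_corridor(env_state), "corridor"
    if machine_state == "junction":               # choice point: the learner decides
        choices = ("go_up", "go_down")
        qs = [Q.get((env_state, machine_state, c), 0.0) for c in choices]
        i = rng.integers(len(choices)) if rng.random() < eps else int(np.argmax(qs))
        return choices[i], choices[i]              # chosen branch becomes the machine state
    return "noop", machine_state

def choice_point_update(prev_key, reward_since, env_state, gamma_k, alpha=0.1):
    """SMDP-style Q update applied only when the next choice point is reached."""
    next_qs = [Q.get((env_state, "junction", c), 0.0) for c in ("go_up", "go_down")]
    Q[prev_key] = Q.get(prev_key, 0.0) + alpha * (
        reward_since + gamma_k * max(next_qs) - Q.get(prev_key, 0.0))

action, machine_state = act(env_state=3, machine_state="junction")
choice_point_update((3, "junction", action), reward_since=1.0, env_state=7, gamma_k=0.9**4)
print(action, Q)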


Reinforcement Learning for Continuous Stochastic Control Problems

Neural Information Processing Systems

Here we study the continuous-time, continuous state-space stochastic case, which covers a wide variety of control problems including target, viability, and optimization problems (see [FS93], [KP95]), for which one formalism is the following.
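
The abstract breaks off before stating the formalism; a standard formulation for this class of problems is a controlled diffusion with a discounted value function, sketched below in generic notation that may differ from the paper's.

% Controlled diffusion dynamics and discounted value function (generic notation).
\begin{align}
  dx_t &= f(x_t, a_t)\,dt + \sigma(x_t, a_t)\,dW_t, \\
  V^{\pi}(x) &= \mathbb{E}\!\left[\int_{0}^{\tau} \gamma^{t}\, r(x_t, a_t)\,dt
               \;+\; \gamma^{\tau} R(x_{\tau}) \,\middle|\, x_0 = x\right], \\
  V^{*}(x) &= \sup_{\pi} V^{\pi}(x),
\end{align}
where $\tau$ is the exit time from the state space, $r$ a running reward, and $R$ a terminal reward.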


Multi-time Models for Temporally Abstract Planning

Neural Information Processing Systems

In reinforcement learning, a learning agent interacts with an environment at some discrete, lowest-level time scale t = 0, 1, 2, .... On each time step, the agent perceives the state of the environment, s_t, and on that basis chooses a primitive action, a_t. In the rooms grid world used as an illustration, the natural abstract actions are to move from room to room.
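
The lowest-level interaction loop described here is easy to state in code; the toy environment and random policy below are assumptions added purely to make the loop concrete.

import random

def run(env_step, policy, s0, horizon=100):
    """Agent-environment interaction at the lowest time scale t = 0, 1, 2, ..."""
    s, history = s0, []
    for t in range(horizon):
        a = policy(s)                  # agent observes s_t and picks primitive action a_t
        s_next, r = env_step(s, a)     # environment returns s_{t+1} and reward
        history.append((t, s, a, r))
        s = s_next
    return history

# Toy rooms-style stand-in: states are room indices, actions move between rooms.
def toy_env(s, a):
    s_next = (s + a) % 4
    return s_next, float(s_next == 3)

print(run(toy_env, policy=lambda s: random.choice([0, 1]), s0=0)[:3])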


TRACKIES: RoboCup-97 Middle-Size League World Cochampion

AI Magazine

This article describes a milestone in our research efforts toward the real robot competition in RoboCup. We participated in the middle-size league at RoboCup-97, held in conjunction with the Fifteenth International Joint Conference on Artificial Intelligence in Nagoya, Japan. The most significant features of our team, TRACKIES, are the application of a reinforcement learning method enhanced for real robot applications and the use of an omnidirectional vision system for our goalie that can capture a 360-degree view at any instant in time. The method and the system used are shown with competition results.