AITopics

We describe the reinforcement learning problem, motivate algorithms which seek an approximation to the Q function, and present new convergence results for two such algorithms.

1 INTRODUCTION AND BACKGROUND

Imagine an agent acting in some environment. At time t, the environment is in some state X_t chosen from a finite set of states. The agent perceives X_t, and is allowed to choose an action a_t from some finite set of actions. Meanwhile, the agent experiences a real-valued cost C_t, chosen from a distribution which also depends only on X_t and a_t and which has finite mean and variance. Such an environment is called a Markov decision process, or MDP.
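The setting above (finite states, finite actions, a random cost that depends only on the current state and action) can be illustrated with a small tabular sketch. This is not the algorithm analyzed in the paper, whose details are not given in this excerpt. It is ordinary tabular Q-learning written for costs (so it minimizes), run on a hypothetical two-state MDP. The names `NEXT_STATE`, `MEAN_COST`, `q_learning`, and `greedy_policy`, the Gaussian cost noise, and all parameter values are assumptions made for illustration.

```python
import random

# Hypothetical two-state, two-action MDP used only for illustration.
# NEXT_STATE[x][a] is the state reached from x under action a
# (deterministic here, to keep the sketch short).
NEXT_STATE = [[0, 1], [0, 1]]
# MEAN_COST[x][a] is the mean of the cost distribution for the pair (x, a).
MEAN_COST = [[1.0, 0.0], [1.0, 0.0]]


def q_learning(next_state, mean_cost, gamma=0.9, alpha=0.1,
               noise_sd=0.1, steps=20000, seed=0):
    """Estimate Q(x, a), the expected discounted long-term cost."""
    rng = random.Random(seed)
    n_states = len(mean_cost)
    n_actions = len(mean_cost[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    x = 0
    for _ in range(steps):
        a = rng.randrange(n_actions)  # explore uniformly at random
        # Cost with finite mean and variance, depending only on (x, a).
        c = mean_cost[x][a] + rng.gauss(0.0, noise_sd)
        x_next = next_state[x][a]
        # Costs are minimized, so the backup uses min over next actions.
        target = c + gamma * min(q[x_next])
        q[x][a] += alpha * (target - q[x][a])
        x = x_next
    return q


def greedy_policy(q):
    """Pick, in each state, the action with the lowest estimated cost."""
    return [min(range(len(row)), key=row.__getitem__) for row in q]
```

Calling `greedy_policy(q_learning(NEXT_STATE, MEAN_COST))` on this toy problem should select the zero-mean-cost action in both states; the Q estimates themselves remain noisy because the step size `alpha` is held constant.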

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.15)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > United Kingdom > England (0.14)

Industry: Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Zhang, Wei, Dietterich, Thomas G.

High-Performance Job-Shop Scheduling With A Time-Delay TD(λ) Network

In Tesauro's 1992 landmark work on TD-Gammon, he showed that the temporal dif--

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country: North America > United States > Oregon (0.14)

Genre: Research Report > New Finding (0.47)

Industry:

Government > Space Agency (0.48)
Government > Regional Government > North America Government > United States Government (0.48)
Leisure & Entertainment > Games > Backgammon (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Crites, Robert H., Barto, Andrew G.

Improving Elevator Performance Using Reinforcement Learning

This paper describes the application of reinforcement learning (RL) to the difficult real world problem of elevator dispatching. The elevator domain poses a combination of challenges not seen in most RL research to date. Elevator systems operate in continuous state spaces and in continuous time as discrete event dynamic systems. Their states are not fully observable and they are nonstationary due to changing passenger arrival rates. In addition, we use a team of RL agents, each of which is responsible for controlling one elevator car. The team receives a global reinforcement signal which appears noisy to each agent due to the effects of the actions of the other agents, the random nature of the arrivals and the incomplete observation of the state.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.16)

Industry: Transportation > Passenger (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Optimal Asset Allocation using Adaptive Dynamic Programming

Neuneier, Ralph

Ralph Neuneier* Siemens AG, Corporate Research and Development, Otto-Hahn-Ring 6, D-81730 Munchen, Germany

Abstract: In recent years, the interest of investors has shifted to computerized asset allocation (portfolio management) to exploit the growing dynamics of the capital markets. In this paper, asset allocation is formalized as a Markovian Decision Problem which can be optimized by applying dynamic programming or reinforcement learning based algorithms. Using an artificial exchange rate, the asset allocation strategy optimized with reinforcement learning (Q-Learning) is shown to be equivalent to a policy computed by dynamic programming. The approach is then tested on the task of investing liquid capital in the German stock market. Here, neural networks are used as value function approximators.

machine learning, portfolio, reinforcement learning, (15 more...)

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.24)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Choi, Samuel P. M., Yeung, Dit-Yan

Predictive Q-Routing: A Memory-based Reinforcement Learning Approach to Adaptive Traffic Control

The controllers usually have no or only very little prior knowledge of the environment. While only local communication between controllers is allowed, the controllers must cooperate among themselves to achieve the common, global objective. Finding the optimal routing policy in such a distributed manner is very difficult. Moreover, since the environment is non-stationary, the optimal policy varies with time as a result of changes in network traffic and topology.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country:

North America > United States (0.47)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Industry:

Transportation (0.36)
Telecommunications (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Mahadevan, Sridhar, Kaelbling, Leslie Pack

The National Science Foundation Workshop on Reinforcement Learning

AI Magazine, Dec-15-1996

Reinforcement learning has become one of the most actively studied learning frameworks in the area of intelligent autonomous agents. This article describes the results of a three-day meeting of leading researchers in this area that was sponsored by the National Science Foundation. Because reinforcement learning is an interdisciplinary topic, the workshop brought together researchers from a variety of fields, including machine learning, neural networks, AI, robotics, and operations research. The goals of the meeting were to (1) understand limitations of current reinforcement-learning systems and define promising directions for further research; (2) clarify the relationships between reinforcement learning and existing work in engineering fields, such as operations research; and (3) identify potential industrial applications of reinforcement learning.

artificial intelligence, management and information, reinforcement learning, (5 more...)

AI Magazine

Industry: Government > Regional Government > North America Government > US Government (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Mahadevan, Sridhar, Kaelbling, Leslie Pack

The National Science Foundation Workshop on Reinforcement Learning

AI Magazine, Dec-15-1996

Reinforcement learning has become one of the most actively studied learning frameworks in the area of intelligent autonomous agents. This article describes the results of a three-day meeting of leading researchers in this area that was sponsored by the National Science Foundation. Because reinforcement learning is an interdisciplinary topic, the workshop brought together researchers from a variety of fields, including machine learning, neural networks, AI, robotics, and operations research. Thirty leading researchers from the United States, Canada, Europe, and Japan, representing many different universities, government, and industrial research laboratories, participated in the workshop. The goals of the meeting were to (1) understand limitations of current reinforcement-learning systems and define promising directions for further research; (2) clarify the relationships between reinforcement learning and existing work in engineering fields, such as operations research; and (3) identify potential industrial applications of reinforcement learning.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

AI Magazine

Country: North America > United States > California (0.28)

Industry: Government > Regional Government > North America Government > United States Government (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

Kaelbling, L. P., Littman, M. L., Moore, A. W.

Reinforcement Learning: A Survey

Journal of Artificial Intelligence Research, May-1-1996

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.

reinforcement learning

Journal of Artificial Intelligence Research

doi: 10.1613/jair.301

AI Access Foundation

Genre: Overview (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Boyan, Justin A., Moore, Andrew W.

Generalization in Reinforcement Learning: Safely Approximating the Value Function

Neural Information Processing Systems, Dec-31-1995

Reinforcement learning (the problem of getting an agent to learn to act from sparse, delayed rewards) has been advanced by techniques based on dynamic programming (DP). These algorithms compute a value function which gives, for each state, the minimum possible long-term cost commencing in that state. For the high-dimensional and continuous state spaces characteristic of real-world control tasks, a discrete representation of the value function is intractable; some form of generalization is required. A natural way to incorporate generalization into DP is to use a function approximator, rather than a lookup table, to represent the value function. This approach, which dates back to uses of Legendre polynomials in DP [Bellman et al., 1963], has recently worked well on several dynamic control problems [Mahadevan and Connell, 1990, Lin, 1993] and succeeded spectacularly on the game of backgammon [Tesauro, 1992, Boyan, 1992]. On the other hand, many sensible implementations have been less successful [Bradtke, 1993, Schraudolph et al., 1994]. Indeed, given the well-established success
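As a point of reference for the lookup-table case that the abstract contrasts with function approximation, the value function it describes (the minimum possible long-term cost from each state) can be computed by value iteration on a small discrete problem. The sketch below is a generic textbook procedure and not the method proposed in the paper. The four-state chain, the names `NEXT_STATE`, `COST`, and `value_iteration`, and the parameter values are hypothetical.

```python
# Hypothetical 4-state chain: state 3 is a cost-free absorbing goal.
# Action 0 stays in place, action 1 moves one step to the right.
NEXT_STATE = [[0, 1], [1, 2], [2, 3], [3, 3]]
# Every step costs 1 until the goal is reached.
COST = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [0.0, 0.0]]


def value_iteration(next_state, cost, gamma=1.0, tol=1e-9):
    """Return the minimum long-term cost from each state (lookup table)."""
    n = len(cost)
    v = [0.0] * n
    while True:
        # Bellman backup: best immediate cost plus discounted cost-to-go.
        new_v = [
            min(cost[x][a] + gamma * v[next_state[x][a]]
                for a in range(len(cost[x])))
            for x in range(n)
        ]
        # Stop once a full sweep no longer changes any entry noticeably.
        if max(abs(p - q) for p, q in zip(new_v, v)) < tol:
            return new_v
        v = new_v
```

On this chain, `value_iteration(NEXT_STATE, COST)` returns `[3.0, 2.0, 1.0, 0.0]`, the number of steps to the goal from each state. The table has one entry per state, which is the representation that becomes intractable in high-dimensional or continuous state spaces.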

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Massachusetts (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)

Industry: Leisure & Entertainment > Games > Backgammon (0.55)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Pouget, Alexandre, Deffayet, Cedric, Sejnowski, Terrence J.

Reinforcement Learning Predicts the Site of Plasticity for Auditory Remapping in the Barn Owl

Neural Information Processing Systems, Dec-31-1995

In young barn owls raised with optical prisms over their eyes, these auditory maps are shifted to stay in register with the visual map, suggesting that the visual input imposes a frame of reference on the auditory maps. However, the optic tectum, the first site of convergence of visual with auditory information, is not the site of plasticity for the shift of the auditory maps; the plasticity occurs instead in the inferior colliculus, which contains an auditory map and projects into the optic tectum. We explored a model of the owl remapping with a global reinforcement signal whose delivery is controlled by visual foveation. A Hebb learning rule gated by reinforcement learned to appropriately adjust auditory maps. In addition, reinforcement learning preferentially adjusted the weights in the inferior colliculus, as in the owl brain, even though the weights were allowed to change throughout the auditory system. This observation raises the possibility that the site of learning does not have to be genetically specified, but could be determined by how the learning procedure interacts with the network architecture.

machine learning, reinforcement, reinforcement learning, (15 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)