AITopics

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

A Reinforcement Learning Variant for Control Scheduling

Guha, Aloke

However, a large class of continuous control problems require maintaining the system at a desired operating point, or setpoint, at a given time. We refer to this problem as the basic setpoint control problem [Guha 90], and have shown that reinforcement learning can be used, not surprisingly, quite well for such control tasks.

controller, reinforcement, setpoint, (14 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > District of Columbia > Washington (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Integrated Modeling and Control Based on Reinforcement Learning and Dynamic Programming

Sutton, Richard S.

This is a summary of results with Dyna, a class of architectures for intelligent systems based on approximating dynamic programming methods. Dyna architectures integrate trial-and-error (reinforcement) learning and execution-time planning into a single process operating alternately on the world and on a learned forward model of the world. We describe and show results for two Dyna architectures, Dyna-AHC and Dyna-Q. Using a navigation task, results are shown for a simple Dyna-AHC system which simultaneously learns by trial and error, learns a world model, and plans optimal routes using the evolving world model. We show that Dyna-Q architectures (based on Watkins's Q-learning) are easy to adapt for use in changing environments.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
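
The abstract above describes Dyna as one process that alternates between acting in the world and planning on a learned model. A minimal tabular sketch of that idea, in the Dyna-Q flavor, might look like the following. It is an illustration under stated assumptions rather than the paper's implementation: the world is assumed deterministic, the function names, parameter values, and the one-dimensional `chain_step` toy world are all hypothetical, and the paper's own experiments use a navigation task instead.

```python
import random
from collections import defaultdict


def dyna_q(step, start, actions, episodes=30, planning_steps=20,
           alpha=0.5, gamma=0.95, epsilon=0.1, seed=0, max_steps=200):
    """Tabular Dyna-Q sketch: learn from real steps, then replay a learned model."""
    rng = random.Random(seed)
    q = defaultdict(float)  # (state, action) -> estimated action value
    model = {}              # (state, action) -> (reward, next_state, done)

    def greedy(state):
        # Pick a highest-valued action, breaking ties at random.
        best = max(q[(state, a)] for a in actions)
        return rng.choice([a for a in actions if q[(state, a)] == best])

    def update(s, a, r, s2, done):
        # One-step Q-learning backup.
        target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s = start
        for _ in range(max_steps):
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            r, s2, done = step(s, a)         # real experience from the world
            update(s, a, r, s2, done)        # trial-and-error learning
            model[(s, a)] = (r, s2, done)    # model learning (assumes determinism)
            for _ in range(planning_steps):  # planning on simulated experience
                ps, pa = rng.choice(list(model))
                update(ps, pa, *model[(ps, pa)])
            s = s2
            if done:
                break
    return q


def chain_step(state, action, goal=5):
    """Hypothetical toy world: states 0..goal on a line, reward 1 on reaching goal."""
    nxt = min(max(state + action, 0), goal)
    done = nxt == goal
    return (1.0 if done else 0.0), nxt, done


q = dyna_q(chain_step, start=0, actions=[-1, 1])
policy = {s: max([-1, 1], key=lambda a: q[(s, a)]) for s in range(5)}
```

Setting `planning_steps=0` reduces the sketch to plain Q-learning, which is one way to see what the learned model contributes.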

Reinforcement Learning in Markovian and Non-Markovian Environments

Schmidhuber, Jürgen

This work addresses three problems with reinforcement learning and adaptive neuro-control.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

Europe (0.28)
North America > United States > California (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

A Reinforcement Learning Variant for Control Scheduling

Guha, Aloke

However, a large class of continuous control problems require maintaining the system at a desired operating point, or setpoint, at a given time. We refer to this problem as the basic setpoint control problem [Guha 90], and have shown that reinforcement learning can be used, not surprisingly, quite well for such control tasks. A more general version of the same problem requires steering the system from some initial or starting state to a desired state or setpoint at specific times without knowledge of the dynamics of the system. We therefore wish to examine control scheduling tasks, where the system must be steered through a sequence of setpoints at specific times.

machine learning, reinforcement, reinforcement learning, (17 more...)

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Navigating through Temporal Difference

Dayan, Peter

Barto, Sutton and Watkins [2] introduced a grid task as a didactic example of temporal difference planning and asynchronous dynamical programming. This paper considers the effects of changing the coding of the input stimulus, and demonstrates that the self-supervised learning of a particular form of hidden unit representation improves performance.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Sequential Decision Problems and Neural Networks

Barto, A. G., Sutton, R. S., Watkins, C. J. C. H.

Decision making tasks that involve delayed consequences are very common yet difficult to address with supervised learning methods. If there is an accurate model of the underlying dynamical system, then these tasks can be formulated as sequential decision problems and solved by Dynamic Programming. This paper discusses reinforcement learning in terms of the sequential decision framework and shows how a learning algorithm similar to the one implemented by the Adaptive Critic Element used in the pole-balancer of Barto, Sutton, and Anderson (1983), and further developed by Sutton (1984), fits into this framework. Adaptive neural networks can play significant roles as modules for approximating the functions required for solving sequential decision problems.

algorithm, evaluation function, td algorithm, (13 more...)

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.15)
North America > United States > Massachusetts > Middlesex County > Waltham (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
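
The abstract above notes that, given an accurate model of the system, a sequential decision problem can be solved by dynamic programming. One standard instance of that is value iteration, sketched below. The sketch is not taken from the paper: the function signature, the discount factor, and the four-state chain used as the model are all assumptions made for illustration.

```python
def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-8):
    """Sweep Bellman backups over a known deterministic model until values settle."""
    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Best one-step lookahead value under the known model.
            best = max(reward(s, a) + gamma * values[transition(s, a)]
                       for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


GOAL = 3  # hypothetical absorbing goal state on a chain 0..3


def transition(state, action):
    """Assumed model: move left or right on the chain; the goal is absorbing."""
    if state == GOAL:
        return GOAL
    return min(max(state + action, 0), GOAL)


def reward(state, action):
    """Assumed payoff: 1 for the step that enters the goal, 0 otherwise."""
    return 1.0 if state != GOAL and transition(state, action) == GOAL else 0.0


values = value_iteration(range(GOAL + 1), [-1, 1], transition, reward)
```

In this toy model the values fall off by a factor of `gamma` per step of distance from the goal, which is the delayed-consequence structure the abstract refers to.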

Generalization and Scaling in Reinforcement Learning

Ackley, David H., Littman, Michael L.

In associative reinforcement learning, an environment generates input vectors, a learning system generates possible output vectors, and a reinforcement function computes feedback signals from the input-output pairs. The task is to discover and remember input-output pairs that generate rewards. Especially difficult cases occur when rewards are rare, since the expected time for any algorithm can grow exponentially with the size of the problem. Nonetheless, if a reinforcement function possesses regularities, and a learning algorithm exploits them, learning time can be reduced below that of non-generalizing algorithms. This paper describes a neural network algorithm called complementary reinforcement back-propagation (CRBP), and reports simulation results on problems designed to offer differing opportunities for generalization.

algorithm, generalization and scaling, vector, (13 more...)

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.05)
North America > United States > Massachusetts > Hampshire County > Amherst (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
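
The associative reinforcement learning setting in the Ackley and Littman abstract above (inputs from an environment, outputs from a learner, feedback from a reinforcement function) can be made concrete with a non-generalizing baseline that guesses outputs at random and remembers the rewarded ones. This is not CRBP, which the abstract names but does not detail. The learner, its parameters, and the complement-the-input reward rule are hypothetical choices for illustration only.

```python
import itertools
import random


def memorizing_learner(inputs, reinforcement, n_out, trials=2000, seed=0):
    """Guess output vectors at random and remember those that earn a reward."""
    rng = random.Random(seed)
    memory = {}  # input vector -> output vector that was rewarded
    for _ in range(trials):
        x = rng.choice(inputs)  # the environment generates an input vector
        if x in memory:
            y = memory[x]       # reuse a remembered, rewarded output
        else:
            y = tuple(rng.randint(0, 1) for _ in range(n_out))  # blind guess
        if reinforcement(x, y):  # feedback computed from the input-output pair
            memory[x] = y
    return memory


def complement_reward(x, y):
    """Hypothetical reinforcement function: reward iff y is the bitwise complement of x."""
    return y == tuple(1 - bit for bit in x)


inputs = list(itertools.product([0, 1], repeat=3))
memory = memorizing_learner(inputs, complement_reward, n_out=3)
```

Because each blind guess succeeds with probability 2^-n for n output bits, the number of trials this baseline needs grows exponentially with vector size, which is the scaling problem that a generalizing algorithm tries to avoid.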
