AITopics

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.05)
North America > United States > Massachusetts > Hampshire County > Amherst (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Barto, A. G., Sutton, R. S., Watkins, C. J. C. H.

Sequential Decision Problems and Neural Networks

Decision making tasks that involve delayed consequences are very common yet difficult to address with supervised learning methods. If there is an accurate model of the underlying dynamical system, then these tasks can be formulated as sequential decision problems and solved by Dynamic Programming. This paper discusses reinforcement learning in terms of the sequential decision framework and shows how a learning algorithm similar to the one implemented by the Adaptive Critic Element used in the pole-balancer of Barto, Sutton, and Anderson (1983), and further developed by Sutton (1984), fits into this framework. Adaptive neural networks can play significant roles as modules for approximating the functions required for solving sequential decision problems.

algorithm, evaluation function, td algorithm, (13 more...)

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.15)
North America > United States > Massachusetts > Middlesex County > Waltham (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Ackley, David H., Littman, Michael L.

Generalization and Scaling in Reinforcement Learning

In associative reinforcement learning, an environment generates input vectors, a learning system generates possible output vectors, and a reinforcement function computes feedback signals from the input-output pairs. The task is to discover and remember input-output pairs that generate rewards. Especially difficult cases occur when rewards are rare, since the expected time for any algorithm can grow exponentially with the size of the problem. Nonetheless, if a reinforcement function possesses regularities, and a learning algorithm exploits them, learning time can be reduced below that of non-generalizing algorithms. This paper describes a neural network algorithm called complementary reinforcement back-propagation (CRBP), and reports simulation results on problems designed to offer differing opportunities for generalization.

algorithm, generalization and scaling, vector, (13 more...)

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.05)
North America > United States > Massachusetts > Hampshire County > Amherst (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Learning to Coordinate Behaviors

Maes, Pattie | Brooks, Rodney

Classics, Feb-1-1990

In accordance with the philosophy of behavior-based robots, the algorithm is completely distributed: each of the behaviors independently tries to find out (i) whether it is relevant (i.e.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York > Erie County > Buffalo (0.04)
North America > United States > New Jersey (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Parker, David B., Gluck, Mark, Reifsnider, Eric S.

Learning with Temporal Derivatives in Pulse-Coded Neuronal Systems

Neural Information Processing Systems, Dec-31-1989

To more precisely evaluate the implications of a neuronal model.

classical conditioning, conditioning, frequency, (15 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.05)
North America > United States > California > Santa Clara County > Stanford (0.05)
(4 more...)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Parker, David B., Gluck, Mark, Reifsnider, Eric S.

Learning with Temporal Derivatives in Pulse-Coded Neuronal Systems

Neural Information Processing Systems, Dec-31-1989

A number of learning models have recently been proposed which involve calculations of temporal differences (or derivatives in continuous-time models).

classical conditioning, conditioning, frequency, (16 more...)