Reinforcement Learning by Probability Matching

Dec-31-1996–Neural Information Processing Systems

We present a new algorithm for associative reinforcement learning. The algorithm is based upon the idea of matching a network's output probability with a probability distribution derived from the environment's reward signal. This Probability Matching algorithm is shown to perform faster and be less susceptible to local minima than previously existing algorithms. We use Probability Matching to train mixture of experts networks, an architecture for which other reinforcement learning rules fail to converge reliably on even simple problems. This architecture is particularly well suited for our algorithm as it can compute arbitrarily complex functions yet calculation of the output probability is simple.

1 INTRODUCTION

The problem of learning associative networks from scalar reinforcement signals is notoriously difficult.
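The matching idea in the abstract can be illustrated with a small sketch. The excerpt does not say how the target distribution is derived from the reward or which distance is minimized, so everything specific here is an assumption made for illustration: the target is taken to be a Boltzmann distribution over rewards with a hypothetical `temperature` parameter, the mismatch is measured by the KL divergence from the policy to that target, and the reward of every action is assumed known (which sidesteps the sampling a real reinforcement learning setting would need). The function names are hypothetical as well.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def match_probabilities(rewards, temperature=0.5, lr=0.5, steps=2000):
    """Fit softmax logits so the output distribution matches a reward-derived target.

    Assumption (not from the source): target is proportional to exp(reward / temperature).
    """
    target = softmax([r / temperature for r in rewards])
    logits = [0.0] * len(rewards)  # start from a uniform policy
    for _ in range(steps):
        p = softmax(logits)
        kl = kl_divergence(p, target)
        for j in range(len(logits)):
            # Exact gradient of KL(p || target) with respect to logit j.
            grad_j = p[j] * (math.log(p[j] / target[j]) - kl)
            logits[j] -= lr * grad_j
    return softmax(logits), target


rewards = [0.1, 0.9, 0.4]  # illustrative rewards for three actions
policy, target = match_probabilities(rewards)
```

Under these assumptions the learned `policy` approaches `target`, so higher-reward actions receive more probability without the policy collapsing onto a single action; lowering `temperature` would sharpen the target toward the best action.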

algorithm, artificial intelligence, reinforcement learning, (14 more...)


Country:
- North America > United States > Massachusetts
  - Hampshire County > Amherst (0.14)
  - Middlesex County > Cambridge (0.14)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
