Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding

Dec-31-1996–Neural Information Processing Systems

On large problems, reinforcement learning systems must use parameterized function approximators such as neural networks in order to generalize between similar situations and actions. In these cases there are no strong theoretical results on the accuracy of convergence, and computational results have been mixed. In particular, Boyan and Moore reported at last year's meeting a series of negative results in attempting to apply dynamic programming together with function approximation to simple control problems with continuous state spaces. In this paper, we present positive results for all the control tasks they attempted, and for one that is significantly larger. The most important differences are that we used sparse-coarse-coded function approximators (CMACs) whereas they used mostly global function approximators, and that we learned online whereas they learned offline. Boyan and Moore and others have suggested that the problems they encountered could be solved by using actual outcomes ("rollouts"), as in classical Monte Carlo methods, and as in the TD(λ) algorithm when λ = 1.
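
As a rough illustration of the sparse coarse coding the abstract refers to, the sketch below builds a small tile coder (several offset grid tilings, one active tile per tiling) and a linear value estimate that is updated online, one sample at a time. It is not the paper's implementation: the class names `TileCoder` and `LinearValue`, the evenly spaced offsets, the tile counts, and the step size are all assumptions chosen for brevity.

```python
import math


class TileCoder:
    """Hypothetical tile coder: several offset grid tilings over a box."""

    def __init__(self, lows, highs, tiles_per_dim=8, num_tilings=4):
        self.lows = list(lows)
        self.highs = list(highs)
        self.tiles_per_dim = tiles_per_dim
        self.num_tilings = num_tilings
        self.dims = len(self.lows)
        # One extra cell per dimension leaves room for the tiling offset.
        self.cells_per_tiling = (tiles_per_dim + 1) ** self.dims
        self.num_features = num_tilings * self.cells_per_tiling

    def active_features(self, state):
        """Return one active feature index per tiling (a sparse binary code)."""
        indices = []
        for t in range(self.num_tilings):
            offset = t / self.num_tilings  # shift, as a fraction of a tile width
            index = 0
            for d in range(self.dims):
                span = self.highs[d] - self.lows[d]
                scaled = (state[d] - self.lows[d]) / span * self.tiles_per_dim
                cell = int(math.floor(scaled + offset))
                cell = min(max(cell, 0), self.tiles_per_dim)  # clamp to the grid
                index = index * (self.tiles_per_dim + 1) + cell
            indices.append(t * self.cells_per_tiling + index)
        return indices


class LinearValue:
    """Linear value estimate over tile features, updated one sample at a time."""

    def __init__(self, coder, step_size=0.1):
        self.coder = coder
        self.weights = [0.0] * coder.num_features
        # Divide by the number of tilings so step_size is the fraction of the
        # error removed per update.
        self.alpha = step_size / coder.num_tilings

    def value(self, state):
        return sum(self.weights[i] for i in self.coder.active_features(state))

    def update(self, state, target):
        """Move the estimate at `state` toward `target`; return the error."""
        features = self.coder.active_features(state)
        error = target - sum(self.weights[i] for i in features)
        for i in features:
            self.weights[i] += self.alpha * error
        return error


coder = TileCoder(lows=[0.0], highs=[1.0], tiles_per_dim=8, num_tilings=4)
estimate = LinearValue(coder, step_size=0.1)
for _ in range(100):
    estimate.update([0.5], target=1.0)
```

After training on the single state 0.5, a nearby state such as 0.55 shares most of its active tiles and so inherits most of the learned value, while a distant state such as 0.9 shares none and is unaffected. That local generalization is what distinguishes a coarse code from a global approximator. In a full TD(λ) learner the `target` would come from bootstrapped returns rather than being supplied by hand as it is here.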

artificial intelligence, machine learning, reinforcement learning

Country:
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.14)
- North America > United States
  - California > San Francisco County
    - San Francisco (0.14)
  - Massachusetts > Hampshire County
    - Amherst (0.15)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (1.00)
  - Representation & Reasoning (1.00)
