Learning to predict by the methods of temporal differences

Feb-1-1988–Classics

This article introduces a class of incremental learning procedures specialized for prediction, that is, for using past experience with an incompletely known system to predict its future behavior. Whereas conventional prediction-learning methods assign credit by means of the difference between predicted and actual outcomes, the new methods assign credit by means of the difference between temporally successive predictions. Although such temporal-difference methods have been used in Samuel's checker player, Holland's bucket brigade, and the author's Adaptive Heuristic Critic, they have remained poorly understood. Here we prove their convergence and optimality for special cases and relate them to supervised-learning methods. For most real-world prediction problems, temporal-difference methods require less memory and less peak computation than conventional methods and they produce more accurate predictions. We argue that most problems to which supervised learning is currently applied are really prediction problems.

Machine Learning 3: 9-44, erratum p. 377
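The contrast the abstract draws, credit assigned from the difference between temporally successive predictions rather than from the final outcome, can be illustrated with a small sketch. It assumes a five-state bounded random walk and the simplest one-step form of the update; the function name `td0_random_walk`, the step size `alpha`, the episode count, and the initial prediction of 0.5 are illustrative choices, not values taken from the article.

```python
import random


def td0_random_walk(num_states=5, episodes=5000, alpha=0.05, seed=0):
    """One-step temporal-difference prediction on a bounded random walk.

    Hypothetical setup: states 0..num_states-1 are non-terminal; stepping
    off the left end yields outcome 0, off the right end yields outcome 1.
    v[s] estimates the probability of terminating on the right from s.
    """
    rng = random.Random(seed)
    v = [0.5] * num_states  # initial predictions (an arbitrary assumption)

    for _ in range(episodes):
        s = num_states // 2  # every walk starts in the middle state
        while True:
            s_next = s + rng.choice((-1, 1))
            if s_next < 0:
                target = 0.0          # terminated on the left
            elif s_next >= num_states:
                target = 1.0          # terminated on the right
            else:
                target = v[s_next]    # the next prediction serves as target
            # Move the current prediction toward the temporally successive
            # one, without waiting for the walk's final outcome.
            v[s] += alpha * (target - v[s])
            if s_next < 0 or s_next >= num_states:
                break
            s = s_next
    return v


if __name__ == "__main__":
    estimates = td0_random_walk()
    print([round(x, 2) for x in estimates])
```

For this walk the exact right-termination probabilities are 1/6, 2/6, ..., 5/6, so the printed estimates should land near those values. Each update needs only the current and the next prediction, which is the source of the lower memory and peak-computation cost the abstract mentions.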

artificial intelligence, prediction, reinforcement learning, (19 more...)


Country:
- North America > United States
  - California > Orange County
    - Irvine (0.14)
  - Massachusetts > Middlesex County (0.14)

Genre:
- Workflow (0.46)

Industry:
- Leisure & Entertainment > Games (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
