Practical Issues in Temporal Difference Learning

Tesauro, Gerald

Neural Information Processing Systems 

This paper examines whether temporal difference methods for training connectionist networks, such as Sutton's TD(λ) algorithm, can be successfully applied to complex real-world problems. A number of important practical issues are identified and discussed from a general theoretical perspective. These practical issues are then examined in the context of a case study in which TD(λ) is applied to learning the game of backgammon from the outcome of self-play. This is apparently the first application of this algorithm to a complex nontrivial task. It is found that, with zero knowledge built in, the network is able to learn from scratch to play the entire game at a fairly strong intermediate level of performance, which is clearly better than conventional commercial programs, and which in fact surpasses comparable networks trained on a massive human expert data set. The hidden units in these networks have apparently discovered useful features, a longstanding goal of computer games research.
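For readers unfamiliar with TD(λ), the following is a minimal sketch of the update rule on a toy problem. It is not the backgammon setup described in the abstract. It assumes a five-state random walk whose only reward signal is the final outcome (1 for exiting on the right, 0 for exiting on the left), and it uses a one-hot linear value function in place of a neural network. The function name and all parameter values are hypothetical choices made for illustration.

```python
import random


def td_lambda_random_walk(episodes=20000, alpha=0.01, lam=0.7,
                          n_states=5, seed=0):
    """Estimate state values of a random walk with online TD(lambda).

    Illustrative assumptions: one weight per state (one-hot features),
    no discounting, and learning only from the terminal outcome.
    """
    rng = random.Random(seed)
    w = [0.5] * n_states                    # value estimate per state
    for _ in range(episodes):
        trace = [0.0] * n_states            # eligibility traces
        s = n_states // 2                   # start in the middle state
        while True:
            # Decay all traces, then add the gradient of the current
            # prediction, which is one-hot for a table lookup.
            trace = [lam * e for e in trace]
            trace[s] += 1.0

            s_next = s + (1 if rng.random() < 0.5 else -1)
            if s_next < 0:
                target, done = 0.0, True    # exited left: outcome 0
            elif s_next >= n_states:
                target, done = 1.0, True    # exited right: outcome 1
            else:
                target, done = w[s_next], False

            # Temporal difference between successive predictions.
            delta = target - w[s]
            for i in range(n_states):
                w[i] += alpha * delta * trace[i]

            if done:
                break
            s = s_next
    return w


if __name__ == "__main__":
    values = td_lambda_random_walk()
    print([round(v, 2) for v in values])
```

In this toy problem the true probability of exiting on the right from state i (counting from 1) is i/6, so the learned weights can be compared against that known answer. Replacing the one-hot table with a network changes only how the gradient added to the trace is computed.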
