Practical Issues in Temporal Difference Learning

Tesauro, Gerald

Neural Information Processing Systems 

TD(λ) is an algorithm for adjusting the weights in a connectionist network which has the following form: