Temporal-Difference Networks

Apr-6-2023, 15:47:43 GMT–Neural Information Processing Systems

We introduce a generalization of temporal-difference (TD) learning to networks of interrelated predictions. Rather than relating a single prediction to itself at a later time, as in conventional TD methods, a TD network relates each prediction in a set of predictions to other predictions in the set at a later time. TD networks can represent and apply TD learning to a much wider class of predictions than has previously been possible. Using a random-walk example, we show that these networks can be used to learn to predict by a fixed interval, which is not possible with conventional TD methods. Secondly, we show that if the inter-predictive relationships are made conditional on action, then the usual learning-efficiency advantage of TD methods over Monte Carlo (supervised learning) methods becomes particularly pronounced.
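As a rough illustration of the fixed-interval idea, the sketch below chains predictions so that prediction k is trained toward prediction k-1 at the next time step, with the first prediction trained toward the next observation. It is a minimal tabular example under simplifying assumptions that are not taken from the paper: a fully observable random walk on a ring of five states, an observation of 1 only in state 0, and a constant step size. The function names `learn_td_chain` and `true_k_step` are hypothetical.

```python
import random


def true_k_step(n_states, k_max):
    """Exact P(o_{t+k} = 1 | s_t = s) for k = 1..k_max on the ring walk."""
    # Observation is 1 only in state 0 (an assumption of this sketch).
    prev = [1.0 if s == 0 else 0.0 for s in range(n_states)]
    exact = []
    for _ in range(k_max):
        # One application of the transition: step left or right with prob 0.5.
        prev = [0.5 * prev[(s - 1) % n_states] + 0.5 * prev[(s + 1) % n_states]
                for s in range(n_states)]
        exact.append(prev)
    return exact


def learn_td_chain(n_states=5, k_max=3, steps=300_000, alpha=0.002, seed=0):
    """Learn k-step-ahead predictions with a chain of TD-style targets."""
    rng = random.Random(seed)
    # y[k][s] estimates the observation (k + 1) steps after being in state s.
    y = [[0.0] * n_states for _ in range(k_max)]
    s = 0
    for _ in range(steps):
        s_next = (s + rng.choice((-1, 1))) % n_states
        o_next = 1.0 if s_next == 0 else 0.0
        # Prediction 0 targets the next observation; prediction k targets
        # prediction k - 1 evaluated at the next state.
        targets = [o_next] + [y[k - 1][s_next] for k in range(1, k_max)]
        for k in range(k_max):
            y[k][s] += alpha * (targets[k] - y[k][s])
        s = s_next
    return y


if __name__ == "__main__":
    learned = learn_td_chain()
    exact = true_k_step(5, 3)
    for k, (row_l, row_e) in enumerate(zip(learned, exact), start=1):
        print(k, [round(v, 2) for v in row_l], row_e)
```

Each prediction here looks a fixed number of steps ahead, rather than forming a discounted sum over all future steps as a conventional TD value estimate would. The action-conditional case mentioned in the abstract is not covered by this sketch.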

prediction, td method, td network

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)