Temporal Abstraction in Temporal-difference Networks
Rafols, Eddie, Koop, Anna, Sutton, Richard S.
Neural Information Processing Systems
We present a generalization of temporal-difference networks to include temporally abstract options on the links of the question network. Temporal-difference (TD) networks have been proposed as a way of representing and learning a wide variety of predictions about the interaction between an agent and its environment. These predictions are compositional in that their targets are defined in terms of other predictions, and subjunctive in that they are about what would happen if an action or sequence of actions were taken. In conventional TD networks, the interrelated predictions are at successive time steps and contingent on a single action; here we generalize them to accommodate extended time intervals and contingency on whole ways of behaving. Our generalization is based on the options framework for temporal abstraction. The primary contribution of this paper is to introduce a new algorithm for intra-option learning in TD networks with function approximation and eligibility traces.
Dec-31-2006