iLSTD: Eligibility Traces and Convergence Analysis
Geramifard, Alborz, Bowling, Michael, Zinkevich, Martin, Sutton, Richard S.
–Neural Information Processing Systems
In this paper, we generalize the previous iLSTD algorithm and present three new results: (1) the first convergence proof for an iLSTD algorithm; (2) an extension to incorporate eligibility traces without changing the asymptotic computational complexity; and (3) the first empirical results with an iLSTD algorithm for a problem (mountain car) with feature vectors large enough (n 10, 000) to show substantial computational advantages over LSTD.
Neural Information Processing Systems
Dec-31-2007
- Genre:
- Research Report > New Finding (0.67)