Convergent Combinations of Reinforcement Learning with Linear Function Approximation

Dec-31-2003–Neural Information Processing Systems

Convergence for iterative reinforcement learning algorithms like TD(0) depends on the sampling strategy for the transitions. However, in practical applications it is convenient to take transition data from arbitrary sources without losing convergence. In this paper we investigate the problem of repeated synchronous updates based on a fixed set of transitions. This allows us to analyse whether a certain reinforcement learning algorithm and a certain function approximator are compatible. For the combination of the residual gradient algorithm with grid-based linear interpolation we show that there exists a universal constant learning rate such that the iteration converges independently of the concrete transition data.

1 Introduction

The strongest convergence guarantees for reinforcement learning (RL) algorithms are available for the tabular case, where temporal difference algorithms for both policy evaluation and the general control problem converge with probability one independently of the concrete sampling strategy as long as all states are sampled infinitely often and the learning rate is decreased appropriately [2].
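The setting described in the abstract (repeated synchronous updates over a fixed set of transitions, using the residual gradient algorithm with grid-based linear interpolation) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the one-dimensional state space [0, 1], the uniform grid, the randomly generated transitions, the averaging over transitions, the discount factor, and the step size `alpha = 0.25`. The step size is simply small enough for this particular setup and is not the universal constant derived in the paper.

```python
import random

def hat_features(s, n_nodes):
    """Linear interpolation weights of s in [0, 1] on a uniform grid (assumed 1-D)."""
    x = min(max(s, 0.0), 1.0) * (n_nodes - 1)
    i = min(int(x), n_nodes - 2)      # index of the left grid node
    t = x - i                         # relative position between the two nodes
    phi = [0.0] * n_nodes
    phi[i] = 1.0 - t
    phi[i + 1] = t
    return phi

def value(w, phi):
    """Interpolated value: dot product of node weights and features."""
    return sum(wk * pk for wk, pk in zip(w, phi))

def bellman_loss(w, transitions, gamma, n_nodes):
    """Half the mean squared Bellman residual over the fixed transition set."""
    total = 0.0
    for s, r, s_next in transitions:
        delta = (r + gamma * value(w, hat_features(s_next, n_nodes))
                 - value(w, hat_features(s, n_nodes)))
        total += delta * delta
    return 0.5 * total / len(transitions)

def residual_gradient_sweep(w, transitions, gamma, alpha, n_nodes):
    """One synchronous update: accumulate over all transitions, then step once."""
    grad = [0.0] * n_nodes
    for s, r, s_next in transitions:
        phi = hat_features(s, n_nodes)
        phi_next = hat_features(s_next, n_nodes)
        delta = r + gamma * value(w, phi_next) - value(w, phi)
        for k in range(n_nodes):
            # Gradient of 0.5 * delta^2 with respect to w[k]
            grad[k] += delta * (gamma * phi_next[k] - phi[k])
    m = len(transitions)
    return [w[k] - alpha * grad[k] / m for k in range(n_nodes)]

# Hypothetical fixed transition data (state, reward, next state) from an arbitrary source.
random.seed(0)
transitions = [(random.random(), random.random(), random.random()) for _ in range(50)]

n_nodes, gamma, alpha = 6, 0.9, 0.25   # illustrative choices, not the paper's values
w = [0.0] * n_nodes
losses = [bellman_loss(w, transitions, gamma, n_nodes)]
for _ in range(200):
    w = residual_gradient_sweep(w, transitions, gamma, alpha, n_nodes)
    losses.append(bellman_loss(w, transitions, gamma, n_nodes))

print(losses[0], losses[-1])
```

Because the interpolation weights are non-negative and sum to one, each vector `phi(s) - gamma * phi(s')` has Euclidean norm at most `1 + gamma`, so in this averaged formulation the step size above yields a non-increasing Bellman residual whatever transitions are supplied. This only mirrors the flavour of the paper's result; the paper's own analysis and constant should be consulted for the actual statement.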

artificial intelligence, machine learning, reinforcement learning


Country:
- Europe > Germany (0.28)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (1.00)
  - Representation & Reasoning > Uncertainty
    - Fuzzy Logic (0.44)
