Regularized Q-Learning with Linear Function Approximation

Xi, Jiachen, Garcia, Alfredo, Momcilovic, Petar

Jan-26-2024–arXiv.org Artificial Intelligence

Several successful reinforcement learning algorithms make use of regularization to promote multi-modal policies that exhibit enhanced exploration and robustness. With function approximation, the convergence properties of some of these algorithms (e.g., soft Q-learning) are not well understood. In this paper, we consider a single-loop algorithm for minimizing the projected Bellman error with finite-time convergence guarantees in the case of linear function approximation. The algorithm operates on two scales: a slower scale for updating the target network of the state-action values, and a faster scale for approximating the Bellman backups in the subspace spanned by the basis vectors. We show that, under certain assumptions, the proposed algorithm converges to a stationary point in the presence of Markovian noise. In addition, we provide a performance guarantee for the policies derived from the proposed algorithm.
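The two-scale structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the log-sum-exp (entropy) regularizer, the uniform behavior policy, the step sizes, and the names `two_timescale_soft_q`, `toy_features`, and `toy_step` are all assumptions chosen for illustration. A fast iterate `w` regresses onto the regularized Bellman backup computed from the target parameters, while the slow iterate `theta` (the target) drifts toward `w`, all within a single loop over one Markovian trajectory.

```python
import math
import random


def soft_value(q_values, tau):
    """Entropy-regularized state value: tau * log(sum_a exp(q_a / tau))."""
    m = max(q_values)  # subtract the max for numerical stability
    return m + tau * math.log(sum(math.exp((q - m) / tau) for q in q_values))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def two_timescale_soft_q(features, step, n_actions, gamma=0.9, tau=0.5,
                         fast_lr=0.1, slow_lr=0.01, n_steps=5000, seed=0):
    """Illustrative two-time-scale update (hypothetical helper, not from the paper).

    features(s, a) -> list of floats; step(s, a, rng) -> (reward, next_state).
    """
    rng = random.Random(seed)
    d = len(features(0, 0))
    theta = [0.0] * d  # slow scale: target parameters
    w = [0.0] * d      # fast scale: fit to the Bellman backup
    s = 0
    for _ in range(n_steps):
        a = rng.randrange(n_actions)  # uniform behavior policy (assumption)
        r, s_next = step(s, a, rng)
        phi = features(s, a)
        # Regularized Bellman backup evaluated with the target parameters.
        q_next = [dot(theta, features(s_next, b)) for b in range(n_actions)]
        backup = r + gamma * soft_value(q_next, tau)
        # Fast scale: stochastic least-squares step toward the backup,
        # i.e. an approximate projection onto the span of the features.
        err = backup - dot(w, phi)
        w = [wi + fast_lr * err * p for wi, p in zip(w, phi)]
        # Slow scale: move the target parameters toward the fast iterate.
        theta = [ti + slow_lr * (wi - ti) for ti, wi in zip(theta, w)]
        s = s_next
    return theta


def toy_features(s, a):
    # Hypothetical 3-dimensional basis for a 2-state, 2-action MDP.
    return [1.0, float(s), float(a)]


def toy_step(s, a, rng):
    # Hypothetical dynamics: the action picks the next state with prob. 0.8.
    reward = 1.0 if (s == 1 and a == 1) else 0.0
    next_state = a if rng.random() < 0.8 else 1 - a
    return reward, next_state


theta = two_timescale_soft_q(toy_features, toy_step, n_actions=2)
```

Keeping `slow_lr` well below `fast_lr` is what separates the two scales in this sketch: the target changes slowly enough that the regression in `w` sees a nearly fixed backup. A softmax policy over the learned values, proportional to `exp(Q(s, a) / tau)`, would be the natural policy to read off from `theta` under this regularizer.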

algorithm, approximation, function approximation, (15 more...)

Country:
- North America > United States
  - Texas > Brazos County
    - College Station (0.14)
  - California > Alameda County
    - Berkeley (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (0.82)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (1.00)
  - Representation & Reasoning > Uncertainty
    - Fuzzy Logic (0.62)