Understanding the theoretical properties of projected Bellman equation, linear Q-learning, and approximate value iteration

Apr-16-2025–arXiv.org Artificial Intelligence

Han-Dong Lim limaries30@kaist.ac.kr
Donghwan Lee donghwan@kaist.ac.kr

Abstract

In this paper, we study the theoretical properties of the projected Bellman equation (PBE) and two algorithms to solve this equation: linear Q-learning and approximate value iteration (AVI). We consider two sufficient conditions for the existence of a solution to PBE: strictly negatively row dominating diagonal (SNRDD) assumption and a condition motivated by the convergence of AVI. The SNRDD assumption also ensures the convergence of linear Q-learning, and its relationship with the convergence of AVI is examined. Lastly, several interesting observations on the solution of PBE are provided when using ϵ-greedy policy.

1 Introduction

Reinforcement learning (RL) has achieved significant success, exemplified by the deep Q-network (DQN) (Mnih et al., 2015). This success can be largely ...

artificial intelligence, machine learning, reinforcement learning


Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
