Full Gradient Deep Reinforcement Learning for Average-Reward Criterion

Pagare, Tejas, Borkar, Vivek, Avrachenkov, Konstantin

Apr-7-2023–arXiv.org Artificial Intelligence

We extend the provably convergent Full Gradient DQN algorithm for discounted-reward Markov decision processes from Avrachenkov et al. (2021) to average-reward problems. In the neural function approximation setting, we experimentally compare the widely used RVI Q-Learning with the recently proposed Differential Q-Learning, under both Full Gradient DQN and DQN. We also extend this approach to learn Whittle indices for Markovian restless multi-armed bandits. We observe a better convergence rate for the proposed Full Gradient variant across different tasks.
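For readers unfamiliar with the average-reward setting, the following is a minimal sketch of tabular RVI Q-Learning, the baseline named in the abstract. It is not the paper's neural Full Gradient variant. The two-state deterministic MDP, the reference pair `ref`, the uniform exploration policy, and the constant step size `alpha` are all assumptions chosen for illustration. The update subtracts the reference value Q(s0, a0) from the usual target, so that this value acts as a running estimate of the average reward.

```python
import numpy as np

def rvi_q_learning(P, R, steps=50_000, alpha=0.05, ref=(0, 0), seed=0):
    """Tabular RVI Q-learning on a deterministic MDP (illustrative sketch).

    P[s, a] is the next state and R[s, a] the reward for taking a in s.
    `ref` is the (hypothetical) reference state-action pair whose Q-value
    plays the role of the average-reward estimate.
    """
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    Q = np.zeros((n_states, n_actions))
    s = 0
    for _ in range(steps):
        a = rng.integers(n_actions)      # uniform exploration (off-policy)
        s_next = P[s, a]
        # Average-reward target: no discount, offset by the reference value.
        td_error = R[s, a] + Q[s_next].max() - Q[ref] - Q[s, a]
        Q[s, a] += alpha * td_error
        s = s_next
    return Q, Q[ref]

# Toy MDP (an assumption, not from the paper):
# state 0: both actions move to state 1, with rewards 1 and 0;
# state 1: action 0 stays (reward 2), action 1 returns to state 0 (reward 0).
P = np.array([[1, 1], [1, 0]])
R = np.array([[1.0, 0.0], [2.0, 0.0]])

Q, gain = rvi_q_learning(P, R)
print("estimated average reward:", gain)
print("greedy action per state:", Q.argmax(axis=1))
```

In this toy problem the best long-run behaviour is to reach state 1 and stay there, which earns an average reward of 2 per step, so the returned estimate should approach that value. A neural variant would replace the table with a network and minimise the squared version of `td_error`; per the abstract, the Full Gradient approach differs from standard DQN in how that loss is differentiated.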

artificial intelligence, machine learning, reinforcement learning

Country:
- Oceania > New Zealand (0.04)
- Europe > France (0.04)
- North America
  - Canada (0.04)
  - United States
    - Massachusetts
      - Suffolk County > Boston (0.04)
      - Middlesex County > Cambridge (0.04)
    - California > San Francisco County
      - San Francisco (0.14)
- Asia > India
  - NCT > New Delhi (0.04)
  - Maharashtra > Mumbai (0.04)

Genre:
- Research Report (0.64)

Industry:
- Automobiles & Trucks (0.46)
- Transportation
  - Ground > Road (0.46)
  - Electric Vehicle (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.36)
