On the Global Convergence of Fitted Q-Iteration with Two-layer Neural Network Parametrization

Gaur, Mudit; Aggarwal, Vaneet; Agarwal, Mridul

Jan-30-2023, arXiv.org Artificial Intelligence

Deep Q-learning based algorithms have been applied successfully to many decision-making problems, but their theoretical foundations are not as well understood. In this paper, we study Fitted Q-Iteration with a two-layer ReLU neural network parameterization and establish sample complexity guarantees for the algorithm. Our approach estimates the Q-function in each iteration by solving a convex optimization problem. We show that this approach achieves a sample complexity of $\tilde{\mathcal{O}}(1/\epsilon^{2})$, which is order-optimal. This result holds for countable state spaces and does not require any assumptions on the MDP, such as a linear or low-rank structure.
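To make the iteration structure concrete, here is a minimal sketch of Fitted Q-Iteration on a toy problem. It is not the paper's algorithm: the paper fits the two-layer ReLU network through a convex reformulation, whereas this sketch freezes a random hidden layer and fits only the output weights by least squares, which is a different (and simpler) convex problem. The chain MDP, the one-hot encoding, and every name below (`fitted_q_iteration`, `relu_features`, `width`, and so on) are assumptions made for illustration.

```python
import numpy as np


def relu_features(X, W, b):
    """Hidden-layer activations of a two-layer ReLU network."""
    return np.maximum(X @ W + b, 0.0)


def fitted_q_iteration(n_states=5, n_actions=2, gamma=0.9,
                       width=64, n_iters=200, seed=0):
    """FQI on a hypothetical deterministic chain MDP (illustration only)."""
    rng = np.random.default_rng(seed)

    # Toy chain: action 1 moves right, action 0 moves left (clipped at the
    # ends); reward 1 whenever the next state is the last state.
    def step(s, a):
        s_next = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
        reward = 1.0 if s_next == n_states - 1 else 0.0
        return s_next, reward

    # One-hot encoding of a (state, action) pair.
    def encode(s, a):
        x = np.zeros(n_states * n_actions)
        x[s * n_actions + a] = 1.0
        return x

    # Batch of transitions covering every (state, action) pair once.
    pairs = [(s, a) for s in range(n_states) for a in range(n_actions)]
    X = np.stack([encode(s, a) for s, a in pairs])
    transitions = [step(s, a) for s, a in pairs]
    next_states = np.array([t[0] for t in transitions])
    rewards = np.array([t[1] for t in transitions])

    # Assumption: the hidden layer is random and frozen, so each fit is a
    # linear least-squares problem in the output weights theta.
    W = rng.normal(size=(X.shape[1], width))
    b = rng.normal(size=width)
    Phi = relu_features(X, W, b)

    theta = np.zeros(width)
    for _ in range(n_iters):
        # Current Q estimate, arranged as a (state, action) table.
        Q = (Phi @ theta).reshape(n_states, n_actions)
        # Bellman targets: r + gamma * max_a' Q(s', a').
        targets = rewards + gamma * Q[next_states].max(axis=1)
        # Regress the targets onto the ReLU features (convex problem).
        theta, *_ = np.linalg.lstsq(Phi, targets, rcond=None)

    return (Phi @ theta).reshape(n_states, n_actions)
```

Calling `fitted_q_iteration()` returns a table of estimated Q-values with one row per state and one column per action; `Q.argmax(axis=1)` then gives the greedy action in each state. Because the batch here covers every state-action pair and the network is wide enough to interpolate the targets, the toy setting sidesteps the statistical and approximation errors that the paper's sample complexity analysis is concerned with.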

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States
  - Indiana > Tippecanoe County
    - West Lafayette (0.04)
    - Lafayette (0.04)
  - California > Santa Clara County
    - Palo Alto (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (0.50)

Industry:
- Energy (0.67)
- Leisure & Entertainment > Games (0.45)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks (1.00)
