Sufficient Exploration for Convex Q-learning

Lu, Fan; Mehta, Prashant; Meyn, Sean; Neu, Gergely

Oct-17-2022–arXiv.org Artificial Intelligence

Ever since the introduction of Watkins' Q-learning algorithm in the 1980s, the research community has searched for a general theory beyond the so-called tabular setting (in which the function class spans all possible functions of state and action). The natural extension of Q-learning to a general function approximation setting seeks to solve what is known as a projected Bellman equation (PBE). Few results are available that give sufficient conditions for the existence of a solution, or for convergence of the algorithm when a solution does exist [24, 17, 10]. Counterexamples show that conditions on the function class are required in general, even in a linear function approximation setting [1, 25, 6]. The GQ-algorithm of [14] is one success story, based on a relaxation of the PBE. Even if existence and stability of the algorithm were settled, we would still face the challenge of interpreting the output of a Q-learning algorithm based on the PBE criterion.
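To make the tabular setting concrete, here is a minimal sketch of Q-learning written as linear function approximation with a one-hot basis, so that the function class spans every function of state and action. This is an illustration only, not the algorithm studied in the paper. The cost-minimizing convention, the constant step size, the uniformly random exploration, and the names `one_hot`, `q_value`, `q_learning`, and `toy_step` are all assumptions made for this example.

```python
import random


def one_hot(x, u, n_states, n_actions):
    """Basis vector psi(x, u) with one entry per (state, action) pair."""
    psi = [0.0] * (n_states * n_actions)
    psi[x * n_actions + u] = 1.0
    return psi


def q_value(theta, psi):
    """Linear parameterization Q^theta(x, u) = theta . psi(x, u)."""
    return sum(t * p for t, p in zip(theta, psi))


def q_learning(step, n_states, n_actions, gamma=0.9, alpha=0.5,
               n_iter=5000, seed=0):
    """Q-learning with a one-hot basis (assumed cost-minimizing form)."""
    rng = random.Random(seed)
    theta = [0.0] * (n_states * n_actions)
    x = 0
    for _ in range(n_iter):
        u = rng.randrange(n_actions)  # uniformly random exploration
        cost, x_next = step(x, u)
        psi = one_hot(x, u, n_states, n_actions)
        # Minimum over actions of the current estimate at the next state.
        q_next = min(
            q_value(theta, one_hot(x_next, a, n_states, n_actions))
            for a in range(n_actions)
        )
        td_error = cost + gamma * q_next - q_value(theta, psi)
        # Only the entry for the visited (x, u) pair changes, since psi is one-hot.
        theta = [t + alpha * td_error * p for t, p in zip(theta, psi)]
        x = x_next
    return theta


def toy_step(x, u):
    """Hypothetical deterministic MDP: state 1 costs 1, action 1 switches state."""
    cost = float(x)
    x_next = 1 - x if u == 1 else x
    return cost, x_next


theta = q_learning(toy_step, n_states=2, n_actions=2)
print([round(t, 3) for t in theta])
```

For this toy model the optimal Q-function can be computed by hand as Q*(0,0) = 0, Q*(0,1) = 0.9, Q*(1,0) = 1.9, Q*(1,1) = 1, and the estimates should approach those values. Replacing `one_hot` with a lower-dimensional basis gives the general linear function approximation setting, where the counterexamples cited above show that convergence is no longer guaranteed.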

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States
  - Illinois (0.04)
  - New York > New York County
    - New York City (0.04)
  - Massachusetts
    - Middlesex County > Cambridge (0.14)
    - Suffolk County > Boston (0.04)
    - Plymouth County > Norwell (0.04)
  - Florida > Alachua County
    - Gainesville (0.14)
  - California > San Francisco County
    - San Francisco (0.14)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
  - France > Île-de-France
    - Paris > Paris (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
