Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies

Yuan, Rui, Du, Simon S., Gower, Robert M., Lazaric, Alessandro, Xiao, Lin

Feb-21-2023–arXiv.org Artificial Intelligence

We consider infinite-horizon discounted Markov decision processes and study the convergence rates of the natural policy gradient (NPG) and the Q-NPG methods with the log-linear policy class. Using the compatible function approximation framework, both methods with log-linear policies can be written as inexact versions of the policy mirror descent (PMD) method. We show that both methods attain linear convergence rates and $\tilde{\mathcal{O}}(1/\epsilon^2)$ sample complexities using a simple, non-adaptive geometrically increasing step size, without resorting to entropy or other strongly convex regularization. Lastly, as a byproduct, we obtain sublinear convergence rates for both methods with arbitrary constant step size.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

Feb-21-2023

arXiv.org PDF

Add feedback

Country:
- North America
  - United States
    - Washington > King County
      - Seattle (0.04)
    - Pennsylvania > Philadelphia County
      - Philadelphia (0.04)
    - New York > New York County
      - New York City (0.14)
    - New Jersey > Mercer County
      - Princeton (0.04)
  - Puerto Rico > San Juan
    - San Juan (0.04)
  - Canada > Quebec
    - Montreal (0.04)
- Europe
  - Russia (0.04)
  - United Kingdom
    - Scotland > City of Edinburgh
      - Edinburgh (0.04)
    - England
      - Oxfordshire > Oxford (0.04)
      - Cambridgeshire > Cambridge (0.04)
  - France > Hauts-de-France
    - Nord > Lille (0.04)
- Asia
  - Russia (0.04)
  - Middle East > Jordan (0.04)

Genre:
- Research Report (0.82)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Statistical Learning (1.00)
    - Reinforcement Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.34)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found