Efficient Planning in Large MDPs with Weak Linear Function Approximation

Roshan Shariff and Csaba Szepesvári

arXiv.org Machine Learning 

Large-scale Markov decision processes (MDPs) require planning algorithms whose runtime is independent of the number of states of the MDP. We consider the planning problem in MDPs using linear value function approximation with only weak requirements: low approximation error for the optimal value function, and a small set of "core" states whose features span those of all other states. In particular, we make no assumptions about the representability of policies or of the value functions of non-optimal policies. Our algorithm produces almost-optimal actions for any state using a generative oracle (simulator) for the MDP, while its computation time scales polynomially with the number of features, core states, and actions, and with the effective horizon.
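To make the core-state idea concrete, here is a minimal sketch (not the authors' algorithm) of generative-model-based approximate value iteration anchored at a set of core states: the linear value estimate is backed up only at the core states via samples from the simulator, then refit by least squares on the core-state features. The names `simulator`, `features`, `core_states`, and all parameters are illustrative assumptions, not from the paper.

```python
import numpy as np

def core_state_avi(simulator, features, core_states, actions,
                   gamma=0.99, num_samples=100, num_iters=200, seed=0):
    """Approximate value iteration anchored at core states (a sketch,
    not the paper's algorithm).

    simulator(s, a, rng) -> (reward, next_state)  # generative oracle
    features(s) -> np.ndarray of shape (d,)       # feature map phi(s)
    Returns weights w such that V(s) ~= features(s) @ w.
    """
    rng = np.random.default_rng(seed)
    Phi = np.stack([features(s) for s in core_states])  # (m, d)
    w = np.zeros(Phi.shape[1])

    for _ in range(num_iters):
        targets = np.empty(len(core_states))
        for i, s in enumerate(core_states):
            q_values = []
            for a in actions:
                # Monte Carlo estimate of the Bellman backup for (s, a).
                total = 0.0
                for _ in range(num_samples):
                    r, s_next = simulator(s, a, rng)
                    total += r + gamma * features(s_next) @ w
                q_values.append(total / num_samples)
            targets[i] = max(q_values)  # greedy (optimal-value) backup
        # Refit the linear value function on the core states only.
        w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
    return w
```

Given such weights, a near-optimal action at an arbitrary state can be obtained by a one-step lookahead: estimate each action's sampled Bellman backup under `w` and act greedily. Note that the per-iteration cost above depends only on the number of core states, actions, features, and samples, never on the total number of states, which is the property the abstract highlights.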
