Model-based Reinforcement Learning and the Eluder Dimension

Mar-13-2024, 06:15:47 GMT – Neural Information Processing Systems

We consider the problem of learning to optimize an unknown Markov decision process (MDP). We show that, if the MDP can be parameterized within some known function class, we can obtain regret bounds that scale with the dimensionality, rather than cardinality, of the system.

dimension, eluder dimension, reinforcement

Country:
- North America > United States
  - Massachusetts > Middlesex County
    - Belmont (0.04)
  - California > Santa Clara County
    - Palo Alto (0.04)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (0.68)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.48)