Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics

Runzhe Wu, Ayush Sekhari, Akshay Krishnamurthy, Wen Sun

arXiv.org Artificial Intelligence 

In such settings, the learner aims to find a well-performing policy through repeated interactions with the environment to acquire knowledge. Due to the high dimensionality of the problem, function approximation techniques are used to generalize the knowledge acquired across the state and action space. Within the broad category of function approximation, model-free RL stands out as a particularly popular approach due to its simplicity of implementation, even though it can exhibit relatively low sample efficiency in practice. In model-free RL, the learner uses function approximation (e.g., an expressive function class such as deep neural networks) to model the state-action value functions of various policies in the underlying MDP. Indeed, the combination of model-free RL with various empirical exploration heuristics has led to notable empirical advances, including breakthroughs in game playing (Silver et al., 2016; Berner et al., 2019), robot manipulation (Andrychowicz et al., 2020), and self-driving (Chen et al., 2019). Theoretical advances have paralleled these practical successes, with tremendous progress in recent years toward building rigorous statistical foundations that explain which structures in the environment and the function class suffice for sample-efficient RL.
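To make the model-free setup concrete, below is a minimal sketch (not the authors' algorithm) of least-squares value iteration with a linear state-action value function, the standard model-free template studied under linear Bellman completeness. The feature map `phi`, the horizon, the action set, and the toy deterministic dynamics in `step` are all illustrative assumptions introduced here for the example.

```python
# Minimal sketch (illustrative only): fitted Q-iteration / least-squares value
# iteration with a linear Q-function. All quantities below (feature map,
# horizon, dynamics) are assumptions for the example, not the paper's setup.
import numpy as np

H, d = 5, 3                      # horizon and feature dimension (assumed)
actions = [0, 1]                 # finite action set (assumed)

def phi(s, a):
    """Assumed linear feature map phi(s, a) in R^d."""
    return np.array([s, float(a), s * a])

def step(s, a):
    """Toy deterministic dynamics and reward (illustrative only)."""
    s_next = 0.9 * s + 0.1 * a
    reward = -abs(s_next)
    return s_next, reward

def fitted_q_iteration(dataset):
    """Fit Q_h(s, a) = <theta_h, phi(s, a)> backward in h via least squares.

    `dataset[h]` holds (s, a, r, s') transitions collected at step h. Under
    linear Bellman completeness the Bellman backup of a linear Q-function
    stays linear, so each backward step reduces to one least-squares solve.
    """
    theta = [np.zeros(d) for _ in range(H + 1)]      # theta[H] = 0 (terminal)
    for h in reversed(range(H)):
        X, y = [], []
        for (s, a, r, s_next) in dataset[h]:
            # Regression target: r + max_{a'} Q_{h+1}(s', a')
            q_next = max(phi(s_next, b) @ theta[h + 1] for b in actions)
            X.append(phi(s, a))
            y.append(r + q_next)
        X, y = np.array(X), np.array(y)
        theta[h], *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

# Collect a small random-exploration dataset and fit the Q-functions.
rng = np.random.default_rng(0)
dataset = [[] for _ in range(H)]
for _ in range(200):
    s = rng.uniform(-1, 1)
    for h in range(H):
        a = rng.choice(actions)
        s_next, r = step(s, a)
        dataset[h].append((s, a, r, s_next))
        s = s_next

theta = fitted_q_iteration(dataset)
greedy_a = max(actions, key=lambda b: phi(0.5, b) @ theta[0])
print("greedy action at s=0.5, h=0:", greedy_a)
```

The sketch uses naive random exploration to gather data; the question of how to explore and fit such linear Q-functions both sample- and computationally efficiently is the subject of the paper itself.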
