Explicit Planning for Efficient Exploration in Reinforcement Learning
Liangpeng Zhang, Ke Tang, Xin Yao
Neural Information Processing Systems
Efficient exploration is crucial to achieving good performance in reinforcement learning. Existing systematic exploration strategies (R-MAX, MBIE, UCRL, etc.), despite being theoretically promising, are essentially greedy strategies that follow predefined heuristics. When the heuristics do not match the dynamics of Markov decision processes (MDPs) well, an excessive amount of time can be wasted travelling through already-explored states, lowering the overall efficiency. We argue that explicit planning for exploration can help alleviate such a problem, and propose the Value Iteration for Exploration Cost (VIEC) algorithm, which computes the optimal exploration scheme by solving an augmented MDP.
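The full VIEC construction solves an augmented MDP whose state tracks the remaining exploration demand of every state-action pair. As a rough illustration of the planning idea only, the following Python sketch runs value iteration to minimize expected exploration cost in a drastically simplified setting: every step costs one unit, and executing any still-under-explored action satisfies its demand and terminates. The names (`viec_sketch`, `under_explored`) are hypothetical and not from the paper.

```python
import numpy as np

def viec_sketch(P, under_explored, gamma=1.0, n_iter=1000, tol=1e-8):
    """Simplified value iteration for exploration cost (illustrative only).

    Simplification relative to the paper: each step costs 1, and executing
    an under-explored (s, a) pair pays one final step and terminates, so
    V[s] approximates the expected number of steps needed to reach and try
    some under-explored action from state s. Assumes at least one
    under-explored pair is reachable from every state.

    P: array of shape (S, A, S), P[s, a, s'] = transition probability.
    under_explored: set of (s, a) pairs still needing visits.
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(n_iter):
        Q = 1.0 + gamma * (P @ V)      # one step of travel cost, shape (S, A)
        for (s, a) in under_explored:
            Q[s, a] = 1.0              # trying (s, a) satisfies its demand
        V_new = Q.min(axis=1)          # act to minimize exploration cost
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    policy = Q.argmin(axis=1)          # greedy policy w.r.t. exploration cost
    return V, policy
```

In the actual VIEC algorithm the value function is indexed by the full demand vector, so satisfying one demand transitions to a smaller augmented state rather than terminating; the sketch above collapses that structure into a single absorbing goal.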