Exploration via Epistemic Value Estimation

Schmitt, Simon, Shawe-Taylor, John, van Hasselt, Hado

Mar-7-2023–arXiv.org Artificial Intelligence

How to efficiently explore in reinforcement learning is an open problem. Many exploration algorithms employ the epistemic uncertainty of their own value predictions -- for instance to compute an exploration bonus or upper confidence bound. Unfortunately the required uncertainty is difficult to estimate in general with function approximation. We propose epistemic value estimation (EVE): a recipe that is compatible with sequential decision making and with neural network function approximators. It equips agents with a tractable posterior over all their parameters from which epistemic value uncertainty can be computed efficiently. We use the recipe to derive an epistemic Q-Learning agent and observe competitive performance on a series of benchmarks. Experiments confirm that the EVE recipe facilitates efficient exploration in hard exploration tasks.

approximation, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

Mar-7-2023

arXiv.org PDF

Add feedback

Country:
- North America
  - Canada > Ontario
    - Toronto (0.28)
  - United States (1.00)

Genre:
- Research Report (0.50)
- Workflow (0.47)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.93)
    - Neural Networks (1.00)
    - Reinforcement Learning (1.00)
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found