DreamerV3-XP: Optimizing exploration through uncertainty estimation

Bierling, Lukas; Pasero, Davide; Bertrand, Jan-Henrik; Van Gerwen, Kiki

Oct-27-2025–arXiv.org Artificial Intelligence

We introduce DreamerV3-XP, an extension of DreamerV3 that improves exploration and learning efficiency. It adds (i) a prioritized replay buffer that scores trajectories by return, reconstruction loss, and value error, and (ii) an intrinsic reward based on disagreement over predicted environment rewards from an ensemble of world models. We evaluate DreamerV3-XP on a subset of the Atari100k and DeepMind Control Visual Benchmark tasks. The experiments confirm the original DreamerV3 results and show that our extensions lead to faster learning and lower dynamics model loss, particularly in sparse-reward settings.
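
The two extensions named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the three trajectory scores are combined or how disagreement is measured, so the weighted sum, the clipping of negative returns, the proportional sampling, and the use of the standard deviation across ensemble members are all assumptions. The function names and parameters (trajectory_priority, sample_indices, disagreement_reward, weights, alpha, scale) are hypothetical.

```python
import random
import statistics


def trajectory_priority(ret, recon_loss, value_error,
                        weights=(1.0, 1.0, 1.0), eps=1e-6):
    """Combine return, reconstruction loss, and value error into one priority.

    Assumption: a weighted sum of non-negative terms. Negative returns are
    clipped to zero and the value error enters by magnitude, so the result
    is always positive (eps keeps it strictly above zero).
    """
    w_ret, w_rec, w_val = weights
    return (w_ret * max(ret, 0.0)
            + w_rec * recon_loss
            + w_val * abs(value_error)
            + eps)


def sample_indices(priorities, k, alpha=1.0, rng=random):
    """Draw k trajectory indices with probability proportional to priority**alpha.

    alpha is a hypothetical temperature: 0 gives uniform sampling,
    larger values concentrate sampling on high-priority trajectories.
    """
    scaled = [p ** alpha for p in priorities]
    return rng.choices(range(len(priorities)), weights=scaled, k=k)


def disagreement_reward(ensemble_reward_preds, scale=1.0):
    """Intrinsic reward from ensemble disagreement on the predicted reward.

    Assumption: disagreement is the population standard deviation of the
    reward predicted by each ensemble member for the same state.
    """
    return scale * statistics.pstdev(ensemble_reward_preds)


if __name__ == "__main__":
    # Three stored trajectories described by (return, recon loss, value error).
    stats = [(0.0, 0.1, 0.05), (1.0, 0.4, -0.3), (0.0, 0.9, 0.6)]
    priorities = [trajectory_priority(*s) for s in stats]
    print("priorities:", priorities)
    print("sampled:", sample_indices(priorities, k=5, rng=random.Random(0)))

    # Members agree on a familiar state and disagree on a novel one.
    print("familiar:", disagreement_reward([0.50, 0.51, 0.49]))
    print("novel:", disagreement_reward([0.0, 0.9, 0.3]))
```

Under these assumptions, a state where the ensemble members agree yields an intrinsic reward near zero, while a state with scattered predictions yields a larger bonus. That is one way such a signal could help in sparse-reward tasks, where the environment reward alone gives little guidance.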



Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (0.95)
  - Cognitive Science > Problem Solving (0.40)
  - Natural Language > Large Language Model (0.35)
  - Machine Learning > Neural Networks (0.35)
