
Neural Information Processing Systems

We found that training an inverse model is crucial for learning good representations. On the first row, a level from each environment that one-shot PPGS fails to solve (the white arrows represent the policy). Iterative Model Improvement: In general settings, collecting training trajectories by sampling actions uniformly at random does not grant sufficient coverage of the state space. GLAMOR [34] learns inverse dynamics to achieve visual goals in Atari games. The only difference with PPGS in terms of settings is that we allow GLAMOR to collect data on-policy and for more interactions (2M).
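The inverse-dynamics idea referenced above (predicting which action connected two consecutive observations) can be illustrated in a minimal tabular form. The 1-D gridworld and all names below are illustrative assumptions, not details from the paper:

```python
# Tabular inverse dynamics on a toy 1-D gridworld: given a consecutive
# state pair (s, s_next), record which action most often produced it.
from collections import Counter, defaultdict
import random

ACTIONS = (-1, +1)  # left, right

def step(s, a, size=5):
    # Deterministic dynamics, clipped to the grid boundaries.
    return max(0, min(size - 1, s + a))

def collect(n_steps=2000, size=5, seed=0):
    # Trajectories from a uniformly random policy.
    rng = random.Random(seed)
    s, data = 0, []
    for _ in range(n_steps):
        a = rng.choice(ACTIONS)
        s_next = step(s, a, size)
        data.append((s, s_next, a))
        s = s_next
    return data

def fit_inverse_model(data):
    # inverse[(s, s_next)] -> most frequent action for that transition.
    counts = defaultdict(Counter)
    for s, s_next, a in data:
        counts[(s, s_next)][a] += 1
    return {pair: c.most_common(1)[0][0] for pair, c in counts.items()}

inv = fit_inverse_model(collect())
print(inv[(1, 2)])  # the action that moved state 1 to state 2
```

In the deep-learning setting the lookup table is replaced by a network that takes two consecutive latent states and predicts the action; the point the excerpt makes is that this auxiliary prediction task shapes the learned representation.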







PrivilegedDreamer: Explicit Imagination of Privileged Information for Rapid Adaptation of Learned Policies

Byrd, Morgan, Crandell, Jackson, Das, Mili, Inman, Jessica, Wright, Robert, Ha, Sehoon

arXiv.org Artificial Intelligence

Numerous real-world control problems involve dynamics and objectives affected by unobservable hidden parameters, ranging from autonomous driving to robotic manipulation, which cause performance degradation during sim-to-real transfer. To represent these kinds of domains, we adopt hidden-parameter Markov decision processes (HIP-MDPs), which model sequential decision problems where hidden variables parameterize transition and reward functions. Existing approaches, such as domain randomization, domain adaptation, and meta-learning, simply treat the effect of hidden parameters as additional variance and often struggle to effectively handle HIP-MDP problems, especially when the rewards are parameterized by hidden variables. We introduce PrivilegedDreamer, a model-based reinforcement learning framework that extends the existing model-based approach by incorporating an explicit parameter estimation module. PrivilegedDreamer features a novel dual recurrent architecture that explicitly estimates hidden parameters from limited historical data and enables us to condition the model, actor, and critic networks on these estimated parameters. Our empirical analysis on five diverse HIP-MDP tasks demonstrates that PrivilegedDreamer outperforms state-of-the-art model-based, model-free, and domain adaptation learning algorithms. Additionally, we conduct ablation studies to justify the inclusion of each component in the proposed architecture.
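The core loop the abstract describes (estimate hidden parameters from recent history, then condition the policy on the estimate) can be sketched on a toy system. This is a minimal simplification, not the paper's dual recurrent architecture: the linear system, least-squares estimator, and function names are all assumptions for illustration.

```python
# Sketch: online estimation of a hidden dynamics parameter, with a
# parameter-conditioned policy. Toy system: x_{t+1} = x_t + theta * u_t,
# where the gain theta is hidden from the agent.

def estimate_theta(history, default):
    # Least-squares fit of theta from (x_next - x) = theta * u.
    num = sum((xn - x) * u for x, u, xn in history if u != 0.0)
    den = sum(u * u for x, u, xn in history if u != 0.0)
    return num / den if den > 0 else default

def policy(x, theta_est, target=1.0):
    # Conditioning on the estimate: invert the estimated gain to
    # reach the target in one step (if the estimate is right).
    return (target - x) / theta_est

def rollout(theta, n=20):
    x, history, est = 0.0, [], 1.0  # est starts at a wrong prior
    for _ in range(n):
        u = policy(x, est)
        x_next = x + theta * u
        history.append((x, u, x_next))
        est = estimate_theta(history, default=est)
        x = x_next
    return x, est

x_final, theta_est = rollout(theta=0.5)
```

With the true gain 0.5 but a prior of 1.0, the first transition already identifies theta, after which the conditioned controller reaches the target exactly; the paper's contribution is doing this estimation with recurrent networks from pixels and rewards rather than from known linear dynamics.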


HarmonyDream: Task Harmonization Inside World Models

Ma, Haoyu, Wu, Jialong, Feng, Ningya, Xiao, Chenjun, Li, Dong, Hao, Jianye, Wang, Jianmin, Long, Mingsheng

arXiv.org Artificial Intelligence

Model-based reinforcement learning (MBRL) holds the promise of sample-efficient learning by utilizing a world model, which models how the environment works and typically encompasses components for two tasks: observation modeling and reward modeling. In this paper, through a dedicated empirical investigation, we gain a deeper understanding of the role each task plays in world models and uncover the overlooked potential of sample-efficient MBRL by mitigating the domination of either observation or reward modeling. Our key insight is that while prevalent approaches of explicit MBRL attempt to restore abundant details of the environment via observation models, it is difficult due to the environment's complexity and limited model capacity. On the other hand, reward models, while dominating implicit MBRL and adept at learning compact task-centric dynamics, are inadequate for sample-efficient learning without richer learning signals. Motivated by these insights and discoveries, we propose a simple yet effective approach, HarmonyDream, which automatically adjusts loss coefficients to maintain task harmonization, i.e. a dynamic equilibrium between the two tasks in world model learning. Our experiments show that the base MBRL method equipped with HarmonyDream gains 10%-69% absolute performance boosts on visual robotic tasks and sets a new state-of-the-art result on the Atari 100K benchmark.
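The mechanism the abstract names, automatically adjusting loss coefficients so neither observation nor reward modeling dominates, can be sketched with a simple running-magnitude scheme. This EMA-based rescaling is an illustrative simplification of the idea, not HarmonyDream's exact harmonizer, and all names here are assumptions:

```python
# Sketch: dynamic two-task loss balancing. Each loss is divided by its
# running magnitude, so both tasks contribute at roughly unit scale to
# the total objective regardless of their raw magnitudes.

class LossHarmonizer:
    def __init__(self, decay=0.99, eps=1e-8):
        self.decay, self.eps = decay, eps
        self.ema = {}  # running magnitude per loss name

    def __call__(self, losses):
        total = 0.0
        for name, value in losses.items():
            prev = self.ema.get(name, abs(value))
            self.ema[name] = self.decay * prev + (1 - self.decay) * abs(value)
            total += value / (self.ema[name] + self.eps)  # ~unit scale
        return total

h = LossHarmonizer()
# A pixel-reconstruction loss can be orders of magnitude larger than a
# scalar reward-prediction loss; after rescaling, each contributes ~1.
balanced = h({"observation": 250.0, "reward": 0.02})
```

The design point is that fixed hand-tuned coefficients must be re-found per environment, whereas a dynamically normalized objective maintains the equilibrium between the two tasks as their loss scales drift during training.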