Entropic Desired Dynamics for Intrinsic Control: Supplemental Material

Hansen, Steven

Neural Information Processing Systems

While this is not close to the state-of-the-art in general, Figure 2 shows the effect of action entropy on exploratory behavior in Montezuma's Revenge, measured as the number of unique avatar positions visited. Full training curves across all 6 Atari games are shown in Figure 1, including the random policy baseline; the full curves are included for completeness. At each state visited by the agent evaluator during training, the agent's state (consisting of the avatar's position) was recorded. The compute cluster we performed experiments on is heterogeneous, with features such as host-sharing and adaptive load-balancing, and we verified that this did not hamper performance.
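The coverage metric mentioned above (unique avatar positions visited) and the effect of action entropy are easy to reproduce in miniature. Below is a purely illustrative Python sketch, not the paper's environment or policy: a softmax policy over four moves on an infinite grid, with a temperature knob controlling action entropy. Every name and the hash-based per-state logits are invented for the example.

```python
import numpy as np

def unique_positions(temperature: float, steps: int = 5000, seed: int = 0) -> int:
    """Toy coverage metric: unique (x, y) cells visited by a softmax policy.

    Illustrative stand-in for the "number of unique avatar positions"
    metric; the grid world and the hash-based state logits are made up.
    """
    rng = np.random.default_rng(seed)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    pos, visited = (0, 0), {(0, 0)}
    for _ in range(steps):
        # Fixed, state-dependent action preferences (deterministic hash).
        logits = np.random.default_rng(hash(pos) % 2**32).normal(size=4)
        logits = logits / temperature
        logits -= logits.max()              # numerical stability
        p = np.exp(logits)
        p /= p.sum()
        dx, dy = moves[rng.choice(4, p=p)]
        pos = (pos[0] + dx, pos[1] + dy)
        visited.add(pos)
    return len(visited)

for t in (0.05, 0.5, 5.0):
    print(f"temperature={t}: {unique_positions(t)} unique positions visited")
```

With a near-deterministic policy (low temperature) the walk quickly falls into a loop and coverage stalls, while higher action entropy keeps expanding the visited set, mirroring the qualitative effect the excerpt attributes to Figure 2.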


PoE-World: Compositional World Modeling with Products of Programmatic Experts

Piriyakulkij, Wasu Top, Liang, Yichao, Tang, Hao, Weller, Adrian, Kryven, Marta, Ellis, Kevin

arXiv.org Artificial Intelligence

Learning how the world works is central to building AI agents that can adapt to complex environments. Traditional world models based on deep learning demand vast amounts of training data, and do not flexibly update their knowledge from sparse observations. Recent advances in program synthesis using Large Language Models (LLMs) give an alternate approach which learns world models represented as source code, supporting strong generalization from little data. To date, application of program-structured world models remains limited to natural language and grid-world domains. We introduce a novel program synthesis method for effectively modeling complex, non-gridworld domains by representing a world model as an exponentially-weighted product of programmatic experts (PoE-World) synthesized by LLMs. We show that this approach can learn complex, stochastic world models from just a few observations. We evaluate the learned world models by embedding them in a model-based planning agent, demonstrating efficient performance and generalization to unseen levels on Atari's Pong and Montezuma's Revenge. We release our code and display the learned world models and videos of the agent's gameplay at https://topwasu.github.io/poe-world.
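The core mechanism named in the abstract, an exponentially weighted product of experts, can be written down generically: each expert scores the candidate next states, and the composite model multiplies the expert probabilities with per-expert exponents before renormalizing. The sketch below is a generic PoE combination rule, not PoE-World's actual code; the two toy experts and their weights are invented.

```python
import numpy as np

def product_of_experts(expert_probs: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Exponentially weighted product of expert distributions.

    expert_probs: (K, N) array; row k is expert k's distribution over N
    candidate next states. weights: (K,) non-negative expert weights.
    Returns p(s') proportional to prod_k p_k(s')**w_k, computed in log
    space for numerical stability. Generic PoE sketch, not PoE-World's code.
    """
    log_p = weights @ np.log(expert_probs + 1e-12)  # weighted sum of logs
    log_p -= log_p.max()                            # stabilize before exp
    p = np.exp(log_p)
    return p / p.sum()

# Two invented "programmatic experts" scoring three candidate next states:
gravity_expert   = np.array([0.7, 0.2, 0.1])  # e.g. "objects fall"
collision_expert = np.array([0.1, 0.1, 0.8])  # e.g. "walls block motion"
combined = product_of_experts(np.stack([gravity_expert, collision_expert]),
                              weights=np.array([1.0, 2.0]))
print(combined)  # the higher-weighted expert dominates the product
```

Working in log space turns the weighted product into a weighted sum of log-probabilities, which avoids underflow when many experts (or sharp distributions) are multiplied together.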



Parametrically Retargetable Decision-Makers Tend To Seek Power

Neural Information Processing Systems

In fully observable environments, most reward functions have an optimal policy which seeks power by keeping options open and staying alive [Turner et al., 2021]. However, the real world is neither fully observable, nor must trained agents be even approximately reward-optimal.
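A toy Monte Carlo experiment makes the "most reward functions" claim concrete, under assumptions of our own choosing (iid uniform rewards, a three-outcome decision; this is an illustration, not the paper's formal result): shutting down fixes a single reward, whereas staying alive preserves two options and collects the better one.

```python
import numpy as np

# Toy illustration of retargetable power-seeking (not the paper's proof):
# from the start state, "shut down" yields a single terminal reward r_T,
# while "stay alive" keeps two options open and later collects
# max(r_C, r_D). For iid uniform rewards, staying alive is optimal with
# probability P(max(r_C, r_D) > r_T) = 2/3: most sampled reward functions
# favor the branch that preserves more options.
rng = np.random.default_rng(0)
r = rng.uniform(size=(100_000, 3))            # columns: r_T, r_C, r_D
stay_alive_optimal = np.maximum(r[:, 1], r[:, 2]) > r[:, 0]
print(stay_alive_optimal.mean())              # ~0.667, matching 2/3
```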




We thank the reviewers for thoroughly commenting on our article; their comments give us the opportunity to improve

Neural Information Processing Systems

For Montezuma's Revenge, the average prediction error remains high; in this case, the irrelevant intrinsic reward completely obscures the target goal. The less information is available about a transition, the more uncertain the model and the higher the error. R4: in general we cannot guarantee that the prediction error is a measure of uncertainty. For an intuition about the W-MSE representation and stochasticity, consider the noisy TV experiment: the environment contains a TV showing unpredictable noise, which a prediction-error-driven agent finds perpetually surprising. We evaluate in Atari and compare with the best-performing methods such as NGU. To show how the seed affects performance, we included Figure 1 with training dynamics in the supplementary material.
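The noisy-TV point, that prediction error keeps paying intrinsic reward for irreducible stochasticity, can be sketched with a running-mean predictor standing in for a learned forward model. Everything below (the two signals, the predictor) is an invented minimal example, not the authors' method.

```python
import numpy as np

def running_errors(samples: np.ndarray) -> np.ndarray:
    """Squared prediction errors of a running-mean predictor.

    The running mean stands in for a learned forward model; its squared
    error plays the role of the intrinsic reward in this toy setup.
    """
    pred, errs = 0.0, []
    for i, x in enumerate(samples, start=1):
        errs.append((x - pred) ** 2)   # intrinsic reward = prediction MSE
        pred += (x - pred) / i         # update prediction toward the mean
    return np.array(errs)

rng = np.random.default_rng(0)
deterministic = np.full(1000, 0.5)     # perfectly predictable channel
noisy_tv = rng.uniform(size=1000)      # irreducibly stochastic channel

print("deterministic, mean error over last 100 steps:",
      running_errors(deterministic)[-100:].mean())
print("noisy TV,      mean error over last 100 steps:",
      running_errors(noisy_tv)[-100:].mean())
```

The error on the predictable signal collapses to zero after the first step, while the noisy-TV error plateaus at the signal's variance (about 1/12 for uniform noise), so a prediction-error-driven agent would keep being rewarded for watching the TV.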