Replay-Guided Adversarial Environment Design

Oct-9-2024, 12:48:50 GMT–Neural Information Processing Systems

Deep reinforcement learning (RL) agents may successfully generalize to new settings if trained on an appropriately diverse set of environment and task configurations. Unsupervised Environment Design (UED) is a promising self-supervised RL paradigm, wherein the free parameters of an underspecified environment are automatically adapted during training to the agent's capabilities, leading to the emergence of diverse training environments. Here, we cast Prioritized Level Replay (PLR), an empirically successful but theoretically unmotivated method that selectively samples randomly generated training levels, as UED. We argue that by curating completely random levels, PLR, too, can generate novel and complex levels for effective training. This insight reveals a natural class of UED methods we call Dual Curriculum Design (DCD).
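
To make the idea of "curating completely random levels" concrete, here is a minimal Python sketch of a replay-based level curriculum. It is an illustration under stated assumptions, not the authors' implementation: levels are integer seeds, `toy_score` is a hypothetical stand-in for whatever learning-potential estimate a real system would use, and the rank-based weighting with a temperature, the `LevelBuffer` class, and the `replay_prob` parameter are choices made for this example. Refinements such as staleness-aware sampling are omitted.

```python
import random


class LevelBuffer:
    """Toy curated buffer of level scores; all names here are hypothetical."""

    def __init__(self, capacity, temperature=0.3, seed=0):
        self.capacity = capacity
        self.temperature = temperature
        self.scores = {}  # level id -> most recent score
        self.rng = random.Random(seed)

    def update(self, level, score):
        """Insert or rescore a level, evicting the lowest-scoring one if over capacity."""
        self.scores[level] = score
        if len(self.scores) > self.capacity:
            worst = min(self.scores, key=self.scores.get)
            del self.scores[worst]

    def replay_weights(self):
        """Rank-based weights: rank 1 is the highest score, weight ~ (1/rank)^(1/temperature)."""
        ranked = sorted(self.scores, key=self.scores.get, reverse=True)
        raw = [(1.0 / (i + 1)) ** (1.0 / self.temperature) for i in range(len(ranked))]
        total = sum(raw)
        return {lvl: w / total for lvl, w in zip(ranked, raw)}

    def sample_replay(self):
        """Draw one buffered level according to the rank-based weights."""
        weights = self.replay_weights()
        levels = list(weights)
        return self.rng.choices(levels, weights=[weights[l] for l in levels], k=1)[0]


def toy_score(level):
    """Placeholder for a learning-potential estimate (an assumption, not from the source)."""
    return (level * 37 % 101) / 100.0


def run_curriculum(steps, replay_prob=0.5, capacity=4, seed=0):
    """Alternate between drawing fresh random levels and replaying curated ones."""
    rng = random.Random(seed)
    buffer = LevelBuffer(capacity, seed=seed)
    history = []
    for _ in range(steps):
        if buffer.scores and rng.random() < replay_prob:
            level, source = buffer.sample_replay(), "replay"
        else:
            level, source = rng.randrange(10_000), "new"
        # A real agent would roll out on the level here; we only rescore it.
        buffer.update(level, toy_score(level))
        history.append((source, level))
    return buffer, history
```

Calling `run_curriculum(50)` returns the final buffer and a log of which levels were newly sampled versus replayed. The point of the sketch is that the generator stays purely random, while the buffer acts as the curator: only levels that score well survive and get revisited.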

artificial intelligence, machine learning, reinforcement learning



Industry:
- Education (0.61)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)