Enabling Adaptive Agent Training in Open-Ended Simulators by Targeting Diversity
Neural Information Processing Systems
The wider application of end-to-end learning methods to embodied decision-making domains remains bottlenecked by their reliance on a superabundance of training data representative of the target domain. Meta-reinforcement learning (meta-RL) approaches abandon the aim of zero-shot generalization, the goal of standard reinforcement learning (RL), in favor of few-shot adaptation, and thus hold promise for bridging larger generalization gaps. While learning this meta-level adaptive behavior still requires substantial data, efficient environment simulators approaching real-world complexity are growing in prevalence. Even so, hand-designing sufficiently diverse and numerous simulated training tasks for these complex domains is prohibitively labor-intensive. Domain randomization (DR) and procedural generation (PG), offered as solutions to this problem, require simulators to possess carefully defined parameters that translate directly to meaningful task diversity, a similarly prohibitive assumption. In this work, we present DIVA