Automatic Curriculum Learning through Value Disagreement
Neural Information Processing Systems
Through reinforcement learning (RL), we have made massive strides towards solving tasks that have a single goal. However, in the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency. When biological agents learn, there is often an organized and meaningful order in which learning happens. Inspired by this, we propose setting up an automatic curriculum for goals that the agent needs to solve.
Feb-8-2026, 11:55:52 GMT