Review for NeurIPS paper: Self-Paced Deep Reinforcement Learning

Jan-25-2025, 06:54:32 GMT–Neural Information Processing Systems

Summary and Contributions: After reading the authors' response, I've updated my score from (4) to (5). A fixed set of curriculum tasks is given, and the algorithm can sample tasks from this set at every step. The hope is that selecting tasks smartly and adaptively will speed up learning. The final goal is to maximize performance with respect to a fixed, known target distribution over tasks. The proposed algorithm alternates between two types of steps: policy improvement for a fixed task (or "context") distribution, and "task distribution adjustment" for a fixed policy.
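To make the alternating structure concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the function names (`alternate`, `improve_policy`, `adjust_task_dist`), the two-task setup, and both update rules are assumptions chosen only to illustrate the "fix one, update the other" pattern described above.

```python
def improve_policy(skill, task_dist, lr=0.5):
    """Toy policy improvement with the task distribution held fixed.

    Hypothetical rule: skill on each task grows in proportion to how
    often that task is sampled, saturating at 1.0.
    """
    return {t: s + lr * task_dist[t] * (1.0 - s) for t, s in skill.items()}


def adjust_task_dist(skill, task_dist, target_dist):
    """Toy task-distribution adjustment with the policy held fixed.

    Hypothetical rule: move toward the target distribution by a step
    equal to the current expected skill, so the curriculum advances
    only as fast as the policy improves.
    """
    step = sum(task_dist[t] * skill[t] for t in task_dist)
    return {t: (1.0 - step) * task_dist[t] + step * target_dist[t]
            for t in task_dist}


def alternate(skill, task_dist, target_dist, n_iters):
    """Alternate the two steps for a fixed number of iterations."""
    for _ in range(n_iters):
        skill = improve_policy(skill, task_dist)
        task_dist = adjust_task_dist(skill, task_dist, target_dist)
    return skill, task_dist


# Assumed toy setup: start mostly on the easy task, target mostly the hard one.
initial_skill = {"easy": 0.0, "hard": 0.0}
initial_dist = {"easy": 0.9, "hard": 0.1}
target_dist = {"easy": 0.1, "hard": 0.9}

final_skill, final_dist = alternate(initial_skill, initial_dist, target_dist, 50)
```

In this toy version the sampling distribution drifts from the easy-heavy start toward the known target as skill accumulates, which is the qualitative behavior the summary attributes to the method.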


