curriculum
926ffc0ca56636b9e73c565cf994ea5a-AuthorFeedback.pdf
We thank the reviewers for their valuable comments. We are glad that reviewers noted our paper as novel (R1: "idea is "Decouple the effect of capacity increase and curriculum learning": We would like to We will also move related works section as suggested. We agree that this issue is important in the field of curriculum learning. "It could be interesting to show results on the large WebVision Benchmark": "Would proposed curriculum change robustness to adversarial attacks": On average, our method requires 20% fewer epochs. ImageNet, we conducted new experiments on WebVision dataset (2.3 million training images) and obtain significant Please see the first table above.
06d5ae105ea1bea4d800bc96491876e9-AuthorFeedback.pdf
We thank all the reviewers for the constructive comments. We address the major concerns below. Reproducibility: 1) learning to draft details; 2) feature details; 3) discussions on the computing resources used. The search tree is updated based on four steps of MCTS. The learning rate is set to 0.001 with Adam.
Learning Versatile Skills with Curriculum Masking
Masked prediction has emerged as a promising pretraining paradigm in offline reinforcement learning (RL) due to its versatile masking schemes, enabling flexible inference across various downstream tasks with a unified model. Despite the versatility of masked prediction, it remains unclear how to balance the learning of skills at different levels of complexity. To address this, we propose CurrMask, a curriculum masking pretraining paradigm for sequential decision making. Motivated by how humans learn by organizing knowledge in a curriculum, CurrMask adjusts its masking scheme during pretraining for learning versatile skills. Through extensive experiments, we show that CurrMask exhibits superior zero-shot performance on skill prompting and goal-conditioned planning tasks, as well as competitive finetuning performance on offline RL tasks. Additionally, our analysis of training dynamics reveals that CurrMask gradually acquires skills of varying complexity by dynamically adjusting its masking scheme.
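To make the idea of a masking scheme that changes during pretraining more concrete, here is a minimal sketch. It is not CurrMask's actual procedure, which the abstract does not specify: the linear schedule, the function names `mask_ratio_at` and `block_mask`, and all default values are hypothetical choices made only for illustration.

```python
import random


def mask_ratio_at(step, total_steps, start=0.15, end=0.75):
    """Hypothetical linear curriculum: the mask ratio grows from start to end."""
    frac = min(max(step / total_steps, 0.0), 1.0)  # clamp progress to [0, 1]
    return start + frac * (end - start)


def block_mask(seq_len, mask_ratio, block_size, rng):
    """Return a list of booleans (True = masked) built from contiguous blocks.

    Blocks are visited in random order and filled until the target number of
    masked positions is reached, so the last block may be only partly masked.
    """
    mask = [False] * seq_len
    target = int(round(seq_len * mask_ratio))
    starts = list(range(0, seq_len, block_size))
    rng.shuffle(starts)
    masked = 0
    for s in starts:
        if masked >= target:
            break
        for i in range(s, min(s + block_size, seq_len)):
            if masked >= target:
                break
            mask[i] = True
            masked += 1
    return mask


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 50, 100):
        ratio = mask_ratio_at(step, total_steps=100)
        m = block_mask(seq_len=20, mask_ratio=ratio, block_size=4, rng=rng)
        print(step, round(ratio, 2), "".join("#" if x else "." for x in m))
```

Larger blocks and higher ratios force the model to predict longer spans, which is one plausible way to read "skills of varying complexity"; an adaptive scheme would replace the fixed schedule with one driven by training signals.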
Unsupervised Curricula for Visual Meta-Reinforcement Learning
In principle, meta-reinforcement learning algorithms leverage experience across many tasks to learn fast and effective reinforcement learning (RL) strategies. However, current meta-RL approaches rely on manually-defined distributions of training tasks, and hand-crafting these task distributions can be challenging and time-consuming. Can "useful" pre-training tasks be discovered in an unsupervised manner? We develop an unsupervised algorithm for inducing an adaptive meta-training task distribution, i.e., an automatic curriculum, by modeling unsupervised interaction in a visual environment. The task distribution is scaffolded by a parametric density model of the meta-learner's trajectory distribution. We formulate unsupervised meta-RL as information maximization between a latent task variable and the meta-learner's data distribution, and describe a practical instantiation which alternates between integration of recent experience into the task distribution and meta-learning of the updated tasks. Repeating this procedure leads to iterative reorganization such that the curriculum adapts as the meta-learner's data distribution shifts. Moreover, we show how discriminative clustering frameworks for visual representations can support trajectory-level task acquisition and exploration in domains with pixel observations, avoiding the pitfalls of alternatives. In experiments on vision-based navigation and manipulation domains, we show that the algorithm allows for unsupervised meta-learning that both transfers to downstream tasks specified by hand-crafted reward functions and serves as pre-training for more efficient meta-learning of test task distributions.
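As a rough illustration of the information-maximization formulation mentioned above, the mutual information between a latent task variable and the meta-learner's data can be written in two equivalent standard forms. The symbols $z$ (latent task variable) and $\tau$ (trajectory drawn from the meta-learner's data distribution) are notation assumed here; the abstract does not fix any notation.

```latex
\begin{align}
I(\tau; z) &= H(\tau) - H(\tau \mid z) \\
           &= H(z) - H(z \mid \tau)
\end{align}
```

Read this way, the first form favors trajectories that are diverse overall yet predictable once the task is known, while the second favors tasks that can be inferred from the trajectories they produce.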