Bandit Guided Submodular Curriculum for Adaptive Subset Selection
Neural Information Processing Systems
Traditional curriculum learning proceeds from easy to hard samples, yet defining a reliable notion of difficulty remains elusive. Prior work has used submodular functions to induce difficulty scores in curriculum learning. We reinterpret adaptive subset selection and formulate it as a multi-armed bandit problem, where each arm corresponds to a submodular function guiding sample selection. We introduce ONLINESUBMOD, a novel online greedy policy that optimizes a utility-driven reward and provably achieves no-regret performance under various sampling regimes. Empirically, ONLINESUBMOD outperforms both traditional curriculum learning and bi-level optimization approaches across vision and language datasets, showing superior accuracy-efficiency tradeoffs. More broadly, we show that validation-driven reward metrics offer a principled way to guide the curriculum schedule. Our code is publicly available on GitHub.
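The bandit framing in the abstract can be illustrated with a minimal sketch. This is not ONLINESUBMOD itself: it assumes a plain epsilon-greedy policy, two textbook monotone submodular functions (facility location and a square-root coverage function) as arms, and a caller-supplied `reward_fn` standing in for the validation-driven utility. All function names, the similarity matrix `sim`, and the parameters `budget`, `steps`, and `epsilon` are hypothetical choices made for illustration.

```python
import math
import random


def facility_location(selected, sim):
    """f(S) = sum_i max_{j in S} sim[i][j]; monotone submodular for sim >= 0."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sim)


def sqrt_coverage(selected, sim):
    """f(S) = sum_i sqrt(sum_{j in S} sim[i][j]); concave-over-modular, so submodular."""
    return sum(math.sqrt(sum(row[j] for j in selected)) for row in sim)


def greedy_select(f, sim, budget):
    """Standard greedy maximization: repeatedly add the item with the largest marginal gain."""
    selected = []
    remaining = set(range(len(sim)))
    for _ in range(budget):
        base = f(selected, sim)
        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=lambda j: f(selected + [j], sim) - base)
        selected.append(best)
        remaining.remove(best)
    return selected


def run_bandit(arms, sim, budget, reward_fn, steps, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit: each arm is a submodular function used to pick a subset.

    reward_fn(subset) is a placeholder (assumption) for a validation-based utility,
    e.g. the validation-loss improvement after training on the chosen subset.
    """
    rng = random.Random(seed)
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    for t in range(steps):
        if t < len(arms):
            arm = t  # play every arm once before exploiting
        elif rng.random() < epsilon:
            arm = rng.randrange(len(arms))  # explore
        else:
            arm = max(range(len(arms)), key=lambda i: means[i])  # exploit
        subset = greedy_select(arms[arm], sim, budget)
        reward = reward_fn(subset)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running mean update
    return counts, means
```

In a training loop, `sim` would typically be recomputed per mini-batch from sample embeddings or gradients, and `reward_fn` would measure how much the selected subset helps on held-out data; both of those choices are assumptions here and are not taken from the paper.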
Jun-21-2026, 09:17:36 GMT
- Country:
- Asia (0.27)
- Genre:
- Overview (0.92)
- Instructional Material > Course Syllabus & Notes (0.64)
- Research Report
- Experimental Study (1.00)
- New Finding (0.67)
- Industry:
- Education > Curriculum > Subject-Specific Education (0.67)
- Technology:
- Information Technology
- Data Science > Data Mining
- Big Data (0.88)
- Artificial Intelligence
- Vision (1.00)
- Natural Language (1.00)
- Representation & Reasoning > Optimization (0.65)
- Machine Learning
- Statistical Learning (0.93)
- Neural Networks > Deep Learning (0.46)