Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification

Dec-24-2025, 15:21:47 GMT–Neural Information Processing Systems

In linear bandits, how can a learner learn effectively when rewards are corrupted? While significant work has explored this question, a holistic understanding across different adversarial models and corruption measures is lacking, as is a full characterization of the minimax regret bounds. In this work, we compare two commonly considered types of corruption: strong corruption, where the corruption level depends on the learner's chosen action, and weak corruption, where it does not. We provide a unified framework to analyze both. For stochastic linear bandits, we fully characterize the gap between the minimax regret under strong and weak corruption.
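The distinction between the two corruption types can be illustrated with a small sketch. The abstract does not give formal definitions, so the formulas below are assumptions chosen for illustration: the action-dependent (strong) level sums the absolute corruption on the action the learner actually chose, and the action-independent (weak) level sums the largest absolute corruption over all actions in each round. The function name `corruption_levels` and the toy data are hypothetical.

```python
def corruption_levels(corruptions, chosen):
    """Return (action_dependent, action_independent) corruption levels.

    corruptions[t][a] is the corruption added to the reward of action a
    in round t; chosen[t] is the action the learner played in round t.
    Both formulas are illustrative assumptions, not taken from the paper.
    """
    # Depends on the learner's choices: only the played action counts.
    action_dependent = sum(abs(row[a]) for row, a in zip(corruptions, chosen))
    # Independent of the learner's choices: worst case over actions each round.
    action_independent = sum(max(abs(c) for c in row) for row in corruptions)
    return action_dependent, action_independent


# Toy example: 3 rounds, 2 actions.
corruptions = [[0.0, 2.0], [1.0, 0.0], [0.0, 0.0]]
chosen = [0, 0, 1]
dependent, independent = corruption_levels(corruptions, chosen)
print(dependent, independent)
```

Under these assumed definitions the action-dependent level can never exceed the action-independent one, since the corruption on the chosen action is at most the per-round maximum.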

artificial intelligence, gap-dependent misspecification, machine learning

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Search (0.88)
  - Machine Learning (0.76)