Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification

Neural Information Processing Systems 

In linear bandits, how can a learner effectively learn when facing corrupted rewards?
