2 Preliminaries
3 The Technical Workhorses
3.2 A Volumetric Lemma
4 Warmup with Linear Classification
4.1 Smoothed classification via the Perceptron algorithm
5 Beyond the Linear Case

Neural Information Processing Systems 

In this section, we apply Theorem 12 and the approach of Foster and Rakhlin [2020] to contextual bandits with contexts drawn from a smooth distribution, the setting considered in Block et al. [2022]. Unlike that work, however, we obtain regret bounds, achievable by an oracle-efficient algorithm, that are polynomially improved in both the horizon and the number of actions in the special case of noiseless, piecewise linear rewards.
