An Efficient Bandit Algorithm for Online Multiclass Prediction
Neural Information Processing Systems
We present an efficient algorithm for the problem of online multiclass prediction with bandit feedback in the fully adversarial setting. We measure its regret with respect to the log-loss defined in [AR09], which is parameterized by a scalar α.
Mar-15-2024, 14:57:11 GMT