A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback

Aug-14-2025, 17:12:30 GMT–Neural Information Processing Systems

We present a modified tuning of the algorithm of Zimmert and Seldin [2020] for adversarial multi-armed bandits with delayed feedback, which, in addition to the minimax optimal adversarial regret guarantee shown by Zimmert and Seldin, simultaneously achieves a near-optimal regret guarantee in the stochastic setting with fixed delays.

algorithm, bandit, regularizer

Country:
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
