A Best-of-Both-Worlds Proof for Tsallis-INF without Fenchel Conjugates
Wei-Cheng Lee, Francesco Orabona
The multi-armed bandit problem is a classic framework for sequential decision-making under uncertainty, capturing the fundamental trade-off between exploration and exploitation. The problem is typically studied in two settings: the stochastic setting, where the losses of each arm are drawn i.i.d. from a fixed distribution, and the adversarial setting, where the sequence of losses is arbitrary and potentially adaptive. While many algorithms are optimal for one setting or the other, a significant challenge has been to design a single algorithm that performs well in both without prior knowledge of the environment; such an algorithm is said to have a "best-of-both-worlds" guarantee. With an appropriate choice of regularizer, Follow-The-Regularized-Leader (FTRL) [Gordon, 1999a,b, Shalev-Shwartz and Singer, 2006a,b, Abernethy et al., 2008, Hazan and Kale, 2008] achieves low regret in the adversarial setting while adapting to the easier stochastic setting, where it attains near-optimal, logarithmically-scaling regret, that is, a best-of-both-worlds guarantee. Zimmert and Seldin [2021] showed that the Tsallis-INF algorithm [Audibert and Bubeck, 2009], an instance of FTRL with the Tsallis entropy as regularizer, has this property. The aim of this note is to provide a concise, self-contained proof of this guarantee for Tsallis-INF, relying only on standard and modern techniques from online convex optimization, chiefly the FTRL regret lemma and its local-norm analysis.
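To make the algorithm concrete, below is a minimal sketch of Tsallis-INF with the 1/2-Tsallis regularizer. The learning-rate schedule eta_t = 1/sqrt(t), the bisection routine for the normalization constant, and the function names (`tsallis_inf_weights`, `run_tsallis_inf`) are illustrative assumptions, not the paper's exact constants or code; the closed form of the FTRL step follows from first-order optimality on the probability simplex.

```python
import numpy as np

rng = np.random.default_rng(0)

def tsallis_inf_weights(L_hat, eta, iters=100):
    """One FTRL step with the 1/2-Tsallis regularizer -(1/eta) * sum_i sqrt(w_i).

    Minimizing <L_hat, w> - (1/eta) * sum_i sqrt(w_i) over the simplex gives
    w_i = 1 / (2 * eta * (L_hat_i - nu))^2 for a normalization constant
    nu < min_i L_hat_i, which we locate by bisection.
    """
    lo = L_hat.min() - np.sqrt(len(L_hat)) / (2.0 * eta)  # here sum(w) <= 1
    hi = L_hat.min() - 1e-12                              # here sum(w) blows up
    for _ in range(iters):
        nu = 0.5 * (lo + hi)
        w = 1.0 / (2.0 * eta * (L_hat - nu)) ** 2
        if w.sum() > 1.0:
            hi = nu
        else:
            lo = nu
    return w / w.sum()  # renormalize for numerical safety

def run_tsallis_inf(loss_matrix):
    """Play Tsallis-INF on a T x K matrix of losses in [0, 1]; return total loss."""
    T, K = loss_matrix.shape
    L_hat = np.zeros(K)          # cumulative importance-weighted loss estimates
    total = 0.0
    for t in range(1, T + 1):
        eta = 1.0 / np.sqrt(t)   # assumed schedule; the paper's constant may differ
        w = tsallis_inf_weights(L_hat, eta)
        arm = rng.choice(K, p=w)
        loss = loss_matrix[t - 1, arm]
        total += loss
        L_hat[arm] += loss / w[arm]  # unbiased importance-weighted estimator
    return total

# Toy stochastic instance: two Bernoulli arms with mean losses 0.5 and 0.4.
T, K = 20000, 2
losses = rng.binomial(1, [0.5, 0.4], size=(T, K)).astype(float)
print("pseudo-regret vs. best arm:", run_tsallis_inf(losses) - losses.sum(axis=0).min())
```

On a stochastic instance like the Bernoulli toy above, the pseudo-regret of this scheme should grow logarithmically in T, while the same code retains the sqrt(KT)-type worst-case guarantee against adversarial loss sequences, which is the best-of-both-worlds behavior the note proves.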
Nov-17-2025