On the Peril of (Even a Little) Nonstationarity in Satisficing Regret Minimization

Zhang, Yixuan, Zhu, Ruihao, Xie, Qiaomin

Mar-20-2026, arXiv.org Machine Learning

Motivated by the principle of satisficing in decision-making, we study satisficing regret guarantees for nonstationary $K$-armed bandits. We show that in the general realizable, piecewise-stationary setting with $L$ stationary segments, the optimal regret is $\Theta(L\log T)$ as long as $L\geq 2$. This stands in sharp contrast to the case of $L=1$ (i.e., the stationary setting), where a $T$-independent $\Theta(1)$ satisficing regret is achievable under realizability. In other words, the optimal regret has to scale with $T$ even if just a little nonstationarity is present. A key ingredient in our analysis is a novel Fano-based framework tailored to nonstationary bandits via a \emph{post-interaction reference} construction. This framework strictly extends the classical Fano method for passive estimation as well as recent interactive Fano techniques for stationary bandits. As a complement, we also discuss a special regime in which constant satisficing regret is again possible.
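
The abstract does not spell out the regret notion, so the following is only a minimal sketch under an assumed definition that is common in the satisficing bandit literature: with a threshold $S$, a round incurs loss $\max(S - \mu_t(a_t), 0)$, and realizability means every stationary segment has at least one arm whose mean reaches $S$. The function names, the threshold parameter, and the toy numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def is_realizable(segment_means: Sequence[Sequence[float]], threshold: float) -> bool:
    """Assumed realizability check: each segment has an arm with mean >= threshold."""
    return all(max(means) >= threshold for means in segment_means)


def satisficing_regret(
    segment_means: Sequence[Sequence[float]],
    segment_lengths: Sequence[int],
    actions: Sequence[int],
    threshold: float,
) -> float:
    """Sum over rounds of max(threshold - mean of the played arm, 0).

    segment_means[i] holds the arm means during the i-th stationary segment,
    which lasts segment_lengths[i] rounds (a piecewise-stationary instance).
    """
    # Expand the per-segment means into one mean vector per round.
    per_round: List[Sequence[float]] = []
    for means, length in zip(segment_means, segment_lengths):
        per_round.extend([means] * length)
    if len(per_round) != len(actions):
        raise ValueError("number of actions must equal the total horizon")
    return sum(
        max(threshold - per_round[t][arm], 0.0) for t, arm in enumerate(actions)
    )


# Toy instance with L = 2 segments and K = 2 arms (illustrative numbers only).
toy_means = [[0.9, 0.2], [0.1, 0.8]]
toy_lengths = [3, 3]
# A learner that notices the change only in the last round of segment 2.
late_switch = [0, 0, 0, 0, 0, 1]
toy_regret = satisficing_regret(toy_means, toy_lengths, late_switch, threshold=0.7)
```

In this toy run the only loss comes from the two rounds after the change point in which the learner keeps playing the formerly satisficing arm, which is the kind of detection cost that the $L \geq 2$ lower bound concerns.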

bandit, data mining, machine learning

Country:
- North America > United States
  - Wisconsin > Dane County > Madison (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.47)
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Representation & Reasoning (0.68)
