Optimal Regret of Bandits under Differential Privacy

Jun-13-2026, 17:15:24 GMT – Neural Information Processing Systems

As sequential learning algorithms are increasingly applied to real-life problems, ensuring data privacy while maintaining their utility emerges as a timely question. In this context, regret minimisation in stochastic bandits under $\epsilon$-global Differential Privacy (DP) has been widely studied. The existing literature leaves a significant gap between the best-known regret lower and upper bounds in this setting, even though they "match in order". We therefore revisit the regret lower and upper bounds of $\epsilon$-global DP bandits and improve both. First, we prove a tighter regret lower bound involving a novel information-theoretic quantity characterising the hardness of $\epsilon$-global DP in stochastic bandits.

artificial intelligence, machine learning, proceedings

Industry:
- Information Technology > Security & Privacy (0.59)

Technology:
- Information Technology
  - Security & Privacy (0.59)
  - Artificial Intelligence > Machine Learning (0.39)