Optimal Regret of Bandits under Differential Privacy
–Neural Information Processing Systems
As sequential learning algorithms are increasingly applied to real life, ensuring data privacy while maintaining their utilities emerges as a timely question. In this context, regret minimisation in stochastic bandits under ϵ-global Differential Privacy (DP) has been widely studied. The present literature poses a significant gap between the best-known regret lower and upper bound in this setting, though they "match in order". Thus, we revisit the regret lower and upper bounds of ϵ-global DP bandits and improve both. First, we prove a tighter regret lower bound involving a novel information-theoretic quantity characterising the hardness of ϵ-global DP in stochastic bandits.
Neural Information Processing Systems
Jun-21-2026, 09:28:18 GMT
- Country:
- Asia (0.28)
- Genre:
- Research Report > Experimental Study (1.00)
- Industry:
- Technology:
- Information Technology
- Security & Privacy (1.00)
- Data Science > Data Mining (0.93)
- Communications (0.93)
- Artificial Intelligence
- Representation & Reasoning (1.00)
- Machine Learning (1.00)
- Natural Language (0.93)
- Information Technology