Exploration-free Algorithms for Multi-group Mean Estimation

Ziyi Wei, Huaiyang Zhong, Xiaocheng Li

arXiv.org Machine Learning 

We study the problem of multi-group mean estimation, where the task is to allocate a limited sampling budget across multiple groups in order to estimate their means uniformly well. This problem arises naturally in polling, survey design, marketing, and other settings where representative estimates across diverse groups are required. A key feature distinguishing this setting from classical reward-maximization bandits is that the optimal allocation samples every arm (group) on the order of Θ(T) times, where T is the sampling budget, rather than concentrating samples on the best arm. This structural property suggests that explicit exploration phases are unnecessary and opens the door to exploration-free algorithms. Contextual information makes the problem even more relevant in real-world applications such as healthcare (Bastani and Bayati, 2020; Du et al., 2024), recommendation systems (Agarwal et al., 2009; Li et al., 2010), and dynamic pricing (Qiang and Bayati, 2016; Ban and Keskin, 2021), where side information fundamentally shapes the reward distributions and motivates the estimation of context-dependent group parameters. Accurate estimation in this richer setting is crucial for interpretable personalization, robust policy design, and fairness considerations.
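To make the Θ(T) claim concrete, here is a minimal sketch under assumptions that are not taken from the abstract: "uniformly well" is read as minimizing the largest variance of the group sample means, max_k σ_k²/n_k, and the group variances σ_k² are treated as known. Under that reading, the budget is split in proportion to the variances, so every group receives a fixed fraction of T. The function name `oracle_allocation` and the example numbers are hypothetical and serve only as illustration; this is not the authors' algorithm.

```python
def oracle_allocation(variances, budget):
    """Split `budget` samples across groups in proportion to their variances.

    Assumption (not from the source): the objective is to minimize
    max_k variances[k] / n_k subject to sum_k n_k = budget. The minimizer
    equalizes variances[k] / n_k, which gives n_k proportional to variances[k].
    Fractional sample counts are returned; rounding is left to the caller.
    """
    total = sum(variances)
    return [budget * v / total for v in variances]


if __name__ == "__main__":
    variances = [1.0, 4.0, 9.0]  # hypothetical group variances
    for budget in (1400, 14000):
        shares = oracle_allocation(variances, budget)
        # Each group's share grows linearly with the budget.
        print(budget, shares)
```

In this toy setting even the lowest-variance group keeps a constant fraction (1/14) of the budget, in contrast to reward-maximization bandits, where suboptimal arms are typically pulled only a vanishing fraction of the time.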