Federated Multi-armed Bandits with Efficient Bit-Level Communications
–Neural Information Processing Systems
In this work, we study the federated multi-armed bandit (FMAB) problem, where a set of agents collaboratively aim to minimize cumulative regret. Unlike traditional centralized bandit models, agents in FMAB settings are connected via a communication graph and cannot share data freely due to bandwidth limitations or privacy constraints. This raises a fundamental challenge: how to achieve optimal learning performance under stringent communication budgets. We propose a novel communication-efficient algorithm containing two points: one for eliminating suboptimal arms through early and frequent communication of key decisions, and the other for refining global estimates using incremental epoch, quantized, and differentially transmitted statistics. Incremental Epoch-based Successive Elimination Algorithm (EpoInc-SE) is presented by carefully balancing communication frequency and precision of global estimates. Theoretically, we derive tight upper bounds on both individual cumulative regret and group regret, and prove that our method asymptotically matches the lower bound of regret in federated settings.
Neural Information Processing Systems
Jun-23-2026, 00:15:38 GMT
- Genre:
- Research Report > Experimental Study (1.00)
- Industry:
- Information Technology (0.67)
- Education (0.46)
- Technology: