Appendix A Lower Bound In this section, we establish a lower bound on the expected regret of any algorithm for our multi-agent

Neural Information Processing Systems 

Our goal in this section is twofold. To avoid excessive repetition of notation and proof arguments, we purposefully leave this section not self-contained and only outline these adjustments needed. We refer an interested reader to the work of Auer et al. We leave it to future work to optimize the dependence on N and K . In our MA-MAB problem, an algorithm is allowed to "pull" a distribution over the arms In the proof of their Lemma A.1, in the explanation of their Equation (30), they cite the assumption that given the rewards observed in the first Finally, in the proof of their Theorem A.2, they again consider the probability Thus, we have the following lower bound.