Multi-player Multi-Armed Bandits with non-zero rewards on collisions for uncoordinated spectrum access
Magesh, Akshayaa; Veeravalli, Venugopal V.
ABSTRACT
In this paper, we study the uncoordinated spectrum access problem using the multi-player multi-armed bandits framework. We consider a model where there is no central control and the users cannot communicate with each other. The environment may appear differently to different users, i.e., the mean rewards as seen by different users for a particular channel may be different. Additionally, in case of a collision, we allow for the colliding users to receive nonzero rewards.

Index Terms-- multi-armed bandits, uncoordinated spectrum access

1. INTRODUCTION
Multi-player multi-armed bandit models have been widely used to study the spectrum access problem [1-9], where there are multiple users vying for a set of channels.
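To make the setting in the abstract concrete, the following is a minimal sketch of one round of play, not the authors' algorithm or their exact reward model. It assumes Bernoulli rewards, and it assumes that a collision scales each colliding user's mean reward by a hypothetical `collision_factor`; the abstract only states that colliding users may receive nonzero rewards, not how those rewards are generated. The function name `simulate_round` and all parameter names are illustrative.

```python
import random


def simulate_round(mean_rewards, choices, collision_factor=0.5, rng=None):
    """Draw one round of rewards for all users.

    mean_rewards[k][m]: mean reward user k sees on channel m (may differ per user).
    choices[k]: channel selected by user k in this round.
    collision_factor: ASSUMPTION, not from the paper. When two or more users
        pick the same channel, each colliding user's mean is scaled by this
        factor, so collisions can still yield nonzero rewards.
    """
    rng = rng or random.Random()

    # Count how many users selected each channel.
    counts = {}
    for channel in choices:
        counts[channel] = counts.get(channel, 0) + 1

    rewards = []
    for user, channel in enumerate(choices):
        mean = mean_rewards[user][channel]
        if counts[channel] > 1:
            # Collision: reduce the mean instead of zeroing the reward.
            mean *= collision_factor
        # ASSUMPTION: Bernoulli reward with the (possibly reduced) mean.
        rewards.append(1.0 if rng.random() < mean else 0.0)
    return rewards


if __name__ == "__main__":
    # Two users, two channels; each user sees different means per channel.
    mu = [[0.9, 0.4],
          [0.3, 0.8]]
    print(simulate_round(mu, [0, 1]))  # no collision
    print(simulate_round(mu, [0, 0]))  # both users collide on channel 0
```

Each user observes only its own entry of the returned list, which reflects the absence of central control and communication described above. Setting `collision_factor=0.0` recovers the more common zero-reward-on-collision model.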
Oct-20-2019