Multi-player Multi-Armed Bandits with non-zero rewards on collisions for uncoordinated spectrum access

Magesh, Akshayaa, Veeravalli, Venugopal V.

Oct-20-2019–arXiv.org Machine Learning

ABSTRACT

In this paper, we study the uncoordinated spectrum access problem using the multi-player multi-armed bandits framework. We consider a model where there is no central control and the users cannot communicate with each other. The environment may appear differently to different users, i.e., the mean rewards as seen by different users for a particular channel may be different. Additionally, in case of a collision, we allow for the colliding users to receive nonzero rewards.

Index Terms-- multi-armed bandits, uncoordinated spectrum access

1. INTRODUCTION

Multi-player multi-armed bandit models have been widely used to study the spectrum access problem [1-9], where there are multiple users vying for a set of channels.
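The setting described in the abstract can be illustrated with a small simulation of a single round. This is only a sketch under stated assumptions, not the authors' model: the excerpt does not say how rewards are drawn or how a collision changes them, so the Bernoulli rewards, the function name `play_round`, and the `collision_factor` scaling are all hypothetical choices made for illustration.

```python
import random


def play_round(choices, mean_rewards, collision_factor=0.5, rng=random):
    """Simulate one round and return one 0/1 reward per user.

    choices[i]         : index of the channel picked by user i
    mean_rewards[i][k] : mean reward of channel k as seen by user i
                         (may differ across users, as in the abstract)
    collision_factor   : ASSUMPTION, not from the paper; the mean is
                         scaled by this factor when users collide
    """
    # Count how many users picked each channel to detect collisions.
    counts = {}
    for k in choices:
        counts[k] = counts.get(k, 0) + 1

    rewards = []
    for i, k in enumerate(choices):
        mean = mean_rewards[i][k]
        if counts[k] > 1:
            # Colliding users still get a (possibly) nonzero reward.
            mean *= collision_factor
        # Bernoulli draw (hypothetical reward distribution).
        rewards.append(1 if rng.random() < mean else 0)
    return rewards


if __name__ == "__main__":
    # Two users, two channels; each user sees different channel means.
    means = [[0.9, 0.2],
             [0.4, 0.8]]
    print(play_round([0, 0], means))  # both pick channel 0: a collision
    print(play_round([0, 1], means))  # distinct channels: no collision
```

Setting `collision_factor=0` recovers the more common assumption that colliding users receive nothing, which makes the difference from the zero-reward collision model easy to see.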

action profile, algorithm, mean reward, (14 more...)


Country:
- North America > United States > Illinois (0.04)

Genre:
- Research Report (0.40)

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning (0.95)
  - Data Science > Data Mining
    - Big Data (1.00)
