Restless-UCB, an Efficient and Low-complexity Algorithm for Online Restless Bandits
Siwei Wang

Jan-26-2025, 10:06:27 GMT–Neural Information Processing Systems

We study the online restless bandit problem, where the state of each arm evolves according to a Markov chain, and the reward of pulling an arm depends on both the pulled arm and the current state of the corresponding Markov chain. In this paper, we propose Restless-UCB, a learning policy that follows the explore-then-commit framework. In Restless-UCB, we present a novel method to construct offline instances, which requires only O(N) time complexity (where N is the number of arms) and is exponentially better than the complexity of existing learning policies.
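To make the setting concrete, here is a minimal sketch of a restless bandit environment and a generic explore-then-commit loop. It is not the Restless-UCB policy itself: the class `RestlessArm`, the function `explore_then_commit`, and the rule of committing to the single arm with the highest empirical mean reward are all assumptions made for illustration, and the paper's construction of offline instances is not reproduced here.

```python
import random


class RestlessArm:
    """Hypothetical arm whose state follows a finite Markov chain."""

    def __init__(self, transition, rewards, rng):
        self.transition = transition  # transition[s][s2] = P(next = s2 | current = s)
        self.rewards = rewards        # rewards[s] = reward when pulled in state s
        self.rng = rng
        self.state = 0

    def step(self):
        # "Restless": the state evolves every round, pulled or not.
        probs = self.transition[self.state]
        self.state = self.rng.choices(range(len(probs)), weights=probs)[0]


def explore_then_commit(arms, explore_rounds_per_arm, horizon):
    """Pull each arm a fixed number of times, then commit to one arm.

    Simplifying assumption: the commit phase plays the arm with the
    highest empirical mean reward, not an offline restless-bandit policy.
    """
    n = len(arms)
    totals = [0.0] * n
    history = []

    def pull(i):
        reward = arms[i].rewards[arms[i].state]
        for arm in arms:  # every chain advances, including unpulled arms
            arm.step()
        history.append(i)
        return reward

    # Exploration phase: one pass over the N arms.
    for i in range(n):
        for _ in range(explore_rounds_per_arm):
            totals[i] += pull(i)

    best = max(range(n), key=lambda i: totals[i] / explore_rounds_per_arm)

    # Commit phase: play the chosen arm for the remaining rounds.
    for _ in range(horizon - n * explore_rounds_per_arm):
        pull(best)

    return best, history
```

A possible usage, with two hypothetical two-state arms: build each arm with its own transition matrix and per-state rewards, share one seeded `random.Random`, and call `explore_then_commit(arms, explore_rounds_per_arm=20, horizon=200)`. The returned `history` lists which arm was pulled in each round.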

artificial intelligence, data mining, machine learning

Country:
- Asia (0.28)
- North America (0.28)

Genre:
- Research Report > Promising Solution (0.48)

Industry:
- Health & Medicine > Pharmaceuticals & Biotechnology (0.84)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning > Learning Graphical Models
      - Undirected Networks > Markov Models (0.70)
    - Representation & Reasoning (1.00)
  - Communications > Networks (0.93)
  - Data Science > Data Mining
    - Big Data (0.50)
