Low-Complexity Algorithm for Restless Bandits with Imperfect Observations

Liu, Keqin, Weber, Richard, Wu, Ting, Zhang, Chengzhong

arXiv.org Artificial Intelligence 

We consider a class of restless bandit problems with broad applications in stochastic optimization, reinforcement learning, and operations research. The model comprises $N$ independent discrete-time Markov processes, each of which has two possible states: 1 and 0 (`good' and `bad'). Reward accrues only if a process is both in state 1 and observed to be so. The aim is to maximize the expected discounted sum of returns over the infinite horizon, subject to the constraint that only $M$ $(
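Because the processes are observed imperfectly, a policy must act on a belief (the conditional probability that a process is in state 1) rather than on the true state. The following is a minimal sketch of the standard belief recursion for one such two-state process; the transition probabilities `p01`, `p11` and the observation error rates `eps0`, `eps1` are illustrative assumptions, not values from the paper.

```python
def predict(omega, p01, p11):
    """One-step prior: P(state = 1 at t+1) given belief omega at t.

    p01 = P(1 | previous state 0), p11 = P(1 | previous state 1);
    both are assumed parameters for illustration.
    """
    return omega * p11 + (1.0 - omega) * p01


def bayes_update(omega, obs, eps0, eps1):
    """Posterior P(state = 1) after an imperfect observation.

    eps1 = P(observe 0 | true state 1) -- missed detection,
    eps0 = P(observe 1 | true state 0) -- false alarm.
    """
    if obs == 1:
        num = omega * (1.0 - eps1)
        den = num + (1.0 - omega) * eps0
    else:
        num = omega * eps1
        den = num + (1.0 - omega) * (1.0 - eps0)
    return num / den
```

Under perfect observations (`eps0 = eps1 = 0`) the posterior collapses to 0 or 1, recovering the fully observed restless bandit as a special case.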
