From Restless to Contextual: A Thresholding Bandit Approach to Improve Finite-horizon Performance

Xu, Jiamin, Nazarov, Ivan, Rastogi, Aditya, Periáñez, África, Gan, Kyra

Feb-7-2025–arXiv.org Artificial Intelligence

Online restless bandits extend classic contextual bandits by incorporating state transitions and budget constraints, representing each agent as a Markov Decision Process (MDP). This framework is crucial for finite-horizon strategic resource allocation, optimizing limited costly interventions for long-term benefits. However, learning the underlying MDP for each agent poses a major challenge in finite-horizon settings. To facilitate learning, we reformulate the problem as a scalable budgeted thresholding contextual bandit problem, carefully integrating the state transitions into the reward design and focusing on identifying agents with action benefits exceeding a threshold. We establish the optimality of an oracle greedy solution in a simple two-state setting, and propose an algorithm that achieves minimax optimal constant regret in the online multi-state setting with heterogeneous agents and knowledge of outcomes under no intervention. We numerically show that our algorithm outperforms existing online restless bandit methods, offering significant improvements in finite-horizon performance.

artificial intelligence, data mining, machine learning, (15 more...)

arXiv.org Artificial Intelligence

Feb-7-2025

arXiv.org PDF

Add feedback

Country:
- Africa (0.04)
- North America > United States
  - New York > Tompkins County > Ithaca (0.04)
- Europe > Spain
  - Catalonia > Barcelona Province > Barcelona (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report > New Finding (0.67)

Industry:
- Health & Medicine > Health Care Providers & Services (0.45)

Technology:
- Information Technology
  - Data Science > Data Mining (1.00)
  - Artificial Intelligence
    - Representation & Reasoning > Agents (1.00)
    - Machine Learning > Learning Graphical Models
      - Undirected Networks > Markov Models (0.34)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found