ID policy (with reassignment) is asymptotically optimal for heterogeneous weakly-coupled MDPs

Zhang, Xiangcheng, Hong, Yige, Wang, Weina

Feb-9-2025–arXiv.org Artificial Intelligence

Heterogeneity poses a fundamental challenge for many real-world large-scale decision-making problems but remains largely understudied. In this paper, we study the fully heterogeneous setting of a prominent class of such problems, known as weakly-coupled Markov decision processes (WCMDPs). Each WCMDP consists of $N$ arms (or subproblems), which have distinct model parameters in the fully heterogeneous setting, leading to the curse of dimensionality when $N$ is large. We show that, under mild assumptions, a natural adaptation of the ID policy, although originally proposed for a homogeneous special case of WCMDPs, in fact achieves an $O(1/\sqrt{N})$ optimality gap in long-run average reward per arm for fully heterogeneous WCMDPs as $N$ becomes large. This is the first asymptotic optimality result for fully heterogeneous average-reward WCMDPs. Our techniques highlight the construction of a novel projection-based Lyapunov function, which witnesses the convergence of rewards and costs to an optimal region in the presence of heterogeneity.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

Feb-9-2025

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - Massachusetts (0.04)
  - Pennsylvania > Allegheny County
    - Pittsburgh (0.04)
  - New York > New York County
    - New York City (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Netherlands > North Holland
    - Amsterdam (0.04)

Genre:
- Overview (0.92)

Industry:
- Health & Medicine (0.67)

Technology:
- Information Technology
  - Data Science > Data Mining (1.00)
  - Artificial Intelligence
    - Representation & Reasoning > Optimization (0.46)
    - Machine Learning > Learning Graphical Models
      - Undirected Networks > Markov Models (0.34)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found