Optimal Control of Fluid Restless Multi-armed Bandits: A Machine Learning Approach

Bertsimas, Dimitris, Kim, Cheol Woo, Niño-Mora, José

arXiv.org Artificial Intelligence 

We propose a machine learning approach to the optimal control of fluid restless multi-armed bandits (FRMABs) with state equations that are either affine or quadratic in the state variables. By deriving fundamental properties of FRMAB problems, we design an efficient machine-learning-based algorithm. Using this algorithm, we solve multiple instances with varying initial states to generate a comprehensive training set. We then learn a state feedback policy using Optimal Classification Trees with hyperplane splits (OCT-H). We test our approach on machine maintenance, epidemic control, and fisheries control problems. Our method yields high-quality state feedback policies and achieves a speed-up of up to 26 million times compared to a direct numerical algorithm for fluid problems.
