Optimal Data Driven Resource Allocation under Multi-Armed Bandit Observations
Burnetas, Apostolos N., Kanavetas, Odysseas, Katehakis, Michael N.
Consider the problem of sequentially activating one of a finite number of independent bandits, where each activation of a bandit incurs a number of bandit-dependent resource utilizations, or activation costs. For each resource type, a constraint ensures that the total resource utilized (or equivalently, the cost incurred) at any time does not exceed the current resource availability (budget). It is assumed that following each activation, any unused resource amounts can be carried forward for use in future activations. We also assume that successive activations of each bandit yield independent and identically distributed (iid) random rewards, independent across bandits, with positive means and distributions that depend on unknown parameters. The objective is to obtain a feasible policy that asymptotically maximizes the total expected reward, or equivalently, asymptotically minimizes a regret function. We develop a class of feasible policies that are shown to be asymptotically optimal within the large class of uniformly fast (UF) convergent policies, in the sense of Burnetas and Katehakis (1996) and Lai and Robbins (1985). The results in this paper extend the work in Burnetas et al. (2017), which solved the case where there exists only one type of constraint for all bandits.
Dec-13-2018