An Asymptotically Optimal Algorithm for the Convex Hull Membership Problem
Multi-armed bandit (MAB) problem has been increasingly recognized as a fundamental and powerful framework in decision theory, where an agent's objective is to make a series of decisions to pull an arm of a K slot machine in order to maximize the total reward. Each of the arms is associated with a fixed but unknown probability distribution [5, 31]. An enormous literature has accumulated over the past decades on the MAB problem, such as clinical trials and drug testing [6, 19], recommendation system and online advertising [7, 9, 42, 51, 56], information retrieval [8, 37], and finance [24, 39, 40, 48]. From a theoretical perspective, the MAB problem was first studied in the seminal work [44] and followed by a vast line of work to study in regret minimization [2, 4, 5, 10, 14, 18, 32, 34, 49, 53] and pure exploration [11, 22, 36, 46]. In this paper, we investigate the convex hull membership (CHM) problem in a pure exploration setting, where a learner sequentially performs experiments in a stochastic multi-armed bandit environment to identify if a given point lies in the convex hull of means of K arms as efficiently and accurately as possible. In particular, we are interested in the minimum expected number of samples required to identify the convex hull membership with high probability (at least 1 δ), without considering a reward/cost structure. The non-stochastic version of the convex hull membership problem is well studied in [20] and has attracted significant attention in different scientific areas and proven its crucial applications in image processing [25, 55], robot motion planning [33, 50] and pattern recognition [27, 45].
Sep-26-2023
- Genre:
- Research Report > New Finding (0.34)
- Industry:
- Information Technology > Security & Privacy (0.46)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning (1.00)
- Representation & Reasoning (1.00)
- Robots (1.00)
- Data Science > Data Mining
- Big Data (0.87)
- Artificial Intelligence
- Information Technology