Combinatorial Pure Exploration of Multi-Armed Bandits

Shouyuan Chen, Tian Lin, Irwin King, Michael R. Lyu, Wei Chen

Neural Information Processing Systems 

The multi-armed bandit (MAB) problem is a predominant model for characterizing the tradeoff between exploration and exploitation in decision-making problems. Although this tradeoff is intrinsic in many tasks, some application domains prefer a dedicated exploration procedure, in which the goal is to identify an optimal object among a collection of candidates and the reward or loss incurred during exploration is irrelevant. In light of these applications, the related learning problem, called pure exploration in MABs, has received much attention. Recent advances in pure exploration of MABs have found potential applications in many domains, including crowdsourcing, communication networks, and online advertising. In many of these application domains, a recurring problem is to identify the optimal object with a certain combinatorial structure.
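To make the pure-exploration setting concrete, the following is a minimal sketch (not the paper's algorithm) of fixed-budget best-arm identification with uniform sampling: the learner pulls Bernoulli arms round-robin for a fixed budget, ignores the rewards it accumulates, and recommends the arm with the highest empirical mean. The function and parameter names here are illustrative assumptions.

```python
import random


def identify_best_arm(means, budget, seed=0):
    """Fixed-budget pure exploration by uniform sampling (illustrative sketch).

    Rewards collected during exploration are irrelevant; only the quality
    of the final recommendation matters.
    """
    rng = random.Random(seed)
    n = len(means)
    counts = [0] * n
    sums = [0.0] * n
    for t in range(budget):
        arm = t % n  # round-robin: each arm gets budget / n pulls
        reward = 1.0 if rng.random() < means[arm] else 0.0  # Bernoulli draw
        counts[arm] += 1
        sums[arm] += reward
    # Recommend the arm with the highest empirical mean.
    empirical = [sums[i] / counts[i] for i in range(n)]
    return max(range(n), key=lambda i: empirical[i])
```

With well-separated means and a moderate budget, the recommendation is correct with high probability; the combinatorial version studied in the paper generalizes the recommendation from a single arm to a structured subset of arms (e.g., a matching or spanning tree).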