The Hardness Analysis of Thompson Sampling for Combinatorial Semi-bandits with Greedy Oracle

Dec-25-2025, 02:00:28 GMT–Neural Information Processing Systems

Thompson sampling (TS) has attracted a lot of interest in the bandit area. It was introduced in the 1930s but has not been theoretically proven until recent years. All of its analysis in the combinatorial multi-armed bandit (CMAB) setting requires an exact oracle to provide optimal solutions with any input. However, such an oracle is usually not feasible since many combinatorial optimization problems are NP-hard and only approximation oracles are available. An example \cite{WangC18} has shown the failure of TS to learn with an approximation oracle. However, this oracle is uncommon and is designed only for a specific problem instance.

approximation oracle, combinatorial semi-bandit, oracle, (9 more...)

Neural Information Processing Systems

Dec-25-2025, 02:00:28 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.62)