Efficient Batched Algorithm for Contextual Linear Bandits with Large Action Space via Soft Elimination
–Neural Information Processing Systems
In this paper, we provide the first efficient batched algorithm for contextual linear bandits with large action spaces. Unlike existing batched algorithms that rely on action elimination, which are not implementable for large action sets, our algorithm only uses a linear optimization oracle over the action set to design the policy. The proposed algorithm achieves a regret upper bound Õ( T) with high probability, and uses O(log log T) batches, matching the lower bound on the number of batches [13].
Neural Information Processing Systems
May-25-2025, 09:08:39 GMT
- Country:
- Asia > Middle East
- Qatar (0.14)
- North America > United States
- California (0.14)
- Asia > Middle East
- Genre:
- Research Report (0.46)
- Industry:
- Health & Medicine (0.67)
- Technology: