Efficient Batched Algorithm for Contextual Linear Bandits with Large Action Space via Soft Elimination

Neural Information Processing Systems 

Unlike existing batched algorithms that rely on action elimination and so cannot be implemented for large action sets, our algorithm designs its policy using only a linear optimization oracle over the action set.
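
To illustrate what a linear optimization oracle over an action set can look like, here is a minimal sketch. It is not taken from the paper: the hypercube action set {-1, +1}^d and the function names `hypercube_oracle` and `brute_force_oracle` are assumptions chosen for illustration only. For this set the oracle has a closed form that costs O(d) time, even though the set contains 2^d actions, whereas any method that enumerates actions explicitly scales with 2^d.

```python
import numpy as np


def hypercube_oracle(theta):
    """Hypothetical oracle: argmax of <a, theta> over a in {-1, +1}^d.

    Each coordinate can be maximized independently, so the answer is the
    sign pattern of theta (ties at theta_i == 0 are broken toward +1).
    """
    return np.where(theta >= 0, 1.0, -1.0)


def brute_force_oracle(theta):
    """Same maximization by explicit enumeration of all 2^d actions.

    Included only as a reference; it is feasible for tiny d only.
    """
    d = len(theta)
    best, best_val = None, -np.inf
    for bits in range(2 ** d):
        # Decode the integer `bits` into a vector of +/-1 entries.
        a = np.array([1.0 if (bits >> i) & 1 else -1.0 for i in range(d)])
        val = float(a @ theta)
        if val > best_val:
            best, best_val = a, val
    return best


theta = np.array([0.5, -2.0, 1.5])
print(hypercube_oracle(theta))    # closed-form maximizer
print(brute_force_oracle(theta))  # enumeration over 8 actions
```

Both functions return a maximizer of the same linear objective; only the first remains practical as d grows, which is the kind of access a large action set allows.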
