Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees

Neural Information Processing Systems 

While the primary concern in this domain is usually to find realizable representations (i.e., those that allow predicting the reward function at any context-action

Similar Docs  Excel Report  more

TitleSimilaritySource
None found