Model Selection for Contextual Bandits
Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo
Neural Information Processing Systems
We introduce the problem of model selection for contextual bandits, where a learner must adapt to the complexity of the optimal policy while balancing exploration and exploitation. Our main result is a new model selection guarantee for linear contextual bandits.
Jan-23-2025, 07:07:22 GMT