Multi-Armed Bandits with Metric Movement Costs

Oct-8-2024, 07:47:15 GMT–Neural Information Processing Systems

We consider the non-stochastic Multi-Armed Bandit problem in a setting where there is a fixed and known metric on the action space that determines a cost for switching between any pair of actions. The loss of the online learner has two components: the first is the usual loss of the selected actions, and the second is an additional loss due to switching between actions. Our main contribution gives a tight characterization of the expected minimax regret in this setting, in terms of a complexity measure C of the underlying metric which depends on its covering numbers.
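
The two-part loss described above can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the three-point action space, and the loss table are hypothetical, and it assumes the comparator is the best fixed action in hindsight, which by definition pays no movement cost.

```python
from typing import Callable, Sequence


def cumulative_cost(
    losses: Sequence[Sequence[float]],
    actions: Sequence[int],
    metric: Callable[[int, int], float],
) -> float:
    """Loss of the chosen actions plus the metric cost of every switch."""
    # First component: the usual loss of the action selected in each round.
    action_loss = sum(losses[t][a] for t, a in enumerate(actions))
    # Second component: movement cost between consecutive actions.
    movement = sum(metric(prev, cur) for prev, cur in zip(actions, actions[1:]))
    return action_loss + movement


def regret_vs_best_fixed(
    losses: Sequence[Sequence[float]],
    actions: Sequence[int],
    metric: Callable[[int, int], float],
) -> float:
    """Realized regret against the best single action (assumed comparator)."""
    n_actions = len(losses[0])
    best_fixed = min(
        sum(round_losses[a] for round_losses in losses) for a in range(n_actions)
    )
    return cumulative_cost(losses, actions, metric) - best_fixed


# Hypothetical example: three actions placed on the interval [0, 1].
points = [0.0, 0.5, 1.0]


def line_metric(i: int, j: int) -> float:
    return abs(points[i] - points[j])


losses = [
    [0.2, 0.9, 0.4],
    [0.8, 0.1, 0.5],
    [0.3, 0.7, 0.0],
]
actions = [0, 1, 2]  # the learner switches actions in every round

total = cumulative_cost(losses, actions, line_metric)
regret = regret_vs_best_fixed(losses, actions, line_metric)
```

In this toy run the learner picks the lowest-loss action in every round, yet the two switches cost more than the loss they save, which is the tension the movement-cost setting is meant to capture.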

artificial intelligence, data mining, machine learning
