Balancing Performance and Costs in Best Arm Identification

Jun-21-2026, 22:44:32 GMT–Neural Information Processing Systems

We consider the problem of identifying the best arm in a multi-armed bandit model. Despite a wealth of literature on the traditional fixed-budget and fixed-confidence regimes of the best arm identification problem, it remains unclear to most practitioners how to choose an approach and the corresponding budget or confidence parameter. We propose a new formalism that avoids this dilemma altogether by minimizing a risk functional which explicitly balances the performance of the recommended arm against the cost incurred in learning it. In this framework, a cost is incurred for each observation during the sampling phase, and upon recommending an arm, a performance penalty is incurred if the recommended arm is suboptimal. The learner's goal is to minimize the sum of the penalty and the cost. This new regime mirrors the priorities of many practitioners, e.g.
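To make the penalty-plus-cost objective concrete, here is a minimal Python sketch that estimates such a risk by simulation. It is an illustration under stated assumptions, not the paper's method: the rewards are assumed Gaussian with unit variance, the sampling rule is plain uniform allocation, the penalty is taken to be the gap between the best mean and the mean of the recommended arm, and the names `estimate_risk`, `cost_per_sample`, and `samples_per_arm` are hypothetical.

```python
import random


def estimate_risk(means, cost_per_sample, samples_per_arm, trials=2000, seed=0):
    """Monte Carlo estimate of (performance penalty + sampling cost).

    Assumptions (not taken from the source): unit-variance Gaussian rewards,
    uniform sampling, and a penalty equal to the mean gap of the
    recommended arm.
    """
    rng = random.Random(seed)
    best_mean = max(means)
    # The sampling cost is deterministic under uniform allocation.
    cost = cost_per_sample * samples_per_arm * len(means)
    total = 0.0
    for _ in range(trials):
        # Sampling phase: pull every arm the same number of times.
        averages = [
            sum(rng.gauss(mu, 1.0) for _ in range(samples_per_arm)) / samples_per_arm
            for mu in means
        ]
        # Recommendation: the arm with the highest empirical average.
        recommended = max(range(len(means)), key=lambda i: averages[i])
        # Penalty: how much worse the recommended arm is than the best arm.
        penalty = best_mean - means[recommended]
        total += penalty + cost
    return total / trials
```

Sweeping `samples_per_arm` for a fixed `cost_per_sample` is one way to see the trade-off in this toy setting: more samples tend to lower the expected penalty, while the cost term grows linearly with the number of observations.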

artificial intelligence, data mining, machine learning

Country:
- North America > United States (0.28)

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Health & Medicine
  - Therapeutic Area (0.46)
  - Pharmaceuticals & Biotechnology (0.46)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.49)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Machine Learning (1.00)
