On Kernelized Multi-Armed Bandits with Constraints

Oct-9-2024, 08:36:35 GMT–Neural Information Processing Systems

We study a stochastic bandit problem with a general unknown reward function and a general unknown constraint function. Both functions can be non-linear (even non-convex) and are assumed to lie in a reproducing kernel Hilbert space (RKHS) with a bounded norm. In contrast to the safety-type hard constraints studied in prior work, we consider soft constraints that may be violated in any round as long as the cumulative violation is small, a setting motivated by various practical applications. Our ultimate goal is to study how to exploit the nature of soft constraints to attain a finer complexity-regret-constraint trade-off in the kernelized bandit setting. To this end, leveraging primal-dual optimization, we propose a general framework for both algorithm design and performance analysis.
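To make the primal-dual idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a finite set of one-dimensional arms, a squared-exponential kernel, a constraint of the form g(x) <= 0, and a GP-UCB-style learner. All names (`rbf_kernel`, `gp_posterior`, `primal_dual_gp_ucb`) and parameters (`beta`, `eta`, `rho`, `lengthscale`) are hypothetical choices for illustration. In each round the learner plays the arm that maximizes an optimistic reward estimate minus the dual variable times an optimistic (lower) constraint estimate, then moves the dual variable in the direction of the estimated violation.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two 1-D point sets (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=0.1):
    """Zero-mean GP posterior mean and standard deviation at x_query."""
    if len(x_obs) == 0:
        # No data yet: return the prior (mean 0, unit variance).
        return np.zeros(len(x_query)), np.ones(len(x_query))
    K = rbf_kernel(x_obs, x_obs) + noise**2 * np.eye(len(x_obs))
    k_star = rbf_kernel(x_obs, x_query)
    solved = np.linalg.solve(K, k_star)          # K^{-1} k_star
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_star * solved, axis=0)  # k(x, x) = 1 for this kernel
    return mean, np.sqrt(np.clip(var, 0.0, None))

def primal_dual_gp_ucb(f, g, arms, horizon=100, beta=2.0, eta=0.5,
                       rho=10.0, noise=0.1, seed=0):
    """Illustrative primal-dual loop for the soft constraint g(x) <= 0.

    beta (confidence width), eta (dual step size), and rho (dual cap)
    are hypothetical tuning parameters, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    xs, rewards, costs = [], [], []
    phi = 0.0  # dual variable (price of constraint violation)
    for _ in range(horizon):
        x_obs = np.array(xs)
        mu_f, sd_f = gp_posterior(x_obs, np.array(rewards), arms, noise)
        mu_g, sd_g = gp_posterior(x_obs, np.array(costs), arms, noise)
        ucb_f = mu_f + beta * sd_f   # optimistic reward estimate
        lcb_g = mu_g - beta * sd_g   # optimistic constraint estimate
        # Primal step: maximize the estimated Lagrangian over the arms.
        i = int(np.argmax(ucb_f - phi * lcb_g))
        x = arms[i]
        xs.append(x)
        rewards.append(f(x) + noise * rng.standard_normal())
        costs.append(g(x) + noise * rng.standard_normal())
        # Dual step: projected ascent on the estimated violation.
        phi = float(np.clip(phi + eta * lcb_g[i], 0.0, rho))
    return np.array(xs), phi
```

As a usage sketch, `primal_dual_gp_ucb(lambda x: x, lambda x: x - 0.5, np.linspace(0.0, 1.0, 21))` runs the loop on a toy problem whose constraint is x <= 0.5. The cumulative violation of the played arms can then be computed as `max(0.0, float(np.sum(xs - 0.5)))`. Individual rounds may violate the constraint, which is the sense in which it is soft.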

constraint, kernelized multi-armed bandit, soft constraint, (2 more...)

Technology:
- Information Technology
  - Data Science > Data Mining
    - Big Data (0.77)
  - Artificial Intelligence > Representation & Reasoning
    - Constraint-Based Reasoning (1.00)