Playing in the Dark: No-regret Learning with Adversarial Constraints

Oct-29-2023–arXiv.org Artificial Intelligence

We study a generalization of the classic Online Convex Optimization (OCO) framework by considering additional long-term adversarial constraints. Specifically, after an online policy decides its action on a round, in addition to a convex cost function, the adversary also reveals a set of $k$ convex constraints. The cost and the constraint functions could change arbitrarily with time, and no information about the future functions is assumed to be available. In this paper, we propose a meta-policy that simultaneously achieves a sublinear cumulative constraint violation and a sublinear regret. This is achieved via a black box reduction of the constrained problem to the standard OCO problem for a recursively constructed sequence of surrogate cost functions. We show that optimal performance bounds can be achieved by solving the surrogate problem using any adaptive OCO policy enjoying a standard data-dependent regret bound. A new Lyapunov-based proof technique is presented that reveals a connection between regret and certain sequential inequalities through a novel decomposition result. We conclude the paper by highlighting applications to online multi-task learning and network control problems.

constraint, constraint function, violation, (16 more...)

arXiv.org Artificial Intelligence

Oct-29-2023

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - New Jersey > Mercer County > Princeton (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia > India
  - West Bengal > Kolkata (0.04)
  - Maharashtra > Mumbai (0.04)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Representation & Reasoning > Constraint-Based Reasoning (0.91)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found