Phased Exploration with Greedy Exploitation in Stochastic Combinatorial Partial Monitoring Games

Apr-21-2026, 22:09:33 GMT–Neural Information Processing Systems

Partial monitoring games are repeated games in which the learner receives feedback that may differ from the adversary's move, or even from the reward gained by the learner. Recently, a general model of combinatorial partial monitoring (CPM) games was proposed [1], in which the learner's action space can be exponentially large and the adversary samples its moves from a bounded, continuous space according to a fixed distribution. That paper gave a confidence-bound-based algorithm (GCB) that achieves O(T^{2/3} log T) distribution-independent and O(log T) distribution-dependent regret bounds. The implementation of their algorithm depends on two separate offline oracles, and the distribution-dependent regret additionally requires the existence of a unique optimal action for the learner. Adopting their CPM model, our first contribution is a Phased Exploration with Greedy Exploitation (PEGE) algorithmic framework for the problem.
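To make the idea of alternating exploration phases with greedy exploitation concrete, the following is a minimal sketch on a toy Bernoulli multi-armed bandit. It is not the paper's PEGE algorithm: it assumes direct reward feedback rather than partial monitoring, a small finite action set rather than a combinatorial one, and a hypothetical schedule in which cycle c explores each action once and then exploits for c rounds. The function name and all parameters are illustrative assumptions.

```python
import random


def phased_explore_greedy_exploit(means, horizon, seed=0):
    """Toy phased explore/exploit loop on a Bernoulli bandit (illustration only)."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # exploration pulls per action
    sums = [0.0] * k      # reward collected during exploration, per action
    reward = 0.0
    explore_rounds = 0
    exploit_rounds = 0
    t = 0
    phase = 1             # assumed schedule: cycle c exploits for c rounds

    def pull(a):
        # Bernoulli reward with success probability means[a].
        return 1.0 if rng.random() < means[a] else 0.0

    while t < horizon:
        # Exploration phase: play every action once and update the estimates.
        for a in range(k):
            if t >= horizon:
                break
            r = pull(a)
            counts[a] += 1
            sums[a] += r
            reward += r
            explore_rounds += 1
            t += 1
        if t >= horizon:
            break
        # Exploitation phase: commit to the empirically best action for
        # `phase` rounds; estimates are not updated during these rounds.
        best = max(range(k), key=lambda a: sums[a] / counts[a])
        for _ in range(phase):
            if t >= horizon:
                break
            reward += pull(best)
            exploit_rounds += 1
            t += 1
        phase += 1

    return reward, explore_rounds, exploit_rounds
```

Calling `phased_explore_greedy_exploit([0.0, 1.0], 100)` returns the total reward together with the number of exploration and exploitation rounds. Because the exploitation phases grow while each exploration phase stays the same length, the share of rounds spent exploring shrinks as the horizon increases.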



Country:
- North America > United States > Michigan (0.14)

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning (1.00)
  - Data Science (0.94)
