Queue Up Your Regrets: Achieving the Dynamic Capacity Region of Multiplayer Bandits

Apr-24-2026, 09:17:13 GMT – Neural Information Processing Systems

Consider N cooperative agents such that, for T turns, each agent n takes an action a_n and receives a stochastic reward r_n(a_1, ..., a_N). Agents cannot observe the actions of other agents and do not know even their own reward functions. The agents can communicate with their neighbors on a connected graph G with diameter d(G). We want each agent n to achieve an expected average reward of at least λ_n over time, for a given quality of service (QoS) vector λ. A QoS vector λ is not necessarily achievable.
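To make the setting concrete, the following is a minimal sketch of the problem setup only, not of the paper's algorithm. It assumes Bernoulli rewards, a hypothetical collision-style reward function (`collision_mean`), and a placeholder policy in which every agent picks its action uniformly at random. All function names and numeric values are illustrative assumptions. The sketch estimates each agent's average reward and checks it against a QoS vector λ.

```python
import random
from typing import Callable, List, Sequence, Tuple


def run_uniform_policy(
    num_agents: int,
    num_actions: int,
    mean_reward: Callable[[int, Tuple[int, ...]], float],
    turns: int,
    seed: int = 0,
) -> List[float]:
    """Return each agent's empirical average reward over `turns` rounds.

    Placeholder policy (an assumption, not the paper's method): every agent
    draws its action uniformly at random, without observing the others.
    """
    rng = random.Random(seed)
    totals = [0.0] * num_agents
    for _ in range(turns):
        # Joint action (a_1, ..., a_N); no agent sees the others' choices.
        joint = tuple(rng.randrange(num_actions) for _ in range(num_agents))
        for n in range(num_agents):
            # Stochastic reward r_n: Bernoulli with a mean set by the joint action.
            if rng.random() < mean_reward(n, joint):
                totals[n] += 1.0
    return [total / turns for total in totals]


def meets_qos(avg_rewards: Sequence[float], qos: Sequence[float]) -> List[bool]:
    """Check, per agent, whether the average reward is at least lambda_n."""
    return [avg >= lam for avg, lam in zip(avg_rewards, qos)]


def collision_mean(n: int, joint: Tuple[int, ...]) -> float:
    """Hypothetical reward: mean 0.8 if agent n is alone on its action, else 0."""
    return 0.8 if joint.count(joint[n]) == 1 else 0.0


averages = run_uniform_policy(2, 2, collision_mean, turns=10_000)
print(averages, meets_qos(averages, qos=[0.3, 0.5]))
```

Under these assumptions two agents collide half the time, so each agent's expected average reward is 0.4. The first QoS target (0.3) is met and the second (0.5) is not, which illustrates that a given λ need not be achievable under a particular policy.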

agent, algorithm, artificial intelligence, (15 more...)

Genre:
- Research Report (0.46)

Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)
