Bandit Learning in Concave N-Person Games

Mario Bravo, David Leslie, Panayotis Mertikopoulos

Feb-12-2026, 17:47:03 GMT–Neural Information Processing Systems

The bane of decision-making in an unknown environment is regret: no one wants to realize in hindsight that the decision policy they employed was strictly inferior to a plain policy prescribing the same action throughout. For obvious reasons, this issue becomes considerably more intricate when the decision-maker is subject to situational uncertainty and the "fog of war": when the only information at the optimizer's disposal is the reward obtained from a given action (the so-called "bandit" framework), is it even possible to design a no-regret policy?



Country:
- Europe > France (0.04)
- North America
  - United States (0.04)
  - Canada > Quebec
    - Montreal (0.04)

Technology:
- Information Technology > Game Theory (1.00)
