Learning Near-Pareto-Optimal Conventions in Polynomial Time

Wang, Xiaofeng, Sandholm, Tuomas

Neural Information Processing Systems 

We study how to learn to play a Pareto-optimal strict Nash equilibrium when multiple equilibria exist and agents may have different preferences among them. We focus on repeated coordination games of non-identical interest in which agents do not know the game structure up front and receive noisy payoffs. We design efficient near-optimal algorithms for both the perfect monitoring setting and the imperfect monitoring setting (where the agents observe only their own payoffs and the joint actions).
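
To make the target concept concrete, the following minimal sketch enumerates the strict Nash equilibria of a small two-player coordination game and keeps those that are not Pareto-dominated by another strict equilibrium. It is an illustration only, not the learning algorithm of the paper: the game is assumed to be fully known and noise-free, the payoff table `GAME` and the function names are hypothetical, and Pareto optimality is taken relative to the other strict equilibria (an assumption of this sketch).

```python
from itertools import product

# Hypothetical 3x3 coordination game: GAME[(a1, a2)] = (payoff to agent 1, payoff to agent 2).
# Agents are rewarded only when they pick the same action, but they rank the
# coordinated outcomes differently (non-identical interest).
GAME = {(a1, a2): (0, 0) for a1, a2 in product(range(3), repeat=2)}
GAME[(0, 0)] = (3, 2)  # preferred by agent 1
GAME[(1, 1)] = (2, 3)  # preferred by agent 2
GAME[(2, 2)] = (1, 1)  # worse for both agents


def strict_nash_equilibria(payoffs):
    """Return joint actions where every unilateral deviation strictly lowers the deviator's payoff."""
    actions1 = sorted({a1 for a1, _ in payoffs})
    actions2 = sorted({a2 for _, a2 in payoffs})
    equilibria = []
    for a1, a2 in product(actions1, actions2):
        u1, u2 = payoffs[(a1, a2)]
        agent1_strict = all(payoffs[(x, a2)][0] < u1 for x in actions1 if x != a1)
        agent2_strict = all(payoffs[(a1, y)][1] < u2 for y in actions2 if y != a2)
        if agent1_strict and agent2_strict:
            equilibria.append((a1, a2))
    return equilibria


def pareto_optimal(profiles, payoffs):
    """Keep the profiles that no other profile in the list Pareto-dominates."""
    def dominates(p, q):
        up, uq = payoffs[p], payoffs[q]
        weakly_better = all(x >= y for x, y in zip(up, uq))
        strictly_better = any(x > y for x, y in zip(up, uq))
        return weakly_better and strictly_better

    return [p for p in profiles if not any(dominates(q, p) for q in profiles)]


equilibria = strict_nash_equilibria(GAME)
conventions = pareto_optimal(equilibria, GAME)
print(equilibria)   # all three coordinated outcomes
print(conventions)  # the two outcomes that remain after removing the dominated one
```

In this toy game both remaining equilibria are Pareto-optimal, yet each agent prefers a different one, which is the kind of conflict the abstract refers to.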
