Learning in Games: Robustness of Fast Convergence

Neural Information Processing Systems

We show that learning algorithms satisfying a low approximate regret property experience fast convergence to approximate optimality in a large class of repeated games. Our property, which simply requires that each learner has small regret compared to a (1+ε)-multiplicative approximation to the best action in hindsight, is ubiquitous among learning algorithms; it is satisfied even by the vanilla Hedge forecaster. Our results improve upon recent work of Syrgkanis et al. in a number of ways. We require only that players observe payoffs under other players' realized actions, as opposed to expected payoffs. We further show that convergence occurs with high probability, and that it extends to the bandit-feedback setting.
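To make the property concrete, the following is a minimal Python sketch (not from the paper) of the vanilla Hedge forecaster and of the (1+ε)-approximate regret it is measured against. It assumes payoffs in [0, 1], a fixed learning rate, and full-information feedback on the payoff vector induced by the other players' realized actions each round; the names `hedge`, `approx_regret`, `eta`, and `eps` are illustrative.

```python
import numpy as np

def hedge(payoffs, eta=0.1):
    """Vanilla Hedge: multiplicative-weights forecaster over K actions.

    payoffs: (T, K) array; payoffs[t, k] is the payoff of action k in
    round t under the other players' realized actions, assumed in [0, 1].
    Returns the (T, K) sequence of mixed strategies played.
    """
    T, K = payoffs.shape
    weights = np.ones(K)
    strategies = np.empty((T, K))
    for t in range(T):
        p = weights / weights.sum()          # current mixed strategy
        strategies[t] = p
        weights *= np.exp(eta * payoffs[t])  # multiplicative update on observed payoffs
    return strategies

def approx_regret(payoffs, strategies, eps):
    """(1+eps)-approximate regret: the learner's cumulative payoff is
    compared against 1/(1+eps) times the best fixed action in hindsight,
    rather than the full benchmark."""
    learner = np.sum(strategies * payoffs)   # learner's expected cumulative payoff
    best_fixed = payoffs.sum(axis=0).max()   # best single action in hindsight
    return best_fixed / (1.0 + eps) - learner

# Illustrative run on synthetic payoffs.
rng = np.random.default_rng(0)
payoffs = rng.uniform(size=(1000, 5))
strats = hedge(payoffs, eta=0.1)
print(approx_regret(payoffs, strats, eps=0.1))
```

The low approximate regret property asks that the quantity returned by `approx_regret` stay small; relaxing the benchmark by the multiplicative (1+ε) factor is what makes the property so widely satisfied.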




Understanding Model Selection for Learning in Strategic Environments

Neural Information Processing Systems

The deployment of ever-larger machine learning models reflects a growing consensus that the more expressive the model class one optimizes over, and the more data one has access to, the more one can improve performance. As models get deployed in a variety of real-world scenarios, they inevitably face strategic environments.