Convergence and No-Regret in Multiagent Learning

Dec-31-2005–Neural Information Processing Systems

Learning in a multiagent system is a challenging problem due to two key factors. First, if other agents are simultaneously learning, then the environment is no longer stationary, which undermines convergence guarantees. Second, learning is often susceptible to deception, where the other agents may be able to exploit a learner's particular dynamics. In the worst case, this could result in poorer performance than if the agent were not learning at all. These challenges are identifiable in the two most common evaluation criteria for multiagent learning algorithms: convergence and regret. Algorithms focusing on convergence or regret in isolation are numerous. In this paper, we seek to address both criteria in a single algorithm by introducing GIGA-WoLF, a learning algorithm for normal-form games. We prove the algorithm guarantees at most zero average regret, while demonstrating the algorithm converges in many situations of self-play. We prove convergence in a limited setting and give empirical results in a wider variety of situations. These results also suggest a third new learning criterion combining convergence and regret, which we call negative non-convergence regret (NNR).
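To make the regret criterion concrete, the following is a minimal sketch of how average regret is commonly computed for one player in a repeated normal-form game. It is not GIGA-WoLF itself, whose update rule is not given in this abstract. The function name `average_regret` and its argument layout are assumptions chosen for illustration, and the definition used (total reward of the best fixed action in hindsight, minus the reward actually earned, divided by the number of rounds) is the standard external-regret notion, which may differ in detail from the paper's formal statement.

```python
from typing import Sequence


def average_regret(
    strategies: Sequence[Sequence[float]],
    rewards: Sequence[Sequence[float]],
) -> float:
    """Average external regret of a played sequence (illustrative helper).

    strategies[t][a] is the probability the learner put on action a in round t.
    rewards[t][a] is the reward action a would have earned in round t,
    given what the other agents actually played.
    """
    if not strategies or len(strategies) != len(rewards):
        raise ValueError("need one strategy and one reward vector per round")

    rounds = len(rewards)
    n_actions = len(rewards[0])

    # Expected reward the learner actually earned over all rounds.
    earned = sum(
        sum(p * r for p, r in zip(x, rv)) for x, rv in zip(strategies, rewards)
    )

    # Total reward of the best single action, chosen with hindsight.
    best_fixed = max(sum(rv[a] for rv in rewards) for a in range(n_actions))

    return (best_fixed - earned) / rounds


if __name__ == "__main__":
    # Matching pennies from the row player's view: the opponent alternates,
    # and the learner mixes uniformly in both rounds.
    uniform = [[0.5, 0.5], [0.5, 0.5]]
    alternating = [[1.0, -1.0], [-1.0, 1.0]]
    print(average_regret(uniform, alternating))
```

Under this definition, a no-regret guarantee says the returned value is at most zero in the limit as the number of rounds grows, whatever the other agents do.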

algorithm, artificial intelligence, machine learning

Country:
- North America > Canada > Alberta (0.29)

Industry:
- Leisure & Entertainment > Games (0.95)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents (1.00)
  - Machine Learning (1.00)
