Collaborating Authors

 Bowling, Michael


Convergence and No-Regret in Multiagent Learning

Neural Information Processing Systems

Learning in a multiagent system is a challenging problem due to two key factors. First, if other agents are simultaneously learning, then the environment is no longer stationary, thus undermining convergence guarantees. Second, learning is often susceptible to deception, where the other agents may be able to exploit a learner's particular dynamics. In the worst case, this could result in poorer performance than if the agent were not learning at all. These challenges are identifiable in the two most common evaluation criteria for multiagent learning algorithms: convergence and regret. Algorithms focusing on convergence or regret in isolation are numerous. In this paper, we seek to address both criteria in a single algorithm by introducing GIGA-WoLF, a learning algorithm for normal-form games. We prove the algorithm guarantees at most zero average regret, while demonstrating the algorithm converges in many situations of self-play. We prove convergence in a limited setting and give empirical results in a wider variety of situations. These results also suggest a third new learning criterion combining convergence and regret, which we call negative non-convergence regret (NNR).


CMUNITED-98: RoboCup-98 Small-Robot World Champion Team

AI Magazine

The CMUNITED small-robot team became the 1998 RoboCup small-robot league champion, repeating its 1997 victory. This article gives an overview of the CMUNITED-98 team, focusing on this year's improvements.


CMUNITED-98: RoboCup-98 Small-Robot World Champion Team

AI Magazine

Although our previous team had accurate navigation, it was not easily interruptible, which is necessary for operating in a highly dynamic environment. ... and processes the images, giving the positions of each robot and the ball. This information is sent to an off-board controller and distributed ... The final design includes a battery module supplying three independent ... It also includes a single board containing all the required electronic circuitry ... by an array of four infrared sensors, which is enabled or disabled by the software control. ... of inherent mechanical inaccuracies and unforeseen interventions from other agents. ... RoboCup competition in Paris (Stone, Veloso, and Riley 1999; Kitano et al. 1997). These improvements include a robust low-level control algorithm, which handles a moving target with ...