Powers, Rob
New Criteria and a New Algorithm for Learning in Multi-Agent Systems
Powers, Rob, Shoham, Yoav
We propose a new set of criteria for learning algorithms in multi-agent systems, one that is more stringent and (we argue) better justified than previously proposed criteria. Our criteria, which apply most straightforwardly in repeated games with average rewards, consist of three requirements: (a) against a specified class of opponents (this class is a parameter of the criterion), the algorithm yields a payoff that approaches the payoff of the best response; (b) against other opponents, the algorithm's payoff at least approaches (and possibly exceeds) the security-level payoff (or maximin value); and (c) subject to these requirements, the algorithm achieves a payoff close to optimal in self-play. We furthermore require that these average payoffs be achieved quickly. We then present a novel algorithm and show that it meets these new criteria for a particular parameter class, the class of stationary opponents. Finally, we show that the algorithm is effective not only in theory but also empirically: using a recently introduced comprehensive game-theoretic test suite, we show that it almost universally outperforms previous learning algorithms.
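For concreteness, the three requirements can be sketched as bounds on the average payoff $\hat{r}_T = \frac{1}{T}\sum_{t=1}^{T} r_t$ after $T$ rounds of the repeated game. The notation below (the opponent class $\mathcal{O}$, the tolerance $\epsilon$, the best-response value $\mathrm{BR}(o)$, and the self-play target $u^{*}$) is our own shorthand, not the paper's, and the published criteria state the quantifiers and time bounds more carefully:

\[
\begin{aligned}
\text{(a)}\quad & \hat{r}_T \;\ge\; \mathrm{BR}(o) - \epsilon \quad \text{against every opponent } o \in \mathcal{O},\\
\text{(b)}\quad & \hat{r}_T \;\ge\; \max_{\pi}\min_{\pi'} u(\pi,\pi') - \epsilon \quad \text{against arbitrary opponents},\\
\text{(c)}\quad & \hat{r}_T \;\ge\; u^{*} - \epsilon \quad \text{in self-play, subject to (a) and (b),}
\end{aligned}
\]

with each bound required to hold with high probability once $T$ exceeds a bound polynomial in the problem parameters, reflecting the requirement that these average payoffs be achieved quickly.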
DERVISH An Office-Navigating Robot
Nourbakhsh, Illah, Powers, Rob, Birchfield, Stan
Dervish won the Office Delivery event of the 1994 Robot Competition and Exhibition, held as part of the Twelfth National Conference on Artificial Intelligence. Although the contest required Dervish to navigate in an artificial office environment, the official goal of the contest was to push the technology of robot navigation in real office buildings with minimal domain information. Dervish navigates reliably using retractable assumptions that simplify the planning problem. In this article, we present a short description of Dervish's hardware and low-level motion modules. We then discuss this assumptive system in more detail.
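The phrase "retractable assumptions" describes a plan-execute-monitor loop: plan under a simplifying assumption, and when execution refutes it, retract the assumption and replan under a weaker world model. The following is a minimal runnable sketch of that idea in a toy two-route domain of our own invention; the function names, the door assumption, and the planner are hypothetical illustrations, not Dervish's actual software:

def make_plan(assume_doors_open: bool) -> list[str]:
    # Hypothetical planner: the open-doors assumption licenses the short route.
    if assume_doors_open:
        return ["hall", "door A", "hall", "goal"]
    return ["hall", "detour", "detour", "hall", "goal"]

def execute(step: str, closed_doors: set[str]) -> str:
    # Hypothetical executor/sensor: report what was actually observed.
    return "blocked" if step in closed_doors else step

def navigate(closed_doors: set[str]) -> list[str]:
    assume_doors_open = True  # start with the simplifying assumption
    trace: list[str] = []
    while True:
        for step in make_plan(assume_doors_open):
            obs = execute(step, closed_doors)
            if obs == "blocked":
                if not assume_doors_open:
                    # No assumption left to retract: the failure is real.
                    return trace + ["failed"]
                # The world refuted the assumption: retract it and replan
                # under the weaker (harder to plan in, but safer) model.
                assume_doors_open = False
                trace.append("retracted: doors may be closed")
                break
            trace.append(obs)
            if obs == "goal":
                return trace

print(navigate({"door A"}))
# ['hall', 'retracted: doors may be closed',
#  'hall', 'detour', 'detour', 'hall', 'goal']

The point of the sketch is the payoff structure: planning is cheap while the assumption holds, and sensing only has to detect violations, which is far easier than maintaining a fully general world model.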