Tumer, Kagan


Agent Partitioning with Reward/Utility-Based Impact

AAAI Conferences

Reinforcement learning with reward shaping is a well-established but often computationally expensive approach to large multiagent systems. Agent partitioning can reduce this computational complexity by treating each partition of agents as an independent problem. We introduce a novel agent partitioning approach called Reward/Utility-Based Impact (RUBI). RUBI finds an effective partitioning of agents while requiring no prior domain knowledge, improves performance by discovering a non-trivial agent partitioning, and leads to faster simulations. We test RUBI in the Air Traffic Flow Management Problem (ATFMP), where tens of thousands of aircraft affect the system and there is no obvious similarity metric between agents. Partitioning with RUBI in the ATFMP yields a 37% increase in performance and a 510x speedup over non-partitioning approaches. Additionally, RUBI matches the performance of the current domain-dependent ATFMP gold standard while using no prior knowledge and running 10% faster.
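The partitioning idea can be sketched as follows: estimate, by counterfactual removal, how much each agent's reward depends on every other agent, then group agents whose mutual impact is high. This is a minimal illustrative sketch, not the paper's algorithm; the simulator interface evaluate_rewards and the threshold-based grouping are assumptions.

    def pairwise_impact(n_agents, evaluate_rewards, episodes=50):
        """Estimate impact[i][j]: mean change in agent i's reward when agent j
        is removed. evaluate_rewards(active) -> list of per-agent rewards,
        indexed by agent id (entries of removed agents are ignored); this
        simulator interface is assumed for illustration."""
        impact = [[0.0] * n_agents for _ in range(n_agents)]
        everyone = set(range(n_agents))
        for _ in range(episodes):
            base = evaluate_rewards(everyone)
            for j in range(n_agents):
                reduced = evaluate_rewards(everyone - {j})
                for i in everyone - {j}:
                    impact[i][j] += abs(base[i] - reduced[i]) / episodes
        return impact

    def partition_by_impact(impact, threshold):
        """Greedy union of agents whose mutual impact exceeds a threshold;
        an illustrative stand-in for the paper's partitioning step."""
        n = len(impact)
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        for i in range(n):
            for j in range(i + 1, n):
                if max(impact[i][j], impact[j][i]) > threshold:
                    parent[find(i)] = find(j)
        groups = {}
        for i in range(n):
            groups.setdefault(find(i), []).append(i)
        return list(groups.values())

Agents whose removal barely changes anyone else's reward end up in their own partitions and can then be trained as independent subproblems.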


Multirobot Coordination for Space Exploration

AI Magazine

Teams of artificially intelligent planetary rovers have tremendous potential for space exploration, allowing for reduced cost, increased flexibility, and increased reliability. However, having these multiple autonomous devices acting simultaneously leads to a problem of coordination: to achieve the best results, they should work together. This is not a simple task. Due to the large distances and harsh environments, a rover must be able to perform a wide variety of tasks with a wide variety of potential teammates in uncertain and unsafe environments. Directly coding all the necessary rules that can reliably handle all of this coordination and uncertainty is problematic. Instead, this article examines tackling this problem through the use of coordinated reinforcement learning: rather than being programmed with what to do, the rovers iteratively learn through trial and error to take actions that lead to high overall system return. To allow for coordination, yet allow each agent to learn and act independently, we employ state-of-the-art reward shaping techniques. This article uses visualization techniques to break down complex performance indicators into an accessible form, and identifies key future research directions.
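The reward shaping referred to here builds on difference rewards, which credit each rover with its marginal contribution to the system utility G. A minimal sketch, assuming G is available as a black-box function of the joint action (the names and the removal-based counterfactual are illustrative choices):

    def difference_reward(G, joint_action, agent):
        """D_i = G(z) - G(z_{-i}): the change in system utility when agent
        i's contribution is removed. Removing the agent (rather than
        substituting a default action) is one common counterfactual."""
        without_agent = {a: act for a, act in joint_action.items() if a != agent}
        return G(joint_action) - G(without_agent)

    # Example: G counts the distinct points of interest observed by the team.
    G = lambda z: float(len(set(z.values())))
    print(difference_reward(G, {"rover1": "poi_a", "rover2": "poi_a"}, "rover2"))  # 0.0
    print(difference_reward(G, {"rover1": "poi_a", "rover2": "poi_b"}, "rover2"))  # 1.0

A rover observing a redundant point of interest receives zero credit, which is what lets each rover learn independently while still improving the overall system return.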


Multiagent Learning with a Noisy Global Reward Signal

AAAI Conferences

Scaling multiagent reinforcement learning to domains with many agents is a complex problem. In particular, multiagent credit assignment becomes a key issue as the system size increases. Some multiagent systems suffer from a global reward signal that is very noisy or difficult to analyze. This makes deriving a learnable local reward signal very difficult. Difference rewards (a particular instance of reward shaping) have been used to alleviate this concern, but they remain difficult to compute in many domains. In this paper we present an approach to modeling the global reward using function approximation that allows the quick computation of local rewards. We demonstrate how this model can result in significant improvements in behavior for three congestion problems: a multiagent ``bar problem'', a complex simulation of the United States airspace, and a generic air traffic domain. We show how the model of the global reward may be either learned on- or off-line using either linear functions or neural networks. For the bar problem, we show an increase in reward of nearly 200% over learning using the global reward directly. For the air traffic problem, we show a decrease in costs of 25% over learning using the global reward directly.
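A minimal sketch of the modeling step, assuming each episode is logged as a feature vector of the joint action plus the observed (noisy) global reward; the least-squares fit corresponds to the linear-function option mentioned above, and the feature encoding is an assumption:

    import numpy as np

    def fit_global_reward(X, g):
        """Fit a linear model G_hat(x) = x @ w from logged feature vectors X
        (episodes x features) and observed global rewards g. This can be
        done on- or off-line; a neural network could replace the linear fit."""
        w, *_ = np.linalg.lstsq(X, g, rcond=None)
        return w

    def local_reward(w, x, x_without_i):
        """Estimated difference reward D_i = G_hat(z) - G_hat(z_{-i}),
        computed from the learned model rather than the noisy system."""
        return x @ w - x_without_i @ w

Once w is fit, each agent's local reward costs only two dot products per step, instead of a counterfactual re-evaluation of the full system.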


Elo Ratings for Structural Credit Assignment in Multiagent Systems

AAAI Conferences

In this paper we investigate the application of Elo ratings (originally designed for 2-player chess) to a heterogeneous nonlinear multiagent system to determine an agent's overall impact on its team's performance. Measuring this impact has been attempted in many different ways, including reward shaping; the generation of hierarchies, holarchies, and teams; mechanism design; and the creation of subgoals. We show that in a multiagent system, an Elo rating will accurately reflect an agent's ability to contribute positively to a team's success with no feedback other than a repeated binary win/loss signal. The Elo rating measures not only ``personal'' success, but also success in assisting other agents to perform favorably.
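For reference, the underlying machinery is the standard Elo update, driven only by a binary win/loss outcome; the paper's extension applies it per agent across repeated team games, which this textbook two-player sketch does not show:

    def elo_update(r_winner, r_loser, k=32.0):
        """One standard Elo update from a binary win/loss signal. Expected
        score for the winner: E = 1 / (1 + 10 ** ((r_loser - r_winner) / 400))."""
        expected_win = 1.0 / (1.0 + 10.0 ** ((r_loser - r_winner) / 400.0))
        r_winner += k * (1.0 - expected_win)         # actual score 1: gain for the shortfall
        r_loser += k * (0.0 - (1.0 - expected_win))  # actual score 0: symmetric penalty
        return r_winner, r_loser

    # Two equally rated agents: the winner gains k/2 = 16 points, the loser loses 16.
    print(elo_update(1500.0, 1500.0))  # (1516.0, 1484.0)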



Ten Years of AAMAS: Introduction to the Special Issue

AI Magazine

The articles in this special issue have been specifically commissioned to provide a snapshot of current activity in the autonomous agents and multiagent systems communities.


The 2002 AAAI Spring Symposium Series

AI Magazine

The Association for the Advancement of Artificial Intelligence, in cooperation with Stanford University's Department of Computer Science, presented the 2002 Spring Symposium Series, held Monday through Wednesday, 25 to 27 March 2002, at Stanford University. The nine symposia were entitled (1) Acquiring (and Using) Linguistic (and World) Knowledge for Information Access; (2) Artificial Intelligence and Interactive Entertainment; (3) Collaborative Learning Agents; (4) Information Refinement and Revision for Decision Making: Modeling for Diagnostics, Prognostics, and Prediction; (5) Intelligent Distributed and Embedded Systems; (6) Logic-Based Program Synthesis: State of the Art and Future Trends; (7) Mining Answers from Texts and Knowledge Bases; (8) Safe Learning Agents; and (9) Sketch Understanding.

