AITopics | Albrecht, Stefano

$TAR^2$: Temporal-Agent Reward Redistribution for Optimal Policy Preservation in Multi-Agent Reinforcement Learning

Kapoor, Aditya, Tessera, Kale-ab, Baranwal, Mayank, Khadilkar, Harshad, Albrecht, Stefano, Sun, Mingfei

arXiv.org Artificial Intelligence, Feb-7-2025

In cooperative multi-agent reinforcement learning (MARL), learning effective policies is challenging when global rewards are sparse and delayed. This difficulty arises from the need to assign credit across both agents and time steps, a problem that existing methods often fail to address in episodic, long-horizon tasks. We propose Temporal-Agent Reward Redistribution ($TAR^2$), a novel approach that decomposes sparse global rewards into agent-specific, time-step-specific components, thereby providing more frequent and accurate feedback for policy learning. Theoretically, we show that $TAR^2$ (i) aligns with potential-based reward shaping, preserving the same optimal policies as the original environment, and (ii) maintains policy gradient update directions identical to those under the original sparse reward, ensuring unbiased credit signals. Empirical results on two challenging benchmarks, SMACLite and Google Research Football, demonstrate that $TAR^2$ significantly stabilizes and accelerates convergence, outperforming strong baselines such as AREL and STAS in both learning speed and final performance. These findings establish $TAR^2$ as a principled and practical solution for agent-temporal credit assignment in sparse-reward multi-agent systems.
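
The decomposition described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the agent-specific, time-step-specific components are computed, so the hand-picked weights below are placeholders and this is not the authors' method. The function name `redistribute_return` and both weight arguments are hypothetical. The sketch only shows the bookkeeping: one episodic return is split first across time steps and then across agents, and the pieces sum back to the original return.

```python
from typing import List


def redistribute_return(
    episode_return: float,
    temporal_weights: List[float],
    agent_weights: List[List[float]],
) -> List[List[float]]:
    """Split one episodic return into per-time-step, per-agent rewards.

    temporal_weights[t] is the (unnormalized) share of the return given to step t.
    agent_weights[t][i] is the (unnormalized) share of step t's reward given to agent i.
    Weights are assumed to be positive; they are normalized here so that the
    redistributed rewards sum to the original return.
    """
    t_total = sum(temporal_weights)
    rewards = []
    for w_t, row in zip(temporal_weights, agent_weights):
        a_total = sum(row)
        step_reward = episode_return * w_t / t_total  # reward assigned to this time step
        rewards.append([step_reward * w_i / a_total for w_i in row])  # split across agents
    return rewards


# Two time steps, two agents, a single delayed return of 10.0.
rewards = redistribute_return(10.0, [1.0, 3.0], [[1.0, 1.0], [3.0, 1.0]])
total = sum(sum(row) for row in rewards)
```

Here `rewards[t][i]` is the reward credited to agent `i` at step `t`, and `total` recovers the original return of 10.0. In a learned approach the weights would presumably come from a trained model rather than being fixed by hand.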

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2502.04864

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Reports of the 2016 AAAI Workshop Program

Albrecht, Stefano (The University of Texas at Austin) | Bouchard, Bruno (Université du Québec à Chicoutimi) | Brownstein, John S. (Harvard University) | Buckeridge, David L. (McGill University) | Caragea, Cornelia (University of North Texas) | Carter, Kevin M. (MIT Lincoln Laboratory) | Darwiche, Adnan (University of California, Los Angeles) | Fortuna, Blaz (Bloomberg L.P. and Jozef Stefan Institute) | Francillette, Yannick (Université du Québec à Chicoutimi) | Gaboury, Sébastien (Université du Québec à Chicoutimi) | Giles, C. Lee (Pennsylvania State University) | Grobelnik, Marko (Jozef Stefan Institute) | Hruschka, Estevam R. (Federal University of São Carlos) | Kephart, Jeffrey O. (IBM Thomas J. Watson Research Center) | Kordjamshidi, Parisa (University of Illinois at Urbana-Champaign) | Lisy, Viliam (University of Alberta) | Magazzeni, Daniele (King's College London) | Marques-Silva, Joao (University of Lisbon) | Marquis, Pierre (Université d'Artois) | Martinez, David (MIT Lincoln Laboratory) | Michalowski, Martin (Adventium Labs) | Shaban-Nejad, Arash (University of California, Berkeley) | Noorian, Zeinab (Ryerson University) | Pontelli, Enrico (New Mexico State University) | Rogers, Alex (University of Oxford) | Rosenthal, Stephanie (Carnegie Mellon University) | Roth, Dan (University of Illinois at Urbana-Champaign) | Sinha, Arunesh (University of Southern California) | Streilein, William (MIT Lincoln Laboratory) | Thiebaux, Sylvie (The Australian National University) | Tran, Son Cao (New Mexico State University) | Wallace, Byron C. (University of Texas at Austin) | Walsh, Toby (University of New South Wales and Data61) | Witbrock, Michael (Lucid AI) | Zhang, Jie (Nanyang Technological University)

AI Magazine, Oct-7-2016

The Workshop Program of the Association for the Advancement of Artificial Intelligence's Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) was held at the beginning of the conference, February 12-13, 2016. Workshop participants met and discussed issues with a selected focus -- providing an informal setting for active exchange among researchers, developers and users on topics of current interest. To foster interaction and exchange of ideas, the workshops were kept small, with 25-65 participants. Attendance was sometimes limited to active participants only, but most workshops also allowed general registration by other interested individuals.

artificial intelligence, management and information, workshop, (3 more...)

AI Magazine

Industry:

Information Technology (1.00)
Leisure & Entertainment > Games (0.38)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.38)

Reports of the 2016 AAAI Workshop Program

Albrecht, Stefano (The University of Texas at Austin) | Bouchard, Bruno (Université du Québec à Chicoutimi) | Brownstein, John S. (Harvard University) | Buckeridge, David L. (McGill University) | Caragea, Cornelia (University of North Texas) | Carter, Kevin M. (MIT Lincoln Laboratory) | Darwiche, Adnan (University of California, Los Angeles) | Fortuna, Blaz (Bloomberg L.P. and Jozef Stefan Institute) | Francillette, Yannick (Université du Québec à Chicoutimi) | Gaboury, Sébastien (Université du Québec à Chicoutimi) | Giles, C. Lee (Pennsylvania State University) | Grobelnik, Marko (Jozef Stefan Institute) | Hruschka, Estevam R. (Federal University of São Carlos) | Kephart, Jeffrey O. (IBM Thomas J. Watson Research Center) | Kordjamshidi, Parisa (University of Illinois at Urbana-Champaign) | Lisy, Viliam (University of Alberta) | Magazzeni, Daniele (King's College London) | Marques-Silva, Joao (University of Lisbon) | Marquis, Pierre (Université d'Artois) | Martinez, David (MIT Lincoln Laboratory) | Michalowski, Martin (Adventium Labs) | Shaban-Nejad, Arash (University of California, Berkeley) | Noorian, Zeinab (Ryerson University) | Pontelli, Enrico (New Mexico State University) | Rogers, Alex (University of Oxford) | Rosenthal, Stephanie (Carnegie Mellon University) | Roth, Dan (University of Illinois at Urbana-Champaign) | Sinha, Arunesh (University of Southern California) | Streilein, William (MIT Lincoln Laboratory) | Thiebaux, Sylvie (The Australian National University) | Tran, Son Cao (New Mexico State University) | Wallace, Byron C. (University of Texas at Austin) | Walsh, Toby (University of New South Wales and Data61) | Witbrock, Michael (Lucid AI) | Zhang, Jie (Nanyang Technological University)

AI Magazine, Oct-7-2016

The Workshop Program of the Association for the Advancement of Artificial Intelligence’s Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) was held at the beginning of the conference, February 12-13, 2016. Workshop participants met and discussed issues with a selected focus — providing an informal setting for active exchange among researchers, developers and users on topics of current interest. To foster interaction and exchange of ideas, the workshops were kept small, with 25-65 participants. Attendance was sometimes limited to active participants only, but most workshops also allowed general registration by other interested individuals. The AAAI-16 Workshops were an excellent forum for exploring emerging approaches and task areas, for bridging the gaps between AI and other fields or between subfields of AI, for elucidating the results of exploratory research, or for critiquing existing approaches. The fifteen workshops held at AAAI-16 were Artificial Intelligence Applied to Assistive Technologies and Smart Environments (WS-16-01), AI, Ethics, and Society (WS-16-02), Artificial Intelligence for Cyber Security (WS-16-03), Artificial Intelligence for Smart Grids and Smart Buildings (WS-16-04), Beyond NP (WS-16-05), Computer Poker and Imperfect Information Games (WS-16-06), Declarative Learning Based Programming (WS-16-07), Expanding the Boundaries of Health Informatics Using AI (WS-16-08), Incentives and Trust in Electronic Communities (WS-16-09), Knowledge Extraction from Text (WS-16-10), Multiagent Interaction without Prior Coordination (WS-16-11), Planning for Hybrid Systems (WS-16-12), Scholarly Big Data: AI Perspectives, Challenges, and Ideas (WS-16-13), Symbiotic Cognitive Systems (WS-16-14), and World Wide Web and Population Health Intelligence (WS-16-15).

neural network, optimization problem, workshop, (21 more...)

AI Magazine

Country: