 multiagent environment


Transformer-based Working Memory for Multiagent Reinforcement Learning with Action Parsing

Neural Information Processing Systems

Learning in real-world multiagent tasks is challenging due to the usual partial observability of each agent. Previous efforts alleviate the partial observability by historical hidden states with Recurrent Neural Networks, however, they do not consider the multiagent characters that either the multiagent observation consists of a number of object entities or the action space shows clear entity interactions.


PantheonRL: A MARL Library for Dynamic Training Interactions

arXiv.org Artificial Intelligence

We present PantheonRL, a multiagent reinforcement learning software package for dynamic training interactions such as round-robin, adaptive, and ad-hoc training. Our package is designed around flexible agent objects that can be easily configured to support different training interactions, and handles fully general multiagent environments with mixed rewards and n agents. Built on top of StableBaselines3, our package works directly with existing powerful deep RL algorithms. Finally, PantheonRL comes with an intuitive yet functional web user interface for configuring experiments and launching multiple asynchronous jobs. Our package can be found at https://github.com/Stanford-ILIAD/PantheonRL.


There's More to Life Than Making Plans: Plan Management in Dynamic, Multiagent Environments

AI Magazine

For many years, research in AI plan generation was governed by a number of strong, simplifying assumptions: The planning agent is omniscient, its actions are deterministic and instantaneous, its goals are fixed and categorical, and its environment is static. More recently, researchers have developed expanded planning algorithms that are not predicated on such assumptions, but changing the way in which plans are formed is only part of what is required when the classical assumptions are abandoned. The demands of dynamic, uncertain environments mean that in addition to being able to form plans -- even probabilistic, uncertain plans -- agents must be able to effectively manage their plans. In this article, which is based on a talk given at the 1998 AAAI Fall Symposium on Distributed, Continual Planning, we first identify reasoning tasks that are involved in plan management, including commitment management, environment monitoring, alternative assessment, plan elaboration, metalevel control, and coordination with other agents. We next survey approaches we have developed to many of these tasks and discuss a plan-management system we are building to ground our theoretical work, by providing us with a platform for integrating our techniques and exploring their value in a realistic problem.


Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments

arXiv.org Artificial Intelligence

Recently, multiagent deep reinforcement learning (DRL) has received increasingly wide attention. Existing multiagent DRL algorithms are inefficient when facing the non-stationarity that arises because agents update their policies simultaneously in stochastic cooperative environments. This paper extends the recently proposed weighted double estimator to the multiagent domain and proposes a multiagent DRL framework, named weighted double deep Q-network (WDDQN). By utilizing the weighted double estimator and the deep neural network, WDDQN can not only reduce the bias effectively but also be extended to scenarios with raw visual inputs. To achieve efficient cooperation in the multiagent domain, we introduce the lenient reward network and the scheduled replay strategy. Experiments show that the WDDQN outperforms the existing DRL and multiagent DRL algorithms, i.e., double DQN and lenient Q-learning, in terms of the average reward and the convergence rate in stochastic cooperative environments.
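
As a rough illustration of the weighted double estimator mentioned in the abstract above, the sketch below computes a bootstrap target in the tabular form commonly given for weighted double Q-learning. It is not the WDDQN implementation: the function name `weighted_double_target`, the constant `c`, and the use of plain Python lists for the two value estimates are assumptions made for this example, and the lenient reward network and scheduled replay strategy are not modeled.

```python
def weighted_double_target(q_u_next, q_v_next, reward, gamma=0.99, c=1.0):
    """Bootstrap target for estimator U, mixing U and V (illustrative sketch).

    q_u_next, q_v_next: action values of the two estimators at the next state.
    c: positive constant (an assumption here); larger c shifts weight toward V,
       i.e. toward the plain double estimator.
    """
    if c <= 0:
        raise ValueError("c must be positive")
    actions = range(len(q_u_next))
    # Greedy and least-valued actions according to estimator U.
    a_star = max(actions, key=lambda a: q_u_next[a])
    a_low = min(actions, key=lambda a: q_u_next[a])
    # The weight beta grows with the gap that estimator V sees between them.
    gap = abs(q_v_next[a_star] - q_v_next[a_low])
    beta = gap / (c + gap)
    # Weighted mix of the single estimate (U) and the double estimate (V).
    mixed = beta * q_u_next[a_star] + (1.0 - beta) * q_v_next[a_star]
    return reward + gamma * mixed, beta


target, beta = weighted_double_target(
    q_u_next=[1.0, 3.0, 2.0],
    q_v_next=[0.5, 2.0, 1.0],
    reward=1.0,
    gamma=0.5,
    c=1.5,
)
print(target, beta)
```

With these inputs the weight is 0.5, so the target sits halfway between what the single estimator and the double estimator would each produce.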


Open AI's Algorithm Can Make These Dots Collaborate to Complete a Task

#artificialintelligence

Artificial intelligence is part of humanity's future, but to get there, society needs to pursue AI responsibly. Though the age of super artificial intelligence could prove to be beneficial to humanity, there seems to be an equal chance that AI could be highly destructive. Billionaire and Tesla CEO Elon Musk has made his opinions on the future of artificial intelligence quite clear, stating in an interview, "I think we should be very careful about artificial intelligence. If I had to guess at what our biggest existential threat is, it's probably that. So we need to be very careful. I'm increasingly inclined to think that there should be some regulatory oversight, maybe at the national and international level, just to make sure that we don't do something very foolish."


Learning to Cooperate, Compete, and Communicate

#artificialintelligence

Multiagent environments where agents compete for resources are stepping stones on the path to AGI. Multiagent environments have two useful properties: first, there is a natural curriculum -- the difficulty of the environment is determined by the skill of your competitors (and if you're competing against clones of yourself, the environment exactly matches your skill level). Second, a multiagent environment has no stable equilibrium: no matter how smart an agent is, there's always pressure to get smarter. These environments have a very different feel from traditional environments, and it'll take a lot more research before we become good at them. We've developed a new algorithm, MADDPG, for centralized learning and decentralized execution in multiagent environments, allowing agents to learn to collaborate and compete with each other.
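
The phrase "centralized learning and decentralized execution" in the summary above can be made concrete with a small sketch of who sees what. The code is not MADDPG itself: the linear `actor` and `centralized_critic` functions and their weights are hypothetical stand-ins for neural networks, and no training step is shown. The point is only that each actor reads its own observation, while the critic used during learning reads every agent's observation and action.

```python
def actor(weights, own_obs):
    """Decentralized policy: maps one agent's own observation to an action."""
    return sum(w * o for w, o in zip(weights, own_obs))


def centralized_critic(weights, all_obs, all_actions):
    """Centralized value estimate: scores the joint observations and actions."""
    joint = [x for obs in all_obs for x in obs] + list(all_actions)
    if len(weights) != len(joint):
        raise ValueError("critic weights must match joint input size")
    return sum(w * x for w, x in zip(weights, joint))


# Two agents, each with a 2-dimensional observation (illustrative values).
all_obs = [[1.0, 2.0], [3.0, 4.0]]
actor_weights = [[0.5, -0.5], [1.0, 0.0]]

# Execution: each agent acts from its own observation only.
all_actions = [actor(w, obs) for w, obs in zip(actor_weights, all_obs)]

# Learning: the critic is given everything, which is only feasible in training.
critic_weights = [1.0] * 6
q_value = centralized_critic(critic_weights, all_obs, all_actions)
print(all_actions, q_value)
```

At deployment only the `actor` calls are needed, so agents do not have to share observations at run time.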


A Market-Based Coordination Mechanism for Resource Planning Under Uncertainty

AAAI Conferences

Multiagent Resource Allocation (MARA) distributes a set of resources among a set of intelligent agents in order to respect the preferences of the agents and to maximize some measure of global utility, which may include minimizing total costs or maximizing total return. We are interested in MARA solutions that provide optimal or close-to-optimal allocation of resources in terms of maximizing a global welfare function with low communication and computation cost, with respect to the priority of agents and the temporal dependencies between resources. We propose an MDP approach for resource planning in multiagent environments. Our approach formulates the internal preference modeling and success of each individual agent as a single MDP; then, to optimize global utility, we apply a market-based solution to coordinate these decentralized MDPs.


On the influence of intelligence in (social) intelligence testing environments

arXiv.org Artificial Intelligence

This paper analyses the influence of including agents of different degrees of intelligence in a multiagent system. The goal is to better understand how we can develop intelligence tests that can evaluate social intelligence. We analyse several reinforcement algorithms in several contexts of cooperation and competition. Our experimental setting is inspired by the recently developed Darwin-Wallace distribution.