AITopics | Agent Societies

Multi-Agent Path Finding on Strongly Connected Digraphs: feasibility and solution algorithms

Ardizzoni, Stefano, Saccani, Irene, Consolini, Luca, Locatelli, Marco

arXiv.org Artificial Intelligence, Sep-9-2022

On an assigned graph, the problem of Multi-Agent Pathfinding (MAPF) consists in finding paths for multiple agents, avoiding collisions. Finding the minimum-length solution is known to be NP-hard, and computation time grows exponentially with the number of agents. However, in industrial applications, it is important to find feasible, suboptimal solutions in a time that grows polynomially with the number of agents. Such algorithms exist for undirected and biconnected directed graphs. Our main contribution is to generalize these algorithms to the more general case of strongly connected directed graphs. In particular, given a MAPF problem with at least two holes, we present an algorithm that checks the problem's feasibility in linear time with respect to the number of nodes, and provides a feasible solution in polynomial time.
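The setting described in this abstract has two preconditions that are easy to check: the digraph is strongly connected, and at least two nodes are unoccupied ("holes"). The sketch below checks only those preconditions in time linear in the number of nodes and edges. It is not the authors' feasibility algorithm; the function names and the node/edge representation are assumptions made for illustration.

```python
from collections import deque


def reachable(adj, start):
    """Return the set of nodes reachable from `start` via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def is_strongly_connected(nodes, edges):
    """Check strong connectivity with one forward and one reverse search.

    Assumes every edge endpoint appears in `nodes`.
    """
    nodes = list(nodes)
    if not nodes:
        return True
    fwd = {u: [] for u in nodes}
    rev = {u: [] for u in nodes}
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)
    root = nodes[0]
    # Strongly connected iff the root reaches every node and every node reaches the root.
    return (len(reachable(fwd, root)) == len(nodes)
            and len(reachable(rev, root)) == len(nodes))


def count_holes(nodes, agent_positions):
    """Number of nodes not occupied by any agent (hypothetical helper)."""
    return len(set(nodes)) - len(set(agent_positions))


if __name__ == "__main__":
    cycle_nodes = [0, 1, 2, 3, 4]
    cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
    print(is_strongly_connected(cycle_nodes, cycle_edges))
    print(count_holes(cycle_nodes, [0, 1, 2]))
```

Passing both checks only means an instance falls within the class the abstract addresses; deciding feasibility itself requires the algorithm in the paper.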

artificial intelligence, configuration, digraph, (16 more...)

arXiv.org Artificial Intelligence

2209.04286

Country: Europe > Italy (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)

On the Near-Optimality of Local Policies in Large Cooperative Multi-Agent Reinforcement Learning

Mondal, Washim Uddin, Aggarwal, Vaneet, Ukkusuri, Satish V.

arXiv.org Artificial Intelligence, Sep-7-2022

We show that in a cooperative $N$-agent network, one can design locally executable policies for the agents such that the resulting discounted sum of average rewards (value) well approximates the optimal value computed over all (including non-local) policies. Specifically, we prove that, if $|\mathcal{X}|, |\mathcal{U}|$ denote the sizes of the state and action spaces of individual agents, then for a sufficiently small discount factor, the approximation error is given by $\mathcal{O}(e)$ where $e\triangleq \frac{1}{\sqrt{N}}\left[\sqrt{|\mathcal{X}|}+\sqrt{|\mathcal{U}|}\right]$. Moreover, in a special case where the reward and state transition functions are independent of the action distribution of the population, the error improves to $\mathcal{O}(e)$ where $e\triangleq \frac{1}{\sqrt{N}}\sqrt{|\mathcal{X}|}$. Finally, we also devise an algorithm to explicitly construct a local policy. With the help of our approximation results, we further establish that the constructed local policy is within $\mathcal{O}(\max\{e,\epsilon\})$ distance of the optimal policy, and the sample complexity to achieve such a local policy is $\mathcal{O}(\epsilon^{-3})$, for any $\epsilon>0$.
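To get a feel for how the quantity $e$ in this abstract scales, the following minimal sketch evaluates both forms of it for given values of $N$, $|\mathcal{X}|$ and $|\mathcal{U}|$. The function name and the example numbers are assumptions; the result is only the term inside the $\mathcal{O}(\cdot)$, so the hidden constants are not accounted for.

```python
import math


def local_policy_error_scale(n_agents, n_states, n_actions=None):
    """Evaluate e = (sqrt(|X|) + sqrt(|U|)) / sqrt(N).

    If `n_actions` is None, evaluate the special-case form
    e = sqrt(|X|) / sqrt(N) stated in the abstract.
    """
    if n_agents <= 0 or n_states <= 0:
        raise ValueError("n_agents and n_states must be positive")
    total = math.sqrt(n_states)
    if n_actions is not None:
        if n_actions <= 0:
            raise ValueError("n_actions must be positive")
        total += math.sqrt(n_actions)
    return total / math.sqrt(n_agents)


if __name__ == "__main__":
    # Illustrative values only: 100 agents, 4 states, 9 actions.
    print(local_policy_error_scale(100, 4, 9))
    print(local_policy_error_scale(100, 4))
```

Quadrupling the number of agents halves $e$ in both forms, which is the $1/\sqrt{N}$ behaviour the abstract states.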

machine learning, marl, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2209.03491

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.82)

Robust Event-Driven Interactions in Cooperative Multi-Agent Learning

Ornia, Daniel Jarne, Mazo, Manuel Jr

arXiv.org Artificial Intelligence, Sep-7-2022

Lately, with the wide adoption of Deep Learning techniques for compact representations of value functions and policies in model-free problems [16, 21, 34], the field of Multi-Agent Reinforcement Learning (MARL) has seen an explosion in the applications of such algorithms to solve real-world problems [19]. However, this has naturally led to a trend where both the amount of data handled in such data-driven approaches and the complexity of the targeted problems grow exponentially. In a MARL setting where communication between agents is required, this may inevitably lead to restrictive requirements on the frequency and reliability of the communication to and from each agent (as already pointed out in [23]). The effect of asynchronous communication in dynamic programming problems was already studied in [2]. In particular, one of the first examples of how communication affects learning and policy performance in MARL is found in [31], where the author investigates the impact of agents sharing different combinations of state variable subsets or Q values.

agent, communication, robust event-driven interaction, (10 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-15839-1_16

2204.03361

Country:

Europe > Netherlands > South Holland > Delft (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

A New Approach to Training Multiple Cooperative Agents for Autonomous Driving

Yang, Ruiyang, Li, Siheng, Jin, Beihong

arXiv.org Artificial Intelligence, Sep-5-2022

Training multiple agents to perform safe and cooperative control in the complex scenarios of autonomous driving has been a challenge. For a small fleet of cars moving together, this paper proposes Lepus, a new approach to training multiple agents. Lepus trains the agents in a purely cooperative manner, with policy networks that share parameters and a reward function shared by all agents. In particular, Lepus pre-trains the policy networks via an adversarial process, improving their collaborative decision-making capability and, in turn, the stability of car driving. Moreover, to alleviate the problem of sparse rewards, Lepus learns an approximate reward function from expert trajectories by combining a random network and a distillation network. We conduct extensive experiments on the MADRaS simulation platform. The experimental results show that multiple agents trained by Lepus can avoid as many collisions as possible while driving simultaneously and outperform the other four methods, namely DDPG-FDE, PSDDPG, MADDPG, and MAGAIL(DDPG), in terms of stability.

agent, car agent, policy network, (16 more...)

arXiv.org Artificial Intelligence

2209.02157

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.70)

Industry:

Transportation > Ground > Road (0.63)
Information Technology > Robotics & Automation (0.63)
Automobiles & Trucks (0.63)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.86)

#IJCAI invited talk: engineering social and collaborative agents with Ana Paiva

Robohub, Sep-4-2022, 08:53:33 GMT

The 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence (IJCAI-ECAI 2022) took place from 23-29 July, in Vienna. The title of Ana Paiva's talk was "Engineering sociality and collaboration in AI systems". Robots are widely used in industrial settings, but what happens when they enter our everyday world, and, specifically, social situations? Ana believes that social robots, chatbots and social agents have the potential to change the way we interact with technology. She envisages a hybrid society where humans and AI systems work in tandem.

architecture, collaborative agent, robot, (8 more...)

Robohub

Country:

Europe > Austria > Vienna (0.25)
Europe > Portugal > Lisbon > Lisbon (0.05)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.56)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.40)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.36)

#IJCAI invited talk: engineering social and collaborative agents with Ana Paiva

AIHub, Sep-2-2022, 13:50:21 GMT

The 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence (IJCAI-ECAI 2022) took place from 23-29 July, in Vienna. In this post, we continue our round-up of the invited talks, summarising the presentation by Ana Paiva, University of Lisbon and INESC-ID. The title of her talk was "Engineering sociality and collaboration in AI systems". Robots are widely used in industrial settings, but what happens when they enter our everyday world, and, specifically, social situations? Ana believes that social robots, chatbots and social agents have the potential to change the way we interact with technology.

architecture, collaborative agent, robot, (8 more...)

AIHub

Country:

Europe > Portugal > Lisbon > Lisbon (0.25)
Europe > Austria > Vienna (0.25)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.56)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.40)

Learning Practical Communication Strategies in Cooperative Multi-Agent Reinforcement Learning

Hu, Diyi, Zhang, Chi, Prasanna, Viktor, Krishnamachari, Bhaskar

arXiv.org Artificial Intelligence, Sep-2-2022

In Multi-Agent Reinforcement Learning, communication is critical to encourage cooperation among agents. Communication in realistic wireless networks can be highly unreliable due to network conditions varying with agents' mobility, and stochasticity in the transmission process. We propose a framework to learn practical communication strategies by addressing three fundamental questions: (1) When: Agents learn the timing of communication based on not only message importance but also wireless channel conditions. (2) What: Agents augment message contents with wireless network measurements to better select the game and communication actions. (3) How: Agents use a novel neural message encoder to preserve all information from received messages, regardless of the number and order of messages. Simulating standard benchmarks under realistic wireless network settings, we show significant improvements in game performance, convergence speed and communication efficiency compared with state-of-the-art.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2209.01288

Country: North America > United States > California (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Cooperative Online Learning in Stochastic and Adversarial MDPs

Lancewicki, Tal, Rosenberg, Aviv, Mansour, Yishay

arXiv.org Artificial Intelligence, Sep-1-2022

We study cooperative online learning in stochastic and adversarial Markov decision processes (MDPs). That is, in each episode, $m$ agents interact with an MDP simultaneously and share information in order to minimize their individual regret. We consider environments with two types of randomness: "fresh" -- where each agent's trajectory is sampled i.i.d., and "non-fresh" -- where the realization is shared by all agents (but each agent's trajectory is also affected by its own actions). More precisely, with non-fresh randomness the realization of every cost and transition is fixed at the start of each episode, and agents that take the same action in the same state at the same time observe the same cost and next state. We thoroughly analyze all relevant settings, highlight the challenges and differences between the models, and prove nearly-matching regret lower and upper bounds. To our knowledge, we are the first to consider cooperative reinforcement learning (RL) with either non-fresh randomness or in adversarial MDPs.
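The distinction between the two types of randomness can be illustrated with a toy cost sampler. This is a sketch under stated assumptions, not the paper's model: `sample_cost` is a hypothetical function, costs are uniform on [0, 1), and the randomness is emulated by seeding a generator from a key. With non-fresh randomness the key omits the agent, so agents in the same state taking the same action at the same time see the same cost; with fresh randomness each agent gets its own draw.

```python
import random


def sample_cost(episode, t, state, action, agent, fresh):
    """Toy cost realization in [0, 1) (hypothetical helper).

    fresh=True:  the draw depends on the agent, emulating per-agent sampling.
    fresh=False: the draw is fixed per (episode, t, state, action) and is
                 therefore shared by every agent that reaches that tuple.
    """
    if fresh:
        key = (episode, t, state, action, agent)
    else:
        key = (episode, t, state, action)
    # Seeding from the key's string form makes the realization reproducible.
    rng = random.Random(repr(key))
    return rng.random()


if __name__ == "__main__":
    shared_a = sample_cost(0, 3, "s1", "a0", agent=0, fresh=False)
    shared_b = sample_cost(0, 3, "s1", "a0", agent=1, fresh=False)
    print(shared_a == shared_b)  # agents observe the same realization

    own_a = sample_cost(0, 3, "s1", "a0", agent=0, fresh=True)
    own_b = sample_cost(0, 3, "s1", "a0", agent=1, fresh=True)
    print(own_a, own_b)  # separate draws per agent
```

In the non-fresh case, two agents following the same policy learn nothing extra from each other's observations, which hints at why the abstract treats the two models separately.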

agent, cooperative online learning, learning, (13 more...)

arXiv.org Artificial Intelligence

2201.1317

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Maryland > Baltimore (0.04)
(2 more...)

Genre: Research Report (0.81)

Industry: Education > Educational Setting > Online (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Scalable Model-based Policy Optimization for Decentralized Networked Systems

Du, Yali, Ma, Chengdong, Liu, Yuchen, Lin, Runji, Dong, Hao, Wang, Jun, Yang, Yaodong

arXiv.org Artificial Intelligence, Sep-1-2022

Reinforcement learning algorithms require a large number of samples; this often limits their real-world applications on even simple tasks. This challenge is more pronounced in multi-agent tasks, as each step of operation is more costly, requiring communication or the shifting of resources. This work aims to improve the data efficiency of multi-agent control by model-based learning. We consider networked systems where agents are cooperative and communicate only locally with their neighbors, and propose the decentralized model-based policy optimization framework (DMPO). In our method, each agent learns a dynamic model to predict future states and broadcasts its predictions by communication, and then the policies are trained under the model rollouts. To alleviate the bias of model-generated data, we restrict the model usage to generating myopic rollouts, thus reducing the compounding error of model generation. To preserve the independence of policy updates, we introduce an extended value function and theoretically prove that the resulting policy gradient is a close approximation to the true policy gradient. We evaluate our algorithm on several benchmarks for intelligent transportation systems, which are connected autonomous vehicle control tasks (Flow and CACC) and adaptive traffic signal control (ATSC). Empirical results show that our method achieves superior data efficiency and matches the performance of model-free methods using true models.

agent, algorithm, value function, (17 more...)

arXiv.org Artificial Intelligence

2207.06559

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Asia > China > Fujian Province > Xiamen (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.83)

Industry:

Transportation > Ground > Road (0.88)
Transportation > Infrastructure & Services (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Learning Equilibria in Mean-Field Games: Introducing Mean-Field PSRO

Muller, Paul, Rowland, Mark, Elie, Romuald, Piliouras, Georgios, Perolat, Julien, Lauriere, Mathieu, Marinier, Raphael, Pietquin, Olivier, Tuyls, Karl

arXiv.org Artificial Intelligence, Aug-29-2022

Recent advances in multiagent learning have seen the introduction of a family of algorithms that revolve around the population-based training method PSRO, showing convergence to Nash, correlated and coarse correlated equilibria. Notably, when the number of agents increases, learning best-responses becomes exponentially more difficult, and as such hampers PSRO training methods. The paradigm of mean-field games provides an asymptotic solution to this problem when the considered games are anonymous-symmetric. Unfortunately, the mean-field approximation introduces non-linearities which prevent a straightforward adaptation of PSRO. Building upon optimization and adversarial regret minimization, this paper sidesteps this issue and introduces mean-field PSRO, an adaptation of PSRO which learns Nash, coarse correlated and correlated equilibria in mean-field games. The key is to replace the exact distribution computation step by newly-defined mean-field no-adversarial-regret learners, or by black-box optimization. We compare the asymptotic complexity of the approach to standard PSRO, greatly improve empirical bandit convergence speed by compressing temporal mixture weights, and ensure it is theoretically robust to payoff noise. Finally, we illustrate the speed and accuracy of mean-field PSRO on several mean-field games, demonstrating convergence to strong and weak equilibria.

algorithm, equilibria, equilibrium, (14 more...)

arXiv.org Artificial Intelligence

2111.0835

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)
