AITopics

2111.12961

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)

Konda, Rohit, Grimsman, David, Marden, Jason

Execution Order Matters in Greedy Algorithms with Limited Information

In this work, we study the multi-agent decision problem where agents try to coordinate to optimize a given system-level objective. While solving for the global optimal is intractable in many cases, the greedy algorithm is a well-studied and efficient way to provide good approximate solutions - notably for submodular optimization problems. Executing the greedy algorithm requires the agents to be ordered and execute a local optimization based on the solutions of the previous agents. However, in limited information settings, passing the solution from the previous agents may be nontrivial, as some agents may not be able to directly communicate with each other. Thus the communication time required to execute the greedy algorithm is closely tied to the order that the agents are given. In this work, we characterize interplay between the communication complexity and agent orderings by showing that the complexity using the best ordering is O(n) and increases considerably to O(n^2) when using the worst ordering. Motivated by this, we also propose an algorithm that can find an ordering and execute the greedy algorithm quickly, in a distributed fashion. We also show that such an execution of the greedy algorithm is advantageous over current methods for distributed submodular maximization.

algorithm, artificial intelligence, communication time, (15 more...)

2111.09154

Country:

North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
Europe > Switzerland (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.64)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.49)

Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning

Zhang, Yizhou, Qu, Guannan, Xu, Pan, Lin, Yiheng, Chen, Zaiwei, Wierman, Adam

We study a multi-agent reinforcement learning (MARL) problem where the agents interact over a given network. The goal of the agents is to cooperatively maximize the average of their entropy-regularized long-term rewards. To overcome the curse of dimensionality and to reduce communication, we propose a Localized Policy Iteration (LPI) algorithm that provably learns a near-globally-optimal policy using only local information. In particular, we show that, despite restricting each agent's attention to only its $\kappa$-hop neighborhood, the agents are able to learn a policy with an optimality gap that decays polynomially in $\kappa$. In addition, we show the finite-sample convergence of LPI to the global optimal policy, which explicitly captures the trade-off between optimality and computational complexity in choosing $\kappa$. Numerical simulations demonstrate the effectiveness of LPI.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2211.17116

Country:

North America > United States > California (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.45)

Towards True Lossless Sparse Communication in Multi-Agent Systems

Karten, Seth, Tucker, Mycal, Kailas, Siva, Sycara, Katia

Communication enables agents to cooperate to achieve their goals. Learning when to communicate, i.e., sparse (in time) communication, and whom to message is particularly important when bandwidth is limited. Recent work in learning sparse individualized communication, however, suffers from high variance during training, where decreasing communication comes at the cost of decreased reward, particularly in cooperative tasks. We use the information bottleneck to reframe sparsity as a representation learning problem, which we show naturally enables lossless sparse communication at lower budgets than prior art. In this paper, we propose a method for true lossless sparsity in communication via Information Maximizing Gated Sparse Multi-Agent Communication (IMGS-MAC). Our model uses two individualized regularization objectives, an information maximization autoencoder and sparse communication loss, to create informative and sparse communication. We evaluate the learned communication `language' through direct causal analysis of messages in non-sparse runs to determine the range of lossless sparse budgets, which allow zero-shot sparsity, and the range of sparse budgets that will inquire a reward loss, which is minimized by our learned gating function with few-shot sparsity. To demonstrate the efficacy of our results, we experiment in cooperative multi-agent tasks where communication is essential for success. We evaluate our model with both continuous and discrete messages. We focus our analysis on a variety of ablations to show the effect of message representations, including their properties, and lossless performance of our model.

artificial intelligence, communication, machine learning, (16 more...)

2212.00115

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)

Finkelstein, Mira, Liu, Lucy, Schlot, Nitsan Levy, Kolumbus, Yoav, Parkes, David C., Rosenshein, Jeffrey S., Keren, Sarah

Explainable Reinforcement Learning via Model Transforms

Understanding emerging behaviors of reinforcement learning (RL) agents may be difficult since such agents are often trained in complex environments using highly complex decision making procedures. This has given rise to a variety of approaches to explainability in RL that aim to reconcile discrepancies that may arise between the behavior of an agent and the behavior that is anticipated by an observer. Most recent approaches have relied either on domain knowledge that may not always be available, on an analysis of the agent's policy, or on an analysis of specific elements of the underlying environment, typically modeled as a Markov Decision Process (MDP). Our key claim is that even if the underlying model is not fully known (e.g., the transition probabilities have not been accurately learned) or is not maintained by the agent (i.e., when using model-free methods), the model can nevertheless be exploited to automatically generate explanations. For this purpose, we suggest using formal MDP abstractions and transforms, previously used in the literature for expediting the search for optimal policies, to automatically produce explanations. Since such transforms are typically based on a symbolic representation of the environment, they can provide meaningful explanations for gaps between the anticipated and actual agent behavior. We formally define the explainability problem, suggest a class of transforms that can be used for explaining emergent behaviors, and suggest methods that enable efficient search for an explanation. We demonstrate the approach on a set of standard benchmarks.

explanation, machine learning, reinforcement learning, (17 more...)

2209.12006

Country:

Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
Africa > Benin (0.04)

Genre: Research Report (0.82)

Industry: Transportation > Ground > Road (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)

Wang, Wenlong, Pfeiffer, Thomas

Decision Market Based Learning For Multi-agent Contextual Bandit Problems

Information is often stored in a distributed and proprietary form, and agents who own information are often self-interested and require incentives to reveal their information. Suitable mechanisms are required to elicit and aggregate such distributed information for decision making. In this paper, we use simulations to investigate the use of decision markets as mechanisms in a multi-agent learning system to aggregate distributed information for decision-making in a contextual bandit problem. The system utilises strictly proper decision scoring rules to assess the accuracy of probabilistic reports from agents, which allows agents to learn to solve the contextual bandit problem jointly. Our simulations show that our multi-agent system with distributed information can be trained as efficiently as a centralised counterpart with a single agent that receives all information. Moreover, we use our system to investigate scenarios with deterministic decision scoring rules which are not incentive compatible. We observe the emergence of more complex dynamics with manipulative behaviour, which agrees with existing theoretical analyses.

agent, artificial intelligence, bayesian inference, (19 more...)

2212.00271

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Florida > Hillsborough County > University (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (0.56)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

arXiv.org Artificial IntelligenceNov-25-2022

An agent-based simulation model of pedestrian evacuation based on Bayesian Nash Equilibrium

Wang, Yiyu, Ge, Jiaqi, Comber, Alexis

Large public gatherings or crowds are commonplace and have been the subject of simulation research in many studies related to crowd management, disaster management and evacuation planning (Babojelić and Novacko 2020). However, in-depth research on pedestrians has been hindered by difficulties such as complex individual behaviours, different disaster characteristics, and varying environmental factors (Wijermans and Templeton 2022). As evacuee behaviour and movement vary in different scenarios, a number of field observations and simulation experiments have been conducted to explore pedestrian flows, movement patterns and potential factors affecting evacuation under different types of emergencies (Rozo et al. 2019; Feng et al. 2021; Sevtsuk and Kalvo 2022). Despite many simulation studies of pedestrian behaviours, few common behavioural features of pedestrian flows have been explored (Vermuyten et al. 2016; Babojelić and Novacko 2020). One of the main obstacles is the lack of experimental datasets that closely match individual movements during evacuations in the real world.

agent, artificial intelligence, modeling & simulation, (16 more...)

doi: 10.18564/jasss.5037

2211.1426

Country:

North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > United Kingdom > England > West Yorkshire > Leeds (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment > Games (0.71)
Transportation (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)

Okumura, Keisuke, Tixeuil, Sébastien

Fault-Tolerant Offline Multi-Agent Path Planning

arXiv.org Artificial IntelligenceNov-25-2022

We study a novel graph path planning problem for multiple agents that may crash at runtime, and block part of the workspace. In our setting, agents can detect neighboring crashed agents, and change followed paths at runtime. The objective is then to prepare a set of paths and switching rules for each agent, ensuring that all correct agents reach their destinations without collisions or deadlocks, despite unforeseen crashes of other agents. Such planning is attractive to build reliable multi-robot systems. We present problem formalization, theoretical analysis such as computational complexities, and how to solve this offline planning problem.

agent, artificial intelligence, planning & scheduling, (18 more...)

2211.13908

Country:

Europe > France (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Gupta, Abhi, Moskovitz, Ted, Alvarez-Melis, David, Pacchiano, Aldo

Transfer RL via the Undo Maps Formalism

arXiv.org Artificial IntelligenceNov-25-2022

Transferring knowledge across domains is one of the most fundamental problems in machine learning, but doing so effectively in the context of reinforcement learning remains largely an open problem. Current methods make strong assumptions on the specifics of the task, often lack principled objectives, and -- crucially -- modify individual policies, which might be sub-optimal when the domains differ due to a drift in the state space, i.e., it is intrinsic to the environment and therefore affects every agent interacting with it. To address these drawbacks, we propose TvD: transfer via distribution matching, a framework to transfer knowledge across interactive domains. We approach the problem from a data-centric perspective, characterizing the discrepancy in environments by means of (potentially complex) transformation between their state spaces, and thus posing the problem of transfer as learning to undo this transformation. To accomplish this, we introduce a novel optimization objective based on an optimal transport distance between two distributions over trajectories -- those generated by an already-learned policy in the source domain and a learnable pushforward policy in the target domain. We show this objective leads to a policy update scheme reminiscent of imitation learning, and derive an efficient algorithm to implement it. Our experiments in simple gridworlds show that this method yields successful transfer learning across a wide range of environment transformations.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2211.14469

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.35)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)

Valiente, Rodolfo, Toghi, Behrad, Razzaghpour, Mahdi, Pedarsani, Ramtin, Fallah, Yaser P.

Learning-based social coordination to improve safety and robustness of cooperative autonomous vehicles in mixed traffic

arXiv.org Artificial IntelligenceNov-21-2022

It is expected that autonomous vehicles(AVs) and heterogeneous human-driven vehicles(HVs) will coexist on the same road. The safety and reliability of AVs will depend on their social awareness and their ability to engage in complex social interactions in a socially accepted manner. However, AVs are still inefficient in terms of cooperating with HVs and struggle to understand and adapt to human behavior, which is particularly challenging in mixed autonomy. In a road shared by AVs and HVs, the social preferences or individual traits of HVs are unknown to the AVs and different from AVs, which are expected to follow a policy, HVs are particularly difficult to forecast since they do not necessarily follow a stationary policy. To address these challenges, we frame the mixed-autonomy problem as a multi-agent reinforcement learning (MARL) problem and propose an approach that allows AVs to learn the decision-making of HVs implicitly from experience, account for all vehicles' interests, and safely adapt to other traffic situations. In contrast with existing works, we quantify AVs' social preferences and propose a distributed reward structure that introduces altruism into their decision-making process, allowing the altruistic AVs to learn to establish coalitions and influence the behavior of HVs.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2211.11963

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > California (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)
Transportation > Infrastructure & Services (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.67)