AITopics

2209.01353

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Finland (0.04)
North America > United States > Massachusetts > Plymouth County > Norwell (0.04)
Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Information Technology (0.48)
Education (0.46)
Energy (0.46)
Transportation (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)

arXiv.org Artificial IntelligenceMay-19-2023

Phantom -- A RL-driven multi-agent framework to model complex systems

Ardon, Leo, Vann, Jared, Garg, Deepeka, Spooner, Tom, Ganesh, Sumitra

Agent based modelling (ABM) is a computational approach to modelling complex systems by specifying the behaviour of autonomous decision-making components or agents in the system and allowing the system dynamics to emerge from their interactions. Recent advances in the field of Multi-agent reinforcement learning (MARL) have made it feasible to study the equilibrium of complex environments where multiple agents learn simultaneously. However, most ABM frameworks are not RL-native, in that they do not offer concepts and interfaces that are compatible with the use of MARL to learn agent behaviours. In this paper, we introduce a new open-source framework, Phantom, to bridge the gap between ABM and MARL. Phantom is an RL-driven framework for agent-based modelling of complex multi-agent systems including, but not limited to economic systems and markets. The framework aims to provide the tools to simplify the ABM specification in a MARL-compatible way - including features to encode dynamic partial observability, agent utility functions, heterogeneity in agent preferences or types, and constraints on the order in which agents can act (e.g. Stackelberg games, or more complex turn-taking environments). In this paper, we present these features, their design rationale and present two new environments leveraging the framework.

advertiser, agent, artificial intelligence, (17 more...)

2210.06012

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Switzerland (0.04)

Genre: Research Report (0.40)

Industry: Banking & Finance > Trading (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.69)

Belardinelli, Gaia, Bolander, Thomas

Attention! Dynamic Epistemic Logic Models of (In)attentive Agents

arXiv.org Artificial IntelligenceMay-18-2023

Attention is the crucial cognitive ability that limits and selects what information we observe. Previous work by Bolander et al. (2016) proposes a model of attention based on dynamic epistemic logic (DEL) where agents are either fully attentive or not attentive at all. While introducing the realistic feature that inattentive agents believe nothing happens, the model does not represent the most essential aspect of attention: its selectivity. Here, we propose a generalization that allows for paying attention to subsets of atomic formulas. We introduce the corresponding logic for propositional attention, and show its axiomatization to be sound and complete. We then extend the framework to account for inattentive agents that, instead of assuming nothing happens, may default to a specific truth-value of what they failed to attend to (a sort of prior concerning the unattended atoms). This feature allows for a more cognitively plausible representation of the inattentional blindness phenomenon, where agents end up with false beliefs due to their failure to attend to conspicuous but unexpected events. Both versions of the model define attention-based learning through appropriate DEL event models based on a few and clear edge principles. While the size of such event models grow exponentially both with the number of agents and the number of atoms, we introduce a new logical language for describing event models syntactically and show that using this language our event models can be represented linearly in the number of agents and atoms. Furthermore, representing our event models using this language is achieved by a straightforward formalisation of the aforementioned edge principles.

agent, artificial intelligence, event model, (17 more...)

2303.13494

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Denmark > Capital Region > Kongens Lyngby (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.54)

Rios, Manuel, Quijano, Nicanor, Giraldo, Luis Felipe

Understanding the World to Solve Social Dilemmas Using Multi-Agent Reinforcement Learning

arXiv.org Artificial IntelligenceMay-18-2023

Social dilemmas are situations where groups of individuals can benefit from mutual cooperation but conflicting interests impede them from doing so. This type of situations resembles many of humanity's most critical challenges, and discovering mechanisms that facilitate the emergence of cooperative behaviors is still an open problem. In this paper, we study the behavior of self-interested rational agents that learn world models in a multi-agent reinforcement learning (RL) setting and that coexist in environments where social dilemmas can arise. Our simulation results show that groups of agents endowed with world models outperform all the other tested ones when dealing with scenarios where social dilemmas can arise. We exploit the world model architecture to qualitatively assess the learnt dynamics and confirm that each agent's world model is capable to encode information of the behavior of the changing environment and the other agent's actions. This is the first work that shows that world models facilitate the emergence of complex coordinated behaviors that enable interacting agents to ``understand'' both environmental and social dynamics.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2305.11358

Country: South America > Colombia > Bogotá D.C. > Bogotá (0.05)

Genre: Research Report > New Finding (0.68)

Industry: Social Sector (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)

Explainable Multi-Agent Reinforcement Learning for Temporal Queries

Boggess, Kayla, Kraus, Sarit, Feng, Lu

As multi-agent reinforcement learning (MARL) systems are increasingly deployed throughout society, it is imperative yet challenging for users to understand the emergent behaviors of MARL agents in complex environments. This work presents an approach for generating policy-level contrastive explanations for MARL to answer a temporal user query, which specifies a sequence of tasks completed by agents with possible cooperation. The proposed approach encodes the temporal query as a PCTL logic formula and checks if the query is feasible under a given MARL policy via probabilistic model checking. Such explanations can help reconcile discrepancies between the actual and anticipated multi-agent behaviors. The proposed approach also generates correct and complete explanations to pinpoint reasons that make a user query infeasible. We have successfully applied the proposed approach to four benchmark MARL domains (up to 9 agents in one domain). Moreover, the results of a user study show that the generated explanations significantly improve user performance and satisfaction.

explanation, machine learning, reinforcement learning, (18 more...)

2305.10378

Country:

North America > United States > Virginia (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel (0.04)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.94)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)

Waelchli, Daniel, Weber, Pascal, Koumoutsakos, Petros

Discovering Individual Rewards in Collective Behavior through Inverse Multi-Agent Reinforcement Learning

The discovery of individual objectives in collective behavior of complex dynamical systems such as fish schools and bacteria colonies is a long-standing challenge. Inverse reinforcement learning is a potent approach for addressing this challenge but its applicability to dynamical systems, involving continuous state-action spaces and multiple interacting agents, has been limited. In this study, we tackle this challenge by introducing an off-policy inverse multi-agent reinforcement learning algorithm (IMARL). Our approach combines the ReF-ER techniques with guided cost learning. By leveraging demonstrations, our algorithm automatically uncovers the reward function and learns an effective policy for the agents. Through extensive experimentation, we demonstrate that the proposed policy captures the behavior observed in the provided data, and achieves promising results across problem domains including single agent models in the OpenAI gym and multi-agent models of schooling behavior. The present study shows that the proposed IMARL algorithm is a significant step towards understanding collective dynamics from the perspective of its constituents, and showcases its value as a tool for studying complex physical systems exhibiting collective behaviour.

machine learning, reinforcement learning, trajectory, (14 more...)

2305.10548

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(13 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Synthesizing Resilient Strategies for Infinite-Horizon Objectives in Multi-Agent Systems

Klaška, David, Kučera, Antonín, Kurečka, Martin, Musil, Vít, Novotný, Petr, Řehák, Vojtěch

We consider the problem of synthesizing resilient and stochastically stable strategies for systems of cooperating agents striving to minimize the expected time between consecutive visits to selected locations in a known environment. A strategy profile is resilient if it retains its functionality even if some of the agents fail, and stochastically stable if the visiting time variance is small. We design a novel specification language for objectives involving resilience and stochastic stability, and we show how to efficiently compute strategy profiles (for both autonomous and coordinated agents) optimizing these objectives. Our experiments show that our strategy synthesis algorithm can construct highly non-trivial and efficient strategy profiles for environments with general topology.

agent, artificial intelligence, configuration, (17 more...)

2305.1007

Country:

Europe > Portugal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Czechia > South Moravian Region > Brno (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.88)

Ceylan, Esra, Chen, Jiehua, Roy, Sanjukta

Optimal Seat Arrangement: What Are the Hard and Easy Cases?

We study four NP-hard optimal seat arrangement problems [Bodlaender et al., 2020a], which each have as input a set of n agents, where each agent has cardinal preferences over other agents, and an n-vertex undirected graph (called seat graph). The task is to assign each agent to a distinct vertex in the seat graph such that either the sum of utilities or the minimum utility is maximized, or it is envy-free or exchange-stable. Aiming at identifying hard and easy cases, we extensively study the algorithmic complexity of the four problems by looking into natural graph classes for the seat graph (e.g., paths, cycles, stars, or matchings), problem-specific parameters (e.g., the number of non-isolated vertices in the seat graph or the maximum number of agents towards whom an agent has non-zero preferences), and preference structures (e.g., non-negative or symmetric preferences). For strict preferences and seat graphs with disjoint edges and isolated vertices, we correct an error by Bodlaender et al. [2020b] and show that finding an envy-free arrangement remains NP-hard in this case.

agent, artificial intelligence, vertex, (18 more...)

2305.10381

Country:

North America > United States > Pennsylvania (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.49)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)

arXiv.org Artificial IntelligenceMay-16-2023

Dual Self-Awareness Value Decomposition Framework without Individual Global Max for Cooperative Multi-Agent Reinforcement Learning

Xu, Zhiwei, Zhang, Bin, Li, Dapeng, Zhou, Guangchong, Zhang, Zeren, Fan, Guoliang

Value decomposition methods have gained popularity in the field of cooperative multi-agent reinforcement learning. However, almost all existing methods follow the principle of Individual Global Max (IGM) or its variants, which limits their problem-solving capabilities. To address this, we propose a dual self-awareness value decomposition framework, inspired by the notion of dual self-awareness in psychology, that entirely rejects the IGM premise. Each agent consists of an ego policy for action selection and an alter ego value function to solve the credit assignment problem. The value function factorization can ignore the IGM assumption by utilizing an explicit search procedure. On the basis of the above, we also suggest a novel anti-ego exploration mechanism to avoid the algorithm becoming stuck in a local optimum. As the first fully IGM-free value decomposition method, our proposed framework achieves desirable performance in various cooperative tasks.

joint action, machine learning, reinforcement learning, (15 more...)

2302.0218

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial IntelligenceMay-15-2023

Curriculum Learning for Relative Overgeneralization

Shi, Lin, Peng, Bei

In multi-agent reinforcement learning (MARL), many popular methods, such as VDN and QMIX, are susceptible to a critical multi-agent pathology known as relative overgeneralization (RO), which arises when the optimal joint action's utility falls below that of a sub-optimal joint action in cooperative tasks. RO can cause the agents to get stuck into local optima or fail to solve cooperative tasks that require significant coordination between agents within a given timestep. Recent value-based MARL algorithms such as QPLEX and WQMIX can overcome RO to some extent. However, our experimental results show that they can still fail to solve cooperative tasks that exhibit strong RO. In this work, we propose a novel approach called curriculum learning for relative overgeneralization (CURO) to better overcome RO. To solve a target task that exhibits strong RO, in CURO, we first fine-tune the reward function of the target task to generate source tasks that are tailored to the current ability of the learning agent and train the agent on these source tasks first. Then, to effectively transfer the knowledge acquired in one task to the next, we use a transfer learning method that combines value function transfer with buffer transfer, which enables more efficient exploration in the target task. We demonstrate that, when applied to QMIX, CURO overcomes severe RO problem and significantly improves performance, yielding state-of-the-art results in a variety of cooperative multi-agent tasks, including the challenging StarCraft II micromanagement benchmarks.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2212.02733

Country:

Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Leisure & Entertainment (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.49)