AITopics

2012.12689

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Industry:

Government (1.00)
Banking & Finance > Economy (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.93)

Kemeth, Felix P., Bertalan, Tom, Thiem, Thomas, Dietrich, Felix, Moon, Sung Joon, Laing, Carlo R., Kevrekidis, Ioannis G.

Learning emergent PDEs in a learned emergent space

arXiv.org Machine Learning, Dec-23-2020

We extract data-driven, intrinsic spatial coordinates from observations of the dynamics of large systems of coupled heterogeneous agents. These coordinates then serve as an emergent space in which to learn predictive models in the form of partial differential equations (PDEs) for the collective description of the coupled-agent system. They play the role of the independent spatial variables in this PDE (as opposed to the dependent, possibly also data-driven, state variables). This leads to an alternative description of the dynamics, local in these emergent coordinates, thus facilitating an alternative modeling path for complex coupled-agent systems. We illustrate this approach on a system where each agent is a limit cycle oscillator (a so-called Stuart-Landau oscillator); the agents are heterogeneous (they each have a different intrinsic frequency $\omega$) and are coupled through the ensemble average of their respective variables. After fast initial transients, we show that the collective dynamics on a slow manifold can be approximated through a learned model based on local "spatial" partial derivatives in the emergent coordinates. The model is then used for prediction in time, as well as to capture collective bifurcations when system parameters vary. The proposed approach thus integrates the automatic, data-driven extraction of emergent space coordinates parametrizing the agent dynamics, with machine-learning assisted identification of an "emergent PDE" description of the dynamics in this parametrization.
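The coupled-agent system described in this abstract can be illustrated with a short simulation. The sketch below is not the authors' code: it assumes one commonly used form of the mean-field coupled Stuart-Landau ensemble, dW_k/dt = (1 + i*omega_k) W_k - |W_k|^2 W_k + K (mean(W) - W_k), and the coupling strength, frequency spread, step size, and function names are all hypothetical choices made for illustration.

```python
import numpy as np


def stuart_landau_rhs(w, omega, coupling):
    """Time derivative for an ensemble of Stuart-Landau oscillators.

    Assumed form (not taken from the paper): each agent is coupled to the
    ensemble average of the complex states with strength `coupling`.
    """
    mean_field = w.mean()  # ensemble average of the complex states
    return (1.0 + 1j * omega) * w - np.abs(w) ** 2 * w + coupling * (mean_field - w)


def simulate(n_agents=64, coupling=1.2, dt=0.01, n_steps=2000, seed=0):
    """Integrate the ensemble with a classical fourth-order Runge-Kutta scheme."""
    rng = np.random.default_rng(seed)
    # Heterogeneity: every agent gets a different intrinsic frequency (hypothetical spread).
    omega = np.linspace(-0.5, 0.5, n_agents)
    # Small random complex initial conditions.
    w = 0.1 * (rng.standard_normal(n_agents) + 1j * rng.standard_normal(n_agents))
    for _ in range(n_steps):
        k1 = stuart_landau_rhs(w, omega, coupling)
        k2 = stuart_landau_rhs(w + 0.5 * dt * k1, omega, coupling)
        k3 = stuart_landau_rhs(w + 0.5 * dt * k2, omega, coupling)
        k4 = stuart_landau_rhs(w + dt * k3, omega, coupling)
        w = w + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return omega, w


if __name__ == "__main__":
    omega, w = simulate()
    print("mean amplitude:", np.abs(w).mean())
```

Snapshots of `w` collected over time from such a run would be the kind of agent-level observations from which emergent coordinates could then be extracted; that extraction step is not attempted here.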

emergent space, limit cycle, oscillator, (14 more...)


2012.12738

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.64)

Industry:

Government > Regional Government > North America Government > United States Government (0.68)
Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Journal of Artificial Intelligence Research, Dec-22-2020

An Evaluation of Communication Protocol Languages for Engineering Multiagent Systems

Chopra, Amit K (Lancaster University) | Christie V, Samuel H | Singh, Munindar P.

Communication protocols are central to engineering decentralized multiagent systems. Modern protocol languages are typically formal and address aspects of decentralization, such as asynchrony. However, modern languages differ in important ways in their basic abstractions and operational assumptions. This diversity makes a comparative evaluation of protocol languages a challenging task. We contribute a rich evaluation of diverse and modern protocol languages. Among the selected languages, Scribble is based on session types; Trace-C and Trace-F on trace expressions; HAPN on hierarchical state machines, and BSPL on information causality. Our contribution is four-fold. One, we contribute important criteria for evaluating protocol languages. Two, for each criterion, we compare the languages on the basis of whether they are able to specify elementary protocols that go to the heart of the criterion. Three, for each language, we map our findings to a canonical architecture style for multiagent systems, highlighting where the languages depart from the architecture. Four, we identify design principles for protocol languages as guidance for future research.

agent, projection, protocol, (15 more...)


doi: 10.1613/jair.1.12212

AI Access Foundation

12212


Country:

Europe > United Kingdom (0.14)
Europe > Italy > Marche > Ancona Province > Ancona (0.04)
South America > Brazil > São Paulo (0.04)

Genre: Research Report (0.34)

Industry: Banking & Finance (0.67)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

QVMix and QVMix-Max: Extending the Deep Quality-Value Family of Algorithms to Cooperative Multi-Agent Reinforcement Learning

Leroy, Pascal, Ernst, Damien, Geurts, Pierre, Louppe, Gilles, Pisane, Jonathan, Sabatelli, Matthia

This paper introduces four new algorithms that can be used for tackling multi-agent reinforcement learning (MARL) problems occurring in cooperative settings. All algorithms are based on the Deep Quality-Value (DQV) family of algorithms, a set of techniques that have proven to be successful when dealing with single-agent reinforcement learning problems (SARL). The key idea of DQV algorithms is to jointly learn an approximation of the state-value function $V$, alongside an approximation of the state-action value function $Q$. We follow this principle and generalise these algorithms by introducing two fully decentralised MARL algorithms (IQV and IQV-Max) and two algorithms that are based on the centralised training with decentralised execution training paradigm (QVMix and QVMix-Max). We compare our algorithms with state-of-the-art MARL techniques on the popular StarCraft Multi-Agent Challenge (SMAC) environment. We show competitive results when QVMix and QVMix-Max are compared to well-known MARL techniques such as QMIX and MAVEN and show that QVMix can even outperform them on some of the tested environments, being the algorithm which performs best overall. We hypothesise that this is due to the fact that QVMix suffers less from the overestimation bias of the $Q$ function.
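The key idea stated in this abstract, learning a state-value function V alongside a state-action value function Q, can be sketched in tabular form. This is a simplified illustration rather than the paper's method: the algorithms in the abstract use deep networks and multi-agent mixing, whereas the sketch assumes a single agent with discrete states and actions and regresses both tables toward the same temporal-difference target r + gamma * V(s'). The function name `dqv_update` and the hyperparameters are hypothetical.

```python
import numpy as np


def dqv_update(Q, V, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """One tabular DQV-style update (illustrative simplification).

    Both the state-value table V and the state-action table Q are moved
    toward the same target, which bootstraps from V rather than from max Q.
    """
    # Compute the target before touching V, in case s == s_next.
    target = r + (0.0 if done else gamma * V[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    V[s] += alpha * (target - V[s])
    return target


if __name__ == "__main__":
    Q = np.zeros((3, 2))  # 3 states, 2 actions
    V = np.zeros(3)
    dqv_update(Q, V, s=0, a=1, r=1.0, s_next=1, done=False)
    print(Q[0], V[0])
```

Because the target never takes a maximum over Q, this style of update is one plausible reading of why the abstract hypothesises reduced overestimation bias.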

agent, algorithm, qvmix-max, (12 more...)

2012.12062

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Belgium > Wallonia > Liège Province > Liège (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Kocarev, Ljupco, Koteska, Jasna

Digital me ontology and ethics

This paper addresses the ontology and ethics of an AI agent called digital me. We define digital me as an autonomous, decision-making, and learning agent that represents an individual and has a practically immortal life of its own. It is assumed that digital me is equipped with the big-five personality model, ensuring that it provides a model of some aspects of a strong AI: consciousness, free will, and intentionality. As computer-based personality judgments are more accurate than those made by humans, digital me can judge the personality of the individual it represents, other individuals' personalities, and other digital me-s. We describe seven ontological qualities of digital me: a) double-layer status of Digital Being versus digital me, b) digital me versus real me, c) mind-digital me and body-digital me, d) digital me versus doppelganger (shadow digital me), e) non-human time concept, f) social quality, g) practical immortality. We argue that with the advancement of AI sciences and technologies, there exist two digital me thresholds. The first threshold defines digital me as having some (rudimentary) form of consciousness, free will, and intentionality. The second threshold assumes that digital me is equipped with moral learning capabilities, implying that, in principle, digital me could develop its own ethics, which differs significantly from humans' understanding of ethics. Finally, we discuss the implications of digital me metaethics, normative and applied ethics, and the implementation of the Golden Rule in digital me-s, and we suggest two sets of normative principles for digital me: consequentialist and duty-based digital me principles.

agent, ethical agent, ethics, (15 more...)

2012.14325

Country:

North America > United States > New York (0.04)
Europe > North Macedonia > Skopje Statistical Region > Skopje Municipality > Skopje (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Games > Chess (1.00)
Health & Medicine (1.00)
Information Technology > Security & Privacy (0.93)
Law (0.93)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.62)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)

Distributed Q-Learning with State Tracking for Multi-agent Networked Control

Wang, Hang, Lin, Sen, Jafarkhani, Hamid, Zhang, Junshan

This paper studies distributed Q-learning for Linear Quadratic Regulator (LQR) in a multi-agent network. The existing results often assume that agents can observe the global system state, which may be infeasible in large-scale systems due to privacy concerns or communication constraints. In this work, we consider a setting with unknown system models and no centralized coordinator. We devise a state tracking (ST) based Q-learning algorithm to design optimal controllers for agents. Specifically, we assume that agents maintain local estimates of the global state based on their local information and communications with neighbors. At each step, every agent updates its local global state estimation, based on which it solves an approximate Q-factor locally through policy iteration. Assuming decaying injected excitation noise during the policy evaluation, we prove that the local estimation converges to the true global state, and establish the convergence of the proposed distributed ST-based Q-learning algorithm. The experimental studies corroborate our theoretical results by showing that our proposed method achieves comparable performance with the centralized case.

agent, controller, q-learning, (16 more...)

2012.12383

Country:

North America > United States > California > Orange County > Irvine (0.14)
North America > United States > Arizona > Maricopa County > Tempe (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Energy > Power Industry (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Mercuur, Rijk, Dignum, Virginia, Jonker, Catholijn M.

Modelling Human Routines: Conceptualising Social Practice Theory for Agent-Based Simulation

Our routines play an important role in a wide range of social challenges such as climate change, disease outbreaks and coordinating staff and patients in a hospital. To use agent-based simulations (ABS) to understand the role of routines in social challenges we need an agent framework that integrates routines. This paper provides the domain-independent Social Practice Agent (SoPrA) framework that satisfies requirements from the literature to simulate our routines. By choosing the appropriate concepts from the literature on agent theory, social psychology and social practice theory we ensure SoPrA correctly depicts current evidence on routines. By creating a consistent, modular and parsimonious framework suitable for multiple domains we enhance the usability of SoPrA. SoPrA provides ABS researchers with a conceptual, formal and computational framework to simulate routines and gain new insights into social systems.

agent, collective view, relation, (17 more...)

2012.11903

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Virginia (0.04)
North America > United States > New York (0.04)

Genre:

Overview (0.47)
Research Report (0.40)

Industry: Health & Medicine > Health Care Providers & Services (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Kouzehgar, Maryam, Meghjani, Malika, Bouffanais, Roland

Multi-Agent Reinforcement Learning for Dynamic Ocean Monitoring by a Swarm of Buoys

arXiv.org Artificial Intelligence, Dec-21-2020

The autonomous marine environmental monitoring problem traditionally encompasses an area coverage problem that can only be effectively carried out by a multi-robot system. In this paper, we focus on robotic swarms that are typically operated and controlled by means of simple swarming behaviors obtained from a subtle, yet ad hoc combination of bio-inspired strategies. We propose a novel and structured approach for area coverage using multi-agent reinforcement learning (MARL) which effectively deals with the non-stationarity of environmental features. Specifically, we propose two dynamic area coverage approaches: (1) swarm-based MARL, and (2) coverage-range-based MARL. The former is trained using the multi-agent deep deterministic policy gradient (MADDPG) approach, whereas a modified version of MADDPG is introduced for the latter, with a reward function that intrinsically leads to a collective behavior. Both methods are tested and validated on regions of different geometric shape but equal surface area (square vs. rectangle), yielding acceptable area coverage and benefiting from the structured learning in non-stationary environments. Both approaches are advantageous compared to a naïve swarming method. However, coverage-range-based MARL outperforms the swarm-based MARL with stronger convergence features in learning criteria and higher spreading of agents for area coverage.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

doi: 10.1109/IEEECONF38699.2020.9389128

2012.11641

Country:

Asia > Singapore > Central Region > Singapore (0.04)
North America > United States (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Zakershahrak, Mehrdad, Ghodratnama, Samira

Are We On The Same Page? Hierarchical Explanation Generation for Planning Tasks in Human-Robot Teaming using Reinforcement Learning

arXiv.org Artificial Intelligence, Dec-21-2020

Providing explanations is considered an imperative ability for an AI agent in a human-robot teaming framework. The right explanation provides the rationale behind an AI agent's decision making. However, to limit the cognitive demand placed on the human teammate when comprehending the provided explanations, prior works have focused on providing explanations in a specific order or intertwining the explanation generation with plan execution. These approaches, however, do not consider the degree of detail they share throughout the provided explanations. In this work, we argue that the explanations, especially the complex ones, should be abstracted to align with the level of detail the teammate desires, so as to keep the recipient's cognitive load manageable. The challenge here is to learn a hierarchical model of explanations and details the agent requires to yield the explanations as an objective. Moreover, the agent needs to follow a high-level plan in a task domain such that the agent can transfer learned teammate preferences to a scenario where lower-level control policies are different, while the high-level plan remains the same. Results confirmed our hypothesis that the process of understanding an explanation was a dynamic hierarchical process. The human preference that reflected this aspect corresponded exactly to creating and employing abstraction for knowledge assimilation hidden deeper in our cognitive process. We showed that hierarchical explanations achieved better task performance and behavior interpretability while reducing cognitive load. These results shed light on designing explainable agents utilizing reinforcement learning and planning across various domains.

agent, explanation, robot, (12 more...)

2012.11792

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Arizona > Maricopa County > Tempe (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Liu, Yuejiang, Yan, Qi, Alahi, Alexandre

Social NCE: Contrastive Learning of Socially-aware Motion Representations

arXiv.org Artificial Intelligence, Dec-21-2020

Learning socially-aware motion representations is at the core of recent advances in human trajectory forecasting and robot navigation in crowded spaces. Yet existing methods often struggle to generalize to challenging scenarios and even output unacceptable solutions (e.g., collisions). In this work, we propose to address this issue via contrastive learning. Concretely, we introduce a social contrastive loss that encourages the encoded motion representation to preserve sufficient information for distinguishing a positive future event from a set of negative ones. We explicitly draw these negative samples based on our domain knowledge about socially unfavorable scenarios in the multi-agent context. Experimental results show that the proposed method consistently boosts the performance of previous trajectory forecasting, behavioral cloning, and reinforcement learning algorithms in various settings. Our method makes few assumptions about neural architecture designs, and hence can be used as a generic way to incorporate negative data augmentation into motion representation learning.
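The idea of distinguishing one positive future event from a set of negatives can be illustrated with a generic InfoNCE-style loss. The sketch below is an assumption-laden stand-in, not the paper's social contrastive loss: it uses plain NumPy vectors in place of learned encodings, cosine similarity, and a hypothetical temperature, and it says nothing about how the socially unfavorable negatives are sampled.

```python
import numpy as np


def info_nce_loss(query, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style contrastive loss for a single query (illustrative).

    query:     (d,)   embedding of the observed motion history
    positive:  (d,)   embedding of the positive future event
    negatives: (k, d) embeddings of negative events
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    q, p, n = normalize(query), normalize(positive), normalize(negatives)
    pos_logit = (q @ p) / temperature   # similarity to the positive
    neg_logits = (n @ q) / temperature  # similarities to each negative
    logits = np.concatenate(([pos_logit], neg_logits))
    logits = logits - logits.max()      # shift for numerical stability
    # Negative log-probability of picking the positive among all candidates.
    return -(logits[0] - np.log(np.exp(logits).sum()))


if __name__ == "__main__":
    q = np.array([1.0, 0.0])
    pos = np.array([1.0, 0.1])
    negs = np.array([[-1.0, 0.0], [0.0, 1.0]])
    print(info_nce_loss(q, pos, negs))
```

The loss is small when the query embedding is close to the positive and far from the negatives, and grows when a negative is more similar to the query than the positive is.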

arxiv, international conference, learning, (15 more...)

2012.11717

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Transportation (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.89)