Goto

Collaborating Authors

 Agents


The Less Intelligent the Elements, the More Intelligent the Whole. Or, Possibly Not?

arXiv.org Artificial Intelligence

We dare to make use of a possible analogy between neurons in a brain and people in society, asking ourselves whether individual intelligence is necessary in order to collective wisdom to emerge and, most importantly, what sort of individual intelligence is conducive of greater collective wisdom. We review insights and findings from connectionism, agent-based modeling, group psychology, economics and physics, casting them in terms of changing structure of the system's Lyapunov function. Finally, we apply these insights to the sort and degrees of intelligence of preys and predators in the Lotka-Volterra model, explaining why certain individual understandings lead to co-existence of the two species whereas other usages of their individual intelligence cause global extinction.


Learning emergent PDEs in a learned emergent space

arXiv.org Machine Learning

We extract data-driven, intrinsic spatial coordinates from observations of the dynamics of large systems of coupled heterogeneous agents. These coordinates then serve as an emergent space in which to learn predictive models in the form of partial differential equations (PDEs) for the collective description of the coupled-agent system. They play the role of the independent spatial variables in this PDE (as opposed to the dependent, possibly also data-driven, state variables). This leads to an alternative description of the dynamics, local in these emergent coordinates, thus facilitating an alternative modeling path for complex coupled-agent systems. We illustrate this approach on a system where each agent is a limit cycle oscillator (a so-called Stuart-Landau oscillator); the agents are heterogeneous (they each have a different intrinsic frequency $\omega$) and are coupled through the ensemble average of their respective variables. After fast initial transients, we show that the collective dynamics on a slow manifold can be approximated through a learned model based on local "spatial" partial derivatives in the emergent coordinates. The model is then used for prediction in time, as well as to capture collective bifurcations when system parameters vary. The proposed approach thus integrates the automatic, data-driven extraction of emergent space coordinates parametrizing the agent dynamics, with machine-learning assisted identification of an "emergent PDE" description of the dynamics in this parametrization.


An Evaluation of Communication Protocol Languages for Engineering Multiagent Systems

Journal of Artificial Intelligence Research

Communication protocols are central to engineering decentralized multiagent systems. Modern protocol languages are typically formal and address aspects of decentralization, such as asynchrony. However, modern languages differ in important ways in their basic abstractions and operational assumptions. This diversity makes a comparative evaluation of protocol languages a challenging task. We contribute a rich evaluation of diverse and modern protocol languages. Among the selected languages, Scribble is based on session types; Trace-C and Trace-F on trace expressions; HAPN on hierarchical state machines, and BSPL on information causality. Our contribution is four-fold. One, we contribute important criteria for evaluating protocol languages. Two, for each criterion, we compare the languages on the basis of whether they are able to specify elementary protocols that go to the heart of the criterion. Three, for each language, we map our findings to a canonical architecture style for multiagent systems, highlighting where the languages depart from the architecture. Four, we identify design principles for protocol languages as guidance for future research.


QVMix and QVMix-Max: Extending the Deep Quality-Value Family of Algorithms to Cooperative Multi-Agent Reinforcement Learning

arXiv.org Artificial Intelligence

This paper introduces four new algorithms that can be used for tackling multi-agent reinforcement learning (MARL) problems occurring in cooperative settings. All algorithms are based on the Deep Quality-Value (DQV) family of algorithms, a set of techniques that have proven to be successful when dealing with single-agent reinforcement learning problems (SARL). The key idea of DQV algorithms is to jointly learn an approximation of the state-value function $V$, alongside an approximation of the state-action value function $Q$. We follow this principle and generalise these algorithms by introducing two fully decentralised MARL algorithms (IQV and IQV-Max) and two algorithms that are based on the centralised training with decentralised execution training paradigm (QVMix and QVMix-Max). We compare our algorithms with state-of-the-art MARL techniques on the popular StarCraft Multi-Agent Challenge (SMAC) environment. We show competitive results when QVMix and QVMix-Max are compared to well-known MARL techniques such as QMIX and MAVEN and show that QVMix can even outperform them on some of the tested environments, being the algorithm which performs best overall. We hypothesise that this is due to the fact that QVMix suffers less from the overestimation bias of the $Q$ function.


Digital me ontology and ethics

arXiv.org Artificial Intelligence

This paper addresses ontology and ethics of an AI agent called digital me. We define digital me as autonomous, decision-making, and learning agent, representing an individual and having practically immortal own life. It is assumed that digital me is equipped with the big-five personality model, ensuring that it provides a model of some aspects of a strong AI: consciousness, free will, and intentionality. As computer-based personality judgments are more accurate than those made by humans, digital me can judge the personality of the individual represented by the digital me, other individuals' personalities, and other digital me-s. We describe seven ontological qualities of digital me: a) double-layer status of Digital Being versus digital me, b) digital me versus real me, c) mind-digital me and body-digital me, d) digital me versus doppelganger (shadow digital me), e) non-human time concept, f) social quality, g) practical immortality. We argue that with the advancement of AI's sciences and technologies, there exist two digital me thresholds. The first threshold defines digital me having some (rudimentarily) form of consciousness, free will, and intentionality. The second threshold assumes that digital me is equipped with moral learning capabilities, implying that, in principle, digital me could develop their own ethics which significantly differs from human's understanding of ethics. Finally we discuss the implications of digital me metaethics, normative and applied ethics, the implementation of the Golden Rule in digital me-s, and we suggest two sets of normative principles for digital me: consequentialist and duty based digital me principles.


Distributed Q-Learning with State Tracking for Multi-agent Networked Control

arXiv.org Artificial Intelligence

This paper studies distributed Q-learning for Linear Quadratic Regulator (LQR) in a multi-agent network. The existing results often assume that agents can observe the global system state, which may be infeasible in large-scale systems due to privacy concerns or communication constraints. In this work, we consider a setting with unknown system models and no centralized coordinator. We devise a state tracking (ST) based Q-learning algorithm to design optimal controllers for agents. Specifically, we assume that agents maintain local estimates of the global state based on their local information and communications with neighbors. At each step, every agent updates its local global state estimation, based on which it solves an approximate Q-factor locally through policy iteration. Assuming decaying injected excitation noise during the policy evaluation, we prove that the local estimation converges to the true global state, and establish the convergence of the proposed distributed ST-based Q-learning algorithm. The experimental studies corroborate our theoretical results by showing that our proposed method achieves comparable performance with the centralized case.


Modelling Human Routines: Conceptualising Social Practice Theory for Agent-Based Simulation

arXiv.org Artificial Intelligence

Our routines play an important role in a wide range of social challenges such as climate change, disease outbreaks and coordinating staff and patients in a hospital. To use agent-based simulations (ABS) to understand the role of routines in social challenges we need an agent framework that integrates routines. This paper provides the domain-independent Social Practice Agent (SoPrA) framework that satisfies requirements from the literature to simulate our routines. By choosing the appropriate concepts from the literature on agent theory, social psychology and social practice theory we ensure SoPrA correctly depicts current evidence on routines. By creating a consistent, modular and parsimonious framework suitable for multiple domains we enhance the usability of SoPrA. SoPrA provides ABS researchers with a conceptual, formal and computational framework to simulate routines and gain new insights into social systems.


Multi-Agent Reinforcement Learning for Dynamic Ocean Monitoring by a Swarm of Buoys

arXiv.org Artificial Intelligence

Autonomous marine environmental monitoring problem traditionally encompasses an area coverage problem which can only be effectively carried out by a multi-robot system. In this paper, we focus on robotic swarms that are typically operated and controlled by means of simple swarming behaviors obtained from a subtle, yet ad hoc combination of bio-inspired strategies. We propose a novel and structured approach for area coverage using multi-agent reinforcement learning (MARL) which effectively deals with the non-stationarity of environmental features. Specifically, we propose two dynamic area coverage approaches: (1) swarm-based MARL, and (2) coverage-range-based MARL. The former is trained using the multi-agent deep deterministic policy gradient (MADDPG) approach whereas, a modified version of MADDPG is introduced for the latter with a reward function that intrinsically leads to a collective behavior. Both methods are tested and validated with different geometric shaped regions with equal surface area (square vs. rectangle) yielding acceptable area coverage, and benefiting from the structured learning in non-stationary environments. Both approaches are advantageous compared to a na\"{i}ve swarming method. However, coverage-range-based MARL outperforms the swarm-based MARL with stronger convergence features in learning criteria and higher spreading of agents for area coverage.


Are We On The Same Page? Hierarchical Explanation Generation for Planning Tasks in Human-Robot Teaming using Reinforcement Learning

arXiv.org Artificial Intelligence

Providing explanations is considered an imperative ability for an AI agent in a human-robot teaming framework. The right explanation provides the rationale behind an AI agent's decision making. However, to maintain the human teammate's cognitive demand to comprehend the provided explanations, prior works have focused on providing explanations in a specific order or intertwining the explanation generation with plan execution. These approaches, however, do not consider the degree of details they share throughout the provided explanations. In this work, we argue that the explanations, especially the complex ones, should be abstracted to be aligned with the level of details the teammate desires to maintain the cognitive load of the recipient. The challenge here is to learn a hierarchical model of explanations and details the agent requires to yield the explanations as an objective. Moreover, the agent needs to follow a high-level plan in a task domain such that the agent can transfer learned teammate preferences to a scenario where lower-level control policies are different, while the high-level plan remains the same. Results confirmed our hypothesis that the process of understanding an explanation was a dynamic hierarchical process. The human preference that reflected this aspect corresponded exactly to creating and employing abstraction for knowledge assimilation hidden deeper in our cognitive process. We showed that hierarchical explanations achieved better task performance and behavior interpretability while reduced cognitive load. These results shed light on designing explainable agents utilizing reinforcement learning and planning across various domains.


Social NCE: Contrastive Learning of Socially-aware Motion Representations

arXiv.org Artificial Intelligence

Learning socially-aware motion representations is at the core of recent advances in human trajectory forecasting and robot navigation in crowded spaces. Yet existing methods often struggle to generalize to challenging scenarios and even output unacceptable solutions (e.g., collisions). In this work, we propose to address this issue via contrastive learning. Concretely, we introduce a social contrastive loss that encourages the encoded motion representation to preserve sufficient information for distinguishing a positive future event from a set of negative ones. We explicitly draw these negative samples based on our domain knowledge about socially unfavorable scenarios in the multi-agent context. Experimental results show that the proposed method consistently boosts the performance of previous trajectory forecasting, behavioral cloning, and reinforcement learning algorithms in various settings. Our method makes little assumptions about neural architecture designs, and hence can be used as a generic way to incorporate negative data augmentation into motion representation learning.