
 Agents


Is Swarm AI the answer to fears over Artificial Intelligence and jobs?

#artificialintelligence

From Garry Kasparov to Elon Musk, the list of those who say AI should be applied so that it augments us rather than competes with us is long. Yet the supply of reports warning that AI threatens jobs seems endless. On the other hand, a new report looking at a technology called Swarm AI may provide a much more benign fix. Speaking at a recent conference, chess legend Garry Kasparov said that the public perception of AI has been overly influenced by Hollywood: the reality is far more positive. Kasparov's take on AI is a reason for optimism. Swarms can be intelligent; there is no great insight here. Those who study emergence understand this: from ant colonies to cities, great things can be achieved by simpler entities working together.


Design, Benchmarking and Explainability Analysis of a Game-Theoretic Framework towards Energy Efficiency in Smart Infrastructure

arXiv.org Machine Learning

In this paper, we propose a gamification approach as a novel framework for smart building infrastructure with the goal of motivating human occupants to reconsider personal energy usage and to have positive effects on their environment. Human interaction in the context of cyber-physical systems is a core component and consideration in the implementation of any smart building technology. Research has shown that the adoption of human-centric building services and amenities leads to improvements in the operational efficiency of these cyber-physical systems directed towards controlling building energy usage. We introduce a strategy in the form of a game-theoretic framework that incorporates humans-in-the-loop modeling by creating an interface to allow building managers to interact with occupants and potentially incentivize energy-efficient behavior. Prior work on game-theoretic analysis typically relies on the assumption that the utility function of each individual agent is known a priori. Instead, we propose a novel utility learning framework for benchmarking that employs robust estimations of occupant actions towards energy efficiency. To improve forecasting performance, we extend the utility learning scheme by leveraging deep bi-directional recurrent neural networks. Using the proposed methods on data gathered from occupant actions for resources such as room lighting, we forecast patterns of energy resource usage to demonstrate the prediction performance of the methods. The results of our study show that we can achieve a highly accurate representation of the ground truth for occupant energy resource usage. We also demonstrate the explainable nature of human decision making towards energy usage inherent in the dataset using graphical lasso and Granger causality algorithms. Finally, we open source the de-identified, high-dimensional data pertaining to the energy game-theoretic framework.


Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games

arXiv.org Machine Learning

We study discrete-time mean-field Markov games with infinite numbers of agents where each agent aims to minimize its ergodic cost. We consider the setting where the agents have identical linear state transitions and quadratic cost functions, while the aggregated effect of the agents is captured by the population mean of their states, namely, the mean-field state. For such a game, based on the Nash certainty equivalence principle, we provide sufficient conditions for the existence and uniqueness of its Nash equilibrium. Moreover, to find the Nash equilibrium, we propose a mean-field actor-critic algorithm with linear function approximation, which does not require knowing the model of dynamics. Specifically, at each iteration of our algorithm, we use the single-agent actor-critic algorithm to approximately obtain the optimal policy of each agent given the current mean-field state, and then update the mean-field state. In particular, we prove that our algorithm converges to the Nash equilibrium at a linear rate. To the best of our knowledge, this is the first success of applying model-free reinforcement learning with function approximation to discrete-time mean-field Markov games with provable non-asymptotic global convergence guarantees.
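The alternation described in this abstract (solve the single-agent problem for a fixed mean-field state, then update the mean-field state) can be sketched for a scalar linear-quadratic game. This is a minimal illustration, not the paper's algorithm: the model-free actor-critic step is replaced by an exact Riccati solve, which requires knowing the dynamics, and the way the mean field enters the dynamics (an additive drift `a_bar * mu + d`) as well as every function and parameter name are assumptions made for the example.

```python
def solve_riccati(a, b, q, r, iters=500):
    """Fixed-point iteration for the scalar discrete-time Riccati equation."""
    p = q
    for _ in range(iters):
        p = q + a * a * p * r / (r + b * b * p)
    return p


def best_response(a, b, q, r, c):
    """Optimal affine policy u = -k*x + g for x' = a*x + b*u + c + noise
    with per-step cost q*x^2 + r*u^2 (average-cost criterion).

    Stand-in for the single-agent actor-critic step; exact, model-based.
    """
    p = solve_riccati(a, b, q, r)
    denom = r + b * b * p
    s = a * r * p * c / (denom - a * r)   # linear term of the value function
    k = a * b * p / denom                 # feedback gain
    g = -b * (p * c + s) / denom          # constant offset induced by the drift
    return k, g


def stationary_mean(a, b, c, k, g):
    """Mean m of the closed loop, solving m = (a - b*k)*m + b*g + c."""
    return (b * g + c) / (1.0 - (a - b * k))


def mean_field_equilibrium(a, b, q, r, a_bar, d, iters=100):
    """Alternate between best response and mean-field update.

    a_bar and d are hypothetical coupling parameters: the drift seen by
    each agent is assumed to be a_bar * mu + d.
    """
    mu = 0.0
    for _ in range(iters):
        c = a_bar * mu + d                      # drift for the current mean field
        k, g = best_response(a, b, q, r, c)     # policy given the mean field
        mu = stationary_mean(a, b, c, k, g)     # mean field induced by the policy
    return mu, k, g


if __name__ == "__main__":
    mu, k, g = mean_field_equilibrium(a=0.9, b=1.0, q=1.0, r=1.0, a_bar=0.5, d=1.0)
    print(mu, k, g)
```

In this sketch the loop settles when the mean field that the policy was computed for equals the mean field the policy produces, which is the consistency condition behind the Nash certainty equivalence principle mentioned above.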


MAVEN: Multi-Agent Variational Exploration

arXiv.org Machine Learning

Centralised training with decentralised execution is an important setting for cooperative deep multi-agent reinforcement learning due to communication constraints during execution and computational tractability in training. In this paper, we analyse value-based methods that are known to have superior performance in complex environments [43]. We specifically focus on QMIX [40], the current state-of-the-art in this domain. We show that the representational constraints on the joint action-values introduced by QMIX and similar methods lead to provably poor exploration and suboptimality. Furthermore, we propose a novel approach called MAVEN that hybridises value and policy-based methods by introducing a latent space for hierarchical control. The value-based agents condition their behaviour on the shared latent variable controlled by a hierarchical policy. This allows MAVEN to achieve committed, temporally extended exploration, which is key to solving complex multi-agent tasks. Our experimental results show that MAVEN achieves significant performance improvements on the challenging SMAC domain [43].
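The representational constraint mentioned in this abstract is that QMIX keeps the joint action-value monotonic in each agent's individual value. A minimal sketch of what that implies, using a linear mixer with fixed non-negative weights as a simplified stand-in (QMIX itself uses a state-conditioned mixing network; the function names and the toy Q-tables here are hypothetical):

```python
from itertools import product


def monotonic_mix(agent_qs, weights, bias=0.0):
    """Combine per-agent Q-values using non-negative weights.

    Taking abs() of each weight keeps the mixed value monotonically
    non-decreasing in every agent's Q-value.
    """
    return sum(abs(w) * q for w, q in zip(weights, agent_qs)) + bias


def centralised_greedy(q_tables, weights):
    """Brute-force argmax of the mixed value over all joint actions."""
    joint_actions = product(*[range(len(q)) for q in q_tables])
    return max(
        joint_actions,
        key=lambda joint: monotonic_mix(
            [q[a] for q, a in zip(q_tables, joint)], weights
        ),
    )


def decentralised_greedy(q_tables):
    """Each agent picks its own argmax without looking at the others."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in q_tables)


if __name__ == "__main__":
    q_tables = [[0.1, 0.9, 0.3], [0.5, 0.2, 0.7]]  # toy per-agent Q-values
    weights = [-1.5, 0.4]                          # signs are removed by abs()
    print(centralised_greedy(q_tables, weights))
    print(decentralised_greedy(q_tables))
```

With strictly positive effective weights the two greedy selections agree, which is what makes decentralised execution possible. The price is that joint value functions which are not monotonic in this sense cannot be represented by such a mixer.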


Learning from My Partner's Actions: Roles in Decentralized Robot Teams

arXiv.org Artificial Intelligence

When teams of robots collaborate to complete a task, communication is often necessary. Like humans, robot teammates should implicitly communicate through their actions: but interpreting our partner's actions is typically difficult, since a given action may have many different underlying reasons. Here we propose an alternate approach: instead of not being able to infer whether an action is due to exploration, exploitation, or communication, we define separate roles for each agent. Because each role defines a distinct reason for acting (e.g., only exploit, only communicate), teammates now correctly interpret the meaning behind their partner's actions. Our results suggest that leveraging and alternating roles leads to performance comparable to teams that explicitly exchange messages.


Explainable AI for Intelligence Augmentation in Multi-Domain Operations

arXiv.org Artificial Intelligence

Central to the concept of multi-domain operations (MDO) is the utilization of an intelligence, surveillance, and reconnaissance (ISR) network consisting of overlapping systems of remote and autonomous sensors, and human intelligence, distributed among multiple partners. Realising this concept requires advancement in both artificial intelligence (AI) for improved distributed data analytics and intelligence augmentation (IA) for improved human-machine cognition. The contribution of this paper is threefold: (1) we map the coalition situational understanding (CSU) concept to MDO ISR requirements, paying particular attention to the need for assured and explainable AI to allow robust human-machine decision-making where assets are distributed among multiple partners; (2) we present illustrative vignettes for AI and IA in MDO ISR, including human-machine teaming, dense urban terrain analysis, and enhanced asset interoperability; (3) we appraise the state-of-the-art in explainable AI in relation to the vignettes with a focus on human-machine collaboration to achieve more rapid and agile coalition decision-making. The union of these three elements is intended to show the potential value of a CSU approach in the context of MDO ISR, grounded in three distinct use cases, highlighting how the need for explainability in the multi-partner coalition setting is key. Introduction: Multi-domain operations (MDO) require the capacity, capability, and endurance to operate across multiple domains, from dense urban terrain to space and cyberspace, in contested environments against near-peer adversaries (U.S. Army 2018).


Chat: Humans v Bots Or A Blend Of Both?

#artificialintelligence

This has been a repeated news item in the business media for a few years now. As automated systems and Artificial Intelligence (AI) get better and better, customer service agents will watch their jobs disappear, stolen by robots and software agents. The reality is far more complex because there are many reasons why a customer gets in touch with customer service. It could be a very simple requirement, like a password reset, or an application for a new mortgage that will require a detailed conversation with a great deal of personal information and documentation. There is no single type of customer service interaction, and therefore one of the initial challenges that companies have found is deciding when an interaction can be automated and when a human should handle the interaction.


Do We Trust Artificial Intelligence Agents to Mediate Conflict? Not Entirely - Express Computer

#artificialintelligence

We may listen to facts from Siri or Alexa, or directions from Google Maps or Waze, but would we let a virtual agent enabled by artificial intelligence help mediate conflict among team members? A new study says not just yet. Researchers from the University of Southern California (USC) and the University of Denver created a simulation in which a three-person team was supported by a virtual agent avatar on screen in a mission that was designed to ensure failure and elicit conflict. The study was designed to look at virtual agents as potential mediators to improve team collaboration during conflict mediation. But in the heat of the moment, will we listen to virtual agents?


Multi-agent Inverse Reinforcement Learning for Certain General-sum Stochastic Games

Journal of Artificial Intelligence Research

This paper addresses the problem of multi-agent inverse reinforcement learning (MIRL) in a two-player general-sum stochastic game framework. Five variants of MIRL are considered: uCS-MIRL, advE-MIRL, cooE-MIRL, uCE-MIRL, and uNE-MIRL, each distinguished by its solution concept. Problem uCS-MIRL is a cooperative game in which the agents employ cooperative strategies that aim to maximize the total game value. In problem uCE-MIRL, agents are assumed to follow strategies that constitute a correlated equilibrium while maximizing total game value. Problem uNE-MIRL is similar to uCE-MIRL in total game value maximization, but it is assumed that the agents are playing a Nash equilibrium. Problems advE-MIRL and cooE-MIRL assume agents are playing an adversarial equilibrium and a coordination equilibrium, respectively. We propose novel approaches to address these five problems under the assumption that the game observer either knows or is able to accurately estimate the policies and solution concepts for players. For uCS-MIRL, we first develop a characteristic set of solutions ensuring that the observed bi-policy is a uCS and then apply a Bayesian inverse learning method. For uCE-MIRL, we develop a linear programming problem subject to constraints that define necessary and sufficient conditions for the observed policies to be correlated equilibria. The objective is to choose a solution that not only minimizes the total game value difference between the observed bi-policy and a local uCS, but also maximizes the scale of the solution. We apply a similar treatment to the problem of uNE-MIRL. The remaining two problems can be solved efficiently by taking advantage of solution uniqueness and setting up a convex optimization problem. Results are validated on various benchmark grid-world games.
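The constraints that define a correlated equilibrium can be illustrated on a one-shot two-player matrix game, which is a deliberate simplification of the stochastic-game setting in this abstract. The sketch below checks whether a joint distribution over action pairs satisfies them; the function name and the example game (a standard "Chicken" payoff table) are chosen for illustration and do not come from the paper.

```python
def is_correlated_equilibrium(payoff_row, payoff_col, dist, tol=1e-9):
    """Check the correlated-equilibrium inequalities for a bimatrix game.

    payoff_row[i][j], payoff_col[i][j]: payoffs when row plays i, column plays j.
    dist[i][j]: probability that the mediator recommends the pair (i, j).
    """
    n_rows, n_cols = len(dist), len(dist[0])

    # Row player: following recommendation i must be at least as good as
    # switching to any other action k, weighted by dist[i][j].
    for i in range(n_rows):
        for k in range(n_rows):
            gain = sum(
                dist[i][j] * (payoff_row[k][j] - payoff_row[i][j])
                for j in range(n_cols)
            )
            if gain > tol:
                return False

    # Column player: same condition for recommendation j and deviation k.
    for j in range(n_cols):
        for k in range(n_cols):
            gain = sum(
                dist[i][j] * (payoff_col[i][k] - payoff_col[i][j])
                for i in range(n_rows)
            )
            if gain > tol:
                return False
    return True


if __name__ == "__main__":
    # Actions: 0 = dare, 1 = chicken.
    payoff_row = [[0, 7], [2, 6]]
    payoff_col = [[0, 2], [7, 6]]
    third = 1.0 / 3.0
    print(is_correlated_equilibrium(payoff_row, payoff_col, [[0.0, third], [third, third]]))
    print(is_correlated_equilibrium(payoff_row, payoff_col, [[0.25, 0.25], [0.25, 0.25]]))
```

For a fixed, observed distribution each inequality is linear in the payoff entries, which is why conditions of this kind can appear as constraints in a linear program over unknown rewards.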


SafeCritic: Collision-Aware Trajectory Prediction

arXiv.org Machine Learning

Navigating complex urban environments safely is key to realizing fully autonomous systems. Predicting future locations of vulnerable road users, such as pedestrians and cyclists, has thus received a lot of attention in recent years. While previous works have addressed modeling interactions with the static (obstacles) and dynamic (humans) environment agents, we address an important gap in trajectory prediction. We propose SafeCritic, a model that synergizes generative adversarial networks for generating multiple "real" trajectories with reinforcement learning to generate "safe" trajectories. The Discriminator evaluates the generated candidates on whether they are consistent with the observed inputs. The Critic network is environmentally aware to prune trajectories that are in collision or are in violation of the environment. The auto-encoding loss stabilizes training and prevents mode-collapse. We demonstrate results on two large-scale data sets with a considerable improvement over the state of the art. We also show that the Critic is able to classify the safety of trajectories.
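The pruning role this abstract assigns to the Critic can be pictured with a hand-written geometric check. This is a sketch under strong assumptions, not the SafeCritic model: the learned Critic network is replaced by a distance test against circular obstacles, only waypoints are tested (not the segments between them), and all names and the obstacle format `(x, y, radius)` are hypothetical.

```python
import math


def collides(trajectory, obstacles, margin=0.0):
    """Return True if any waypoint lies within an obstacle (plus margin).

    trajectory: list of (x, y) waypoints.
    obstacles: list of (x, y, radius) circles.
    """
    return any(
        math.hypot(x - ox, y - oy) <= radius + margin
        for (x, y) in trajectory
        for (ox, oy, radius) in obstacles
    )


def prune_unsafe(candidates, obstacles, margin=0.0):
    """Keep only the candidate trajectories that pass the collision check."""
    return [t for t in candidates if not collides(t, obstacles, margin)]


if __name__ == "__main__":
    obstacles = [(1.0, 0.0, 0.5)]
    straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]   # passes through the obstacle
    detour = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]     # goes around it
    print(prune_unsafe([straight, detour], obstacles))
```

A generator that proposes several candidate trajectories could be followed by a filter of this shape; the `margin` parameter is a stand-in for whatever safety buffer an application requires.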