Goto

Collaborating Authors

 Agents


Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach

arXiv.org Machine Learning

In recent years, multi-access edge computing (MEC) is a key enabler for handling the massive expansion of Internet of Things (IoT) applications and services. However, energy consumption of a MEC network depends on volatile tasks that induces risk for energy demand estimations. As an energy supplier, a microgrid can facilitate seamless energy supply. However, the risk associated with energy supply is also increased due to unpredictable energy generation from renewable and non-renewable sources. Especially, the risk of energy shortfall is involved with uncertainties in both energy consumption and generation. In this paper, we study a risk-aware energy scheduling problem for a microgrid-powered MEC network. First, we formulate an optimization problem considering the conditional value-at-risk (CVaR) measurement for both energy consumption and generation, where the objective is to minimize the loss of energy shortfall of the MEC networks and we show this problem is an NP-hard problem. Second, we analyze our formulated problem using a multi-agent stochastic game that ensures the joint policy Nash equilibrium, and show the convergence of the proposed model. Third, we derive the solution by applying a multi-agent deep reinforcement learning (MADRL)-based asynchronous advantage actor-critic (A3C) algorithm with shared neural networks. This method mitigates the curse of dimensionality of the state space and chooses the best policy among the agents for the proposed problem. Finally, the experimental results establish a significant performance gain by considering CVaR for high accuracy energy scheduling of the proposed model than both the single and random agent models.


Semantic Web Environments for Multi-Agent Systems: Enabling agents to use Web of Things via semantic web

arXiv.org Artificial Intelligence

The Web is ubiquitous, increasingly populated with interconnected data, services, people, and objects. Semantic web technologies (SWT) promote uniformity of data formats, as well as modularization and reuse of specifications (e.g., ontologies), by allowing them to include and refer to information provided by other ontologies. In such a context, multi-agent system (MAS) technologies are the right abstraction for developing decentralized and open Web applications in which agents discover, reason and act on Web resources and cooperate with each other and with people. The aim of the project is to propose an approach to transform "Agent and artifact (A&A) meta-model" into a Web-readable format with ontologies in line with semantic web formats and to reuse already existing ontologies in order to provide uniform access for agents to things.


`Why not give this work to them?' Explaining AI-Moderated Task-Allocation Outcomes using Negotiation Trees

arXiv.org Artificial Intelligence

The problem of multi-agent task allocation arises in a variety of scenarios involving human teams. In many such settings, human teammates may act with selfish motives and try to minimize their cost metrics. In the absence of (1) complete knowledge about the reward of other agents and (2) the team's overall cost associated with a particular allocation outcome, distributed algorithms can only arrive at sub-optimal solutions within a reasonable amount of time. To address these challenges, we introduce the notion of an AI Task Allocator (AITA) that, with complete knowledge, comes up with fair allocations that strike a balance between the individual human costs and the team's performance cost. To ensure that AITA is explicable to the humans, we allow each human agent to question AITA's proposed allocation with counterfactual allocations. In response, we design AITA to provide a replay negotiation tree that acts as an explanation showing why the counterfactual allocation, with the correct costs, will eventually result in a sub-optimal allocation. This explanation also updates a human's incomplete knowledge about their teammate's and the team's actual costs. We then investigate whether humans are (1) able to understand the explanations provided and (2) convinced by it using human factor studies. Finally, we show the effect of various kinds of incompleteness on the length of explanations. We conclude that underestimation of other's costs often leads to the need for explanations and in turn, longer explanations on average.


AI At The Edge: Creating Coordinated Autonomy

#artificialintelligence

Today organizations have to deal with so many emergent behaviors that the notion of central control as the only coping mechanism seems to be receding as a dominant management model. Freedom must be doled out further from the centrist idea by creating goals, constraints, boundaries and allowable edge behaviors. Someday software and hardware agents will negotiate their contribution to business outcomes on their own, but until then organizations will have to prepare themselves by managing coordinated autonomy. Edge computing is a form of distributed computing which brings computation and data storage closer to the location where it is needed, to improve response times and provide better actions. Now, AI on Edge, can offer a whole lot of new possibilities.


Multi-Agent Meta-Reinforcement Learning for Self-Powered and Sustainable Edge Computing Systems

arXiv.org Machine Learning

The stringent requirements of mobile edge computing (MEC) applications and functions fathom the high capacity and dense deployment of MEC hosts to the upcoming wireless networks. However, operating such high capacity MEC hosts can significantly increase energy consumption. Thus, a BS unit can act as a self-powered BS. In this paper, an effective energy dispatch mechanism for self-powered wireless networks with edge computing capabilities is studied. First, a two-stage linear stochastic programming problem is formulated with the goal of minimizing the total energy consumption cost of the system while fulfilling the energy demand. Second, a semi-distributed data-driven solution is proposed by developing a novel multi-agent meta-reinforcement learning (MAMRL) framework to solve the formulated problem. In particular, each BS plays the role of a local agent that explores a Markovian behavior for both energy consumption and generation while each BS transfers time-varying features to a meta-agent. Sequentially, the meta-agent optimizes (i.e., exploits) the energy dispatch decision by accepting only the observations from each local agent with its own state information. Meanwhile, each BS agent estimates its own energy dispatch policy by applying the learned parameters from meta-agent. Finally, the proposed MAMRL framework is benchmarked by analyzing deterministic, asymmetric, and stochastic environments in terms of non-renewable energy usages, energy cost, and accuracy. Experimental results show that the proposed MAMRL model can reduce up to 11% non-renewable energy usage and by 22.4% the energy cost (with 95.8% prediction accuracy), compared to other baseline methods.


From Poincar\'e Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization

arXiv.org Machine Learning

In this paper we investigate the Follow the Regularized Leader dynamics in sequential imperfect information games (IIG). We generalize existing results of Poincar\'e recurrence from normal-form games to zero-sum two-player imperfect information games and other sequential game settings. We then investigate how adapting the reward (by adding a regularization term) of the game can give strong convergence guarantees in monotone games. We continue by showing how this reward adaptation technique can be leveraged to build algorithms that converge exactly to the Nash equilibrium. Finally, we show how these insights can be directly used to build state-of-the-art model-free algorithms for zero-sum two-player Imperfect Information Games (IIG).


Stochastic Regret Minimization in Extensive-Form Games

arXiv.org Artificial Intelligence

Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-the-art algorithm for solving sequential games that are too large for full tree traversals. It works by using gradient estimates that can be computed via sampling. However, stochastic methods for sequential games have not been investigated extensively beyond MCCFR. In this paper we develop a new framework for developing stochastic regret minimization methods. This framework allows us to use any regret-minimization algorithm, coupled with any gradient estimator. The MCCFR algorithm can be analyzed as a special case of our framework, and this analysis leads to significantly-stronger theoretical on convergence, while simultaneously yielding a simplified proof. Our framework allows us to instantiate several new stochastic methods for solving sequential games. We show extensive experiments on three games, where some variants of our methods outperform MCCFR.


Using AI for Mitigating the Impact of Network Delay in Cloud-based Intelligent Traffic Signal Control

arXiv.org Artificial Intelligence

The recent advancements in cloud services, Internet of Things (IoT) and Cellular networks have made cloud computing an attractive option for intelligent traffic signal control (ITSC). Such a method significantly reduces the cost of cables, installation, number of devices used, and maintenance. ITSC systems based on cloud computing lower the cost of the ITSC systems and make it possible to scale the system by utilizing the existing powerful cloud platforms. While such systems have significant potential, one of the critical problems that should be addressed is the network delay. It is well known that network delay in message propagation is hard to prevent, which could potentially degrade the performance of the system or even create safety issues for vehicles at intersections. In this paper, we introduce a new traffic signal control algorithm based on reinforcement learning, which performs well even under severe network delay. The framework introduced in this paper can be helpful for all agent-based systems using remote computing resources where network delay could be a critical concern. Extensive simulation results obtained for different scenarios show the viability of the designed algorithm to cope with network delay.


Sequential Cooperative Bayesian Inference

arXiv.org Machine Learning

Learning often occurs sequentially, as opposed to in batch, and from data provided by other agents, as opposed to from a fixed random sampling process. The canonical example of sequential learning from an agent occurs in educational contexts where the other agent is a teacher whose goal is to help the learner. However, instances appear across a wide range of contexts including informal learning, language, and robotics. In contrast with typical contexts considered in machine learning, it is reasonable to expect the cooperative agent to adapt their sampling process after each trial, consistent with the goal of helping the learner learn more quickly. It is also reasonable to expect that learners, in dealing with such cooperative agents, would know the other agent intends to cooperate and incorporate that knowledge when updating their beliefs.


R-MADDPG for Partially Observable Environments and Limited Communication

arXiv.org Artificial Intelligence

There are several real-world tasks that would benefit from applying multiagent reinforcement learning (MARL) algorithms, including the coordination among self-driving cars. The real world has challenging conditions for multiagent learning systems, such as its partial observable and nonstationary nature. Moreover, if agents must share a limited resource (e.g. network bandwidth) they must all learn how to coordinate resource use. This paper introduces a deep recurrent multiagent actor-critic framework (R-MADDPG) for handling multiagent coordination under partial observable set-tings and limited communication. We investigate recurrency effects on performance and communication use of a team of agents. We demonstrate that the resulting framework learns time dependencies for sharing missing observations, handling resource limitations, and developing different communication patterns among agents.