Collaborating Authors

Applications of Multi-Agent Reinforcement Learning in Future Internet: A Comprehensive Survey Artificial Intelligence

Future Internet involves several emerging technologies such as 5G and beyond 5G networks, vehicular networks, unmanned aerial vehicle (UAV) networks, and Internet of Things (IoTs). Moreover, future Internet becomes heterogeneous and decentralized with a large number of involved network entities. Each entity may need to make its local decision to improve the network performance under dynamic and uncertain network environments. Standard learning algorithms such as single-agent Reinforcement Learning (RL) or Deep Reinforcement Learning (DRL) have been recently used to enable each network entity as an agent to learn an optimal decision-making policy adaptively through interacting with the unknown environments. However, such an algorithm fails to model the cooperations or competitions among network entities, and simply treats other entities as a part of the environment that may result in the non-stationarity issue. Multi-agent Reinforcement Learning (MARL) allows each network entity to learn its optimal policy by observing not only the environments, but also other entities' policies. As a result, MARL can significantly improve the learning efficiency of the network entities, and it has been recently used to solve various issues in the emerging networks. In this paper, we thus review the applications of MARL in the emerging networks. In particular, we provide a tutorial of MARL and a comprehensive survey of applications of MARL in next generation Internet. In particular, we first introduce single-agent RL and MARL. Then, we review a number of applications of MARL to solve emerging issues in future Internet. The issues consist of network access, transmit power control, computation offloading, content caching, packet routing, trajectory design for UAV-aided networks, and network security issues.

When Deep Reinforcement Learning Meets Federated Learning: Intelligent Multi-Timescale Resource Management for Multi-access Edge Computing in 5G Ultra Dense Network Artificial Intelligence

Ultra-dense edge computing (UDEC) has great potential, especially in the 5G era, but it still faces challenges in its current solutions, such as the lack of: i) efficient utilization of multiple 5G resources (e.g., computation, communication, storage and service resources); ii) low overhead offloading decision making and resource allocation strategies; and iii) privacy and security protection schemes. Thus, we first propose an intelligent ultra-dense edge computing (I-UDEC) framework, which integrates blockchain and Artificial Intelligence (AI) into 5G ultra-dense edge computing networks. First, we show the architecture of the framework. Then, in order to achieve real-time and low overhead computation offloading decisions and resource allocation strategies, we design a novel two-timescale deep reinforcement learning (\textit{2Ts-DRL}) approach, consisting of a fast-timescale and a slow-timescale learning process, respectively. The primary objective is to minimize the total offloading delay and network resource usage by jointly optimizing computation offloading, resource allocation and service caching placement. We also leverage federated learning (FL) to train the \textit{2Ts-DRL} model in a distributed manner, aiming to protect the edge devices' data privacy. Simulation results corroborate the effectiveness of both the \textit{2Ts-DRL} and FL in the I-UDEC framework and prove that our proposed algorithm can reduce task execution time up to 31.87%.

A Review on Computational Intelligence Techniques in Cloud and Edge Computing Artificial Intelligence

Cloud computing (CC) is a centralized computing paradigm that accumulates resources centrally and provides these resources to users through Internet. Although CC holds a large number of resources, it may not be acceptable by real-time mobile applications, as it is usually far away from users geographically. On the other hand, edge computing (EC), which distributes resources to the network edge, enjoys increasing popularity in the applications with low-latency and high-reliability requirements. EC provides resources in a decentralized manner, which can respond to users' requirements faster than the normal CC, but with limited computing capacities. As both CC and EC are resource-sensitive, several big issues arise, such as how to conduct job scheduling, resource allocation, and task offloading, which significantly influence the performance of the whole system. To tackle these issues, many optimization problems have been formulated. These optimization problems usually have complex properties, such as non-convexity and NP-hardness, which may not be addressed by the traditional convex optimization-based solutions. Computational intelligence (CI), consisting of a set of nature-inspired computational approaches, recently exhibits great potential in addressing these optimization problems in CC and EC. This paper provides an overview of research problems in CC and EC and recent progresses in addressing them with the help of CI techniques. Informative discussions and future research trends are also presented, with the aim of offering insights to the readers and motivating new research directions.

Deep Reinforcement Learning for Stochastic Computation Offloading in Digital Twin Networks Artificial Intelligence

The rapid development of Industrial Internet of Things (IIoT) requires industrial production towards digitalization to improve network efficiency. Digital Twin is a promising technology to empower the digital transformation of IIoT by creating virtual models of physical objects. However, the provision of network efficiency in IIoT is very challenging due to resource-constrained devices, stochastic tasks, and resources heterogeneity. Distributed resources in IIoT networks can be efficiently exploited through computation offloading to reduce energy consumption while enhancing data processing efficiency. In this paper, we first propose a new paradigm Digital Twin Networks (DTN) to build network topology and the stochastic task arrival model in IIoT systems. Then, we formulate the stochastic computation offloading and resource allocation problem to minimize the long-term energy efficiency. As the formulated problem is a stochastic programming problem, we leverage Lyapunov optimization technique to transform the original problem into a deterministic per-time slot problem. Finally, we present Asynchronous Actor-Critic (AAC) algorithm to find the optimal stochastic computation offloading policy. Illustrative results demonstrate that our proposed scheme is able to significantly outperforms the benchmarks.

Decentralized Computation Offloading for Multi-User Mobile Edge Computing: A Deep Reinforcement Learning Approach Machine Learning

Mobile edge computing (MEC) emerges recently as a promising solution to relieve resource-limited mobile devices from computation-intensive tasks, which enables devices to offload workloads to nearby MEC servers and improve the quality of computation experience. Nevertheless, by considering an MEC system consisting of multiple mobile users with stochastic task arrivals and wireless channels in this paper, the design of computation offloading policies is challenging to minimize the long-term average computation cost in terms of power consumption and buffering delay. A deep reinforcement learning (DRL) based decentralized dynamic computation offloading strategy is investigated to build a scalable MEC system with limited feedback. Specifically, a continuous action space based DRL approach named deep deterministic policy gradient (DDPG) is adopted to learn efficient computation offloading policies independently at each mobile user. Thus, powers of both local execution and task offloading can be adaptively allocated by the learned policies from each user's local observation of the MEC system. Numerical results are illustrated to demonstrate that efficient policies can be learned at each user, and performance of the proposed DDPG based decentralized strategy outperforms the conventional deep Q-network (DQN) based discrete power control strategy and some other greedy strategies with reduced computation cost. Besides, the power-delay tradeoff is also analyzed for both the DDPG based and DQN based strategies.