 Telecommunications


Robust and Scalable Routing with Multi-Agent Deep Reinforcement Learning for MANETs

arXiv.org Artificial Intelligence

We address the packet routing problem in highly dynamic mobile ad-hoc networks (MANETs). In the network routing problem, each router chooses the next-hop(s) of each packet to deliver the packet to a destination with lower delay, higher reliability, and less overhead in the network. In this paper, we present a novel framework and routing policies, DeepCQ+ routing, using multi-agent deep reinforcement learning (MADRL), which is designed to be robust and scalable for MANETs. Unlike other deep reinforcement learning (DRL)-based routing solutions in the literature, our approach has enabled us to train over a limited range of network parameters and conditions, but achieve realistic routing policies for a much wider range of conditions including a variable number of nodes, different data flows with varying data rates and source/destination pairs, diverse mobility levels, and other dynamic network topologies. We demonstrate the scalability, robustness, and performance enhancements obtained by DeepCQ+ routing over a recently proposed model-free and non-neural robust and reliable routing technique (i.e., CQ+ routing). DeepCQ+ routing outperforms non-DRL-based CQ+ routing in terms of overhead while maintaining the same goodput rate. Under a wide range of network sizes and mobility conditions, we observe a reduction in normalized overhead of 10-15%, indicating that the DeepCQ+ routing policy delivers more packets end-to-end with less overhead. To the best of our knowledge, this is the first successful application of MADRL to the MANET routing problem that simultaneously achieves scalability and robustness under dynamic conditions while outperforming its non-neural counterpart. More importantly, we provide a framework to design scalable and robust routing policies with any desired network performance metric of interest.


Reinforced Imitative Graph Representation Learning for Mobile User Profiling: An Adversarial Training Perspective

arXiv.org Artificial Intelligence

In this paper, we study the problem of mobile user profiling, which is a critical component for quantifying users' characteristics in the human mobility modeling pipeline. Human mobility is a sequential decision-making process dependent on the users' dynamic interests. With accurate user profiles, the predictive model can perfectly reproduce users' mobility trajectories. In the reverse direction, once the predictive model can imitate users' mobility patterns, the learned user profiles are also optimal. Such intuition motivates us to propose an imitation-based mobile user profiling framework by exploiting reinforcement learning, in which the agent is trained to precisely imitate users' mobility patterns for optimal user profiles. Specifically, the proposed framework includes two modules: (1) a representation module, which produces a state combining user profiles and spatio-temporal context in real time; (2) an imitation module, where a Deep Q-network (DQN) imitates the user behavior (action) based on the state produced by the representation module. However, there are two challenges in running the framework effectively. First, the epsilon-greedy strategy in DQN handles the exploration-exploitation trade-off by picking actions at random with probability epsilon. Such randomness feeds back to the representation module, making the learned user profiles unstable. To solve the problem, we propose an adversarial training strategy to guarantee the robustness of the representation module. Second, the representation module updates users' profiles in an incremental manner, which requires integrating the temporal effects of user profiles. Inspired by Long Short-Term Memory (LSTM), we introduce a gated mechanism to incorporate new and old user characteristics into the user profile.
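The epsilon-greedy rule mentioned in the abstract above can be illustrated with a minimal sketch. This is the generic textbook rule, not the paper's implementation; the function name `epsilon_greedy` and the plain list of Q-values are assumptions made for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index: random with probability epsilon, else greedy.

    q_values is a hypothetical list of estimated action values for one state.
    """
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With `epsilon = 0` the rule is purely greedy; larger values inject more random actions, which is the source of the instability in the learned profiles that the abstract describes.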


Learning-Based Distributed Random Access for Multi-Cell IoT Networks with NOMA

arXiv.org Artificial Intelligence

Non-orthogonal multiple access (NOMA) is a key technology to enable massive machine type communications (mMTC) in 5G networks and beyond. In this paper, NOMA is applied to improve the random access efficiency in high-density spatially-distributed multi-cell wireless IoT networks, where IoT devices contend for accessing the shared wireless channel using an adaptive p-persistent slotted Aloha protocol. To enable a capacity-optimal network, a novel formulation of random channel access with NOMA is proposed, in which the transmission probability of each IoT device is tuned to maximize the geometric mean of users' expected capacity. It is shown that the network optimization objective is high dimensional and mathematically intractable, yet it admits favourable mathematical properties that enable the design of efficient learning-based algorithmic solutions. To this end, two algorithms, i.e., a centralized model-based algorithm and a scalable distributed model-free algorithm, are proposed to optimally tune the transmission probabilities of IoT devices to attain the maximum capacity. The convergence of the proposed algorithms to the optimal solution is further established based on convex optimization and game-theoretic analysis. Extensive simulations demonstrate the merits of the novel formulation and the efficacy of the proposed algorithms.
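To make the geometric-mean objective in the abstract above concrete, the following sketch evaluates it for p-persistent slotted Aloha under a deliberately simplified assumption: a classical collision channel without NOMA, where a device succeeds only if it is the sole transmitter in a slot. The paper's capacity model with NOMA is different; the function names here are hypothetical.

```python
import math


def success_probs(p):
    """Per-device probability of being the only transmitter in a slot.

    p is a list of transmission probabilities, one per device
    (simplified collision model, assumed here for illustration only).
    """
    out = []
    for i, p_i in enumerate(p):
        others_idle = math.prod(1.0 - p_j for j, p_j in enumerate(p) if j != i)
        out.append(p_i * others_idle)
    return out


def log_geometric_mean(values):
    """Logarithm of the geometric mean, i.e. the mean of the logs."""
    return sum(math.log(v) for v in values) / len(values)
```

In this simplified symmetric setting with n devices, the objective is maximized when every device transmits with probability 1/n, so for four devices `[0.25] * 4` scores higher than `[0.1] * 4` or `[0.5] * 4`.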


Relational Deep Reinforcement Learning for Routing in Wireless Networks

arXiv.org Artificial Intelligence

While routing in wireless networks has been studied extensively, existing protocols are typically designed for a specific set of network conditions and so cannot accommodate any drastic changes in those conditions. For instance, protocols designed for connected networks cannot be easily applied to disconnected networks. In this paper, we develop a distributed routing strategy based on deep reinforcement learning that generalizes to diverse traffic patterns, congestion levels, network connectivity, and link dynamics. We make the following key innovations in our design: (i) the use of relational features as inputs to the deep neural network approximating the decision space, which enables our algorithm to generalize to diverse network conditions, (ii) the use of packet-centric decisions to transform the routing problem into an episodic task by viewing packets, rather than wireless devices, as reinforcement learning agents, which provides a natural way to propagate and model rewards accurately during learning, and (iii) the use of extended-time actions to model the time spent by a packet waiting in a queue, which reduces the amount of training data needed and allows the learning algorithm to converge more quickly. We evaluate our routing algorithm using a packet-level simulator and show that the policy our algorithm learns during training is able to generalize to larger and more congested networks, different topologies, and diverse link dynamics. Our algorithm outperforms shortest path and backpressure routing with respect to packets delivered and delay per packet.


Towards User Scheduling for 6G: A Fairness-Oriented Scheduler Using Multi-Agent Reinforcement Learning

arXiv.org Artificial Intelligence

User scheduling is a classical problem and key technology in wireless communication, and it will still play an important role in the prospective 6G. Many sophisticated schedulers are widely deployed in base stations, such as Proportional Fairness (PF) and Round-Robin Fashion (RRF). It is known that Opportunistic (OP) scheduling is the optimal scheduler for maximizing the average user data rate (AUDR) under full buffer traffic. However, the optimal strategy for achieving the highest fairness remains largely unknown for both full buffer traffic and bursty traffic. In this work, we investigate the problem of fairness-oriented user scheduling, especially for RBG allocation. We build a user scheduler using Multi-Agent Reinforcement Learning (MARL), which conducts distributional optimization to maximize the fairness of the communication system. The agents take cross-layer information (e.g., RSRP, buffer size) as the state and the RBG allocation result as the action, then explore the optimal solution following a well-defined reward function designed for maximizing fairness. Furthermore, we take the 5%-tile user data rate (5TUDR) as the key performance indicator (KPI) of fairness, and compare the performance of MARL scheduling with PF scheduling and RRF scheduling by conducting extensive simulations. The simulation results show that the proposed MARL scheduling outperforms the traditional schedulers.
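As an illustration of the 5%-tile user data rate used as the fairness KPI above, here is a minimal sketch that computes a 5th percentile with the nearest-rank method. The choice of percentile definition and the function name `fifth_percentile_rate` are assumptions; the abstract does not specify how the authors compute it.

```python
def fifth_percentile_rate(rates):
    """Nearest-rank 5th percentile of per-user data rates.

    rates is a hypothetical non-empty list of per-user rates in any
    consistent unit.
    """
    ordered = sorted(rates)
    n = len(ordered)
    # Nearest rank: ceil(5 * n / 100), computed in integer arithmetic.
    rank = max(1, -(-5 * n // 100))
    return ordered[rank - 1]
```

Because the metric looks only at the worst-served users, a scheduler that raises it improves the experience at the bottom of the rate distribution rather than the average.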


Millimeter Wave Sensing: A Review of Application Pipelines and Building Blocks

arXiv.org Artificial Intelligence

The increasing bandwidth requirement of new wireless applications has led to standardization of the millimeter wave spectrum for high-speed wireless communication. The millimeter wave spectrum is part of 5G and covers frequencies between 30 and 300 GHz, corresponding to wavelengths ranging from 10 to 1 mm. Although millimeter wave is often considered as a communication medium, it has also proved to be an excellent 'sensor', thanks to its narrow beams, operation across a wide bandwidth, and interaction with atmospheric constituents. In this paper, which is to the best of our knowledge the first review that completely covers millimeter wave sensing application pipelines, we provide a comprehensive overview and analysis of different basic application pipeline building blocks, including hardware, algorithms, analytical models, and model evaluation techniques. The review also provides a taxonomy that highlights different millimeter wave sensing application domains. By performing a thorough analysis, complying with the systematic literature review methodology and reviewing 165 papers, we not only extend previous investigations focused only on communication aspects of the millimeter wave technology and using millimeter wave technology for active imaging, but also highlight scientific and technological challenges and trends, and provide a future perspective for applications of millimeter wave as a sensing technology.
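The frequency-to-wavelength correspondence quoted in the abstract above follows from the free-space relation wavelength = c / f. A short check, with the helper name `wavelength_mm` chosen here for illustration:

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact SI value


def wavelength_mm(freq_ghz):
    """Free-space wavelength in millimeters for a frequency given in GHz."""
    freq_hz = freq_ghz * 1e9
    return SPEED_OF_LIGHT_M_PER_S / freq_hz * 1e3
```

Evaluating the helper at 30 GHz and 300 GHz gives values just under 10 mm and 1 mm respectively, matching the rounded range stated in the abstract.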


Reconfigurable Intelligent Surface Assisted Mobile Edge Computing with Heterogeneous Learning Tasks

arXiv.org Artificial Intelligence

The ever-growing popularity and rapid improvement of artificial intelligence (AI) have prompted a rethinking of the evolution of wireless networks. Mobile edge computing (MEC) provides a natural platform for AI applications since it offers rich computation resources to train machine learning (ML) models, as well as low-latency access to the data generated by mobile and internet of things (IoT) devices. In this paper, we present an infrastructure to perform ML tasks at an MEC server with the assistance of a reconfigurable intelligent surface (RIS). In contrast to conventional communication systems, where the principal criterion is to maximize the throughput, we aim at maximizing the learning performance. Specifically, we minimize the maximum learning error of all participating users by jointly optimizing the transmit power of mobile users, the beamforming vectors of the base station (BS), and the phase-shift matrix of the RIS. An alternating optimization (AO)-based framework is proposed to optimize the three terms iteratively, where a successive convex approximation (SCA)-based algorithm is developed to solve the power allocation problem, closed-form expressions of the beamforming vectors are derived, and an alternating direction method of multipliers (ADMM)-based algorithm is designed together with an error level searching (ELS) framework to effectively solve the challenging nonconvex optimization problem of the phase-shift matrix. Simulation results demonstrate significant gains of deploying an RIS and validate the advantages of our proposed algorithms over various benchmarks. Lastly, a unified communication-training-inference platform is developed based on the CARLA platform and the SECOND network, and a use case (3D object detection in autonomous driving) for the proposed scheme is demonstrated on the developed platform.


Vector-output ReLU Neural Network Problems are Copositive Programs: Convex Analysis of Two Layer Networks and Polynomial-time Algorithms

arXiv.org Machine Learning

We describe the convex semi-infinite dual of the two-layer vector-output ReLU neural network training problem. This semi-infinite dual admits a finite dimensional representation, but its support is over a convex set which is difficult to characterize. In particular, we demonstrate that the non-convex neural network training problem is equivalent to a finite-dimensional convex copositive program. Our work is the first to identify this strong connection between the global optima of neural networks and those of copositive programs. We thus demonstrate how neural networks implicitly attempt to solve copositive programs via semi-nonnegative matrix factorization, and draw key insights from this formulation. We describe the first algorithms for provably finding the global minimum of the vector output neural network training problem, which are polynomial in the number of samples for a fixed data rank, yet exponential in the dimension. However, in the case of convolutional architectures, the computational complexity is exponential in only the filter size and polynomial in all other parameters. We describe the circumstances in which we can find the global optimum of this neural network training problem exactly with soft-thresholded SVD, and provide a copositive relaxation which is guaranteed to be exact for certain classes of problems, and which corresponds with the solution of Stochastic Gradient Descent in practice.


SkyCore

Communications of the ACM

Evolved packet core (EPC, Figure 5) is a distributed system of different nodes, each consisting of diverse network functions (NFs) that are required to manage the LTE network. The EPC consists of data and control planes: the data plane enforces operator policies (e.g., DPI, QoS classes, and accounting) on data traffic to/from the user equipment (UE), whereas the control plane provides key control and management functions such as access control, mobility, and security management.


EQ-Net: A Unified Deep Learning Framework for Log-Likelihood Ratio Estimation and Quantization

arXiv.org Machine Learning

In this work, we introduce EQ-Net: the first holistic framework that solves both the tasks of log-likelihood ratio (LLR) estimation and quantization using a data-driven method. We motivate our approach with theoretical insights on two practical estimation algorithms at the ends of the complexity spectrum and reveal a connection between the complexity of an algorithm and the information bottleneck method: simpler algorithms admit smaller bottlenecks when representing their solution. This motivates us to propose a two-stage algorithm that uses LLR compression as a pretext task for estimation and is focused on low-latency, high-performance implementations via deep neural networks. We carry out extensive experimental evaluation and demonstrate that our single architecture achieves state-of-the-art results on both tasks when compared to previous methods, with gains in quantization efficiency as high as 20% and estimation latency reduced by up to 60% when measured on general-purpose and graphics processing units (GPUs). In particular, our approach reduces the GPU inference latency by more than two times in several multiple-input multiple-output (MIMO) configurations. Finally, we demonstrate that our scheme is robust to distributional shifts and retains a significant part of its performance when evaluated on 5G channel models, as well as under channel estimation errors. Recent years have seen deep learning methods mature and become more pervasive in communication networks research [1]. At the center of our goal and methods are two of the core tasks of any digital system: estimation and quantization.