M3Net: A Multi-Metric Mixture of Experts Network Digital Twin with Graph Neural Networks

Guda, Blessed, Joe-Wong, Carlee

arXiv.org Artificial Intelligence

The rise of 5G/6G network technologies promises to enable applications like autonomous vehicles and virtual reality, resulting in a significant increase in connected devices and necessarily complicating network management. Even worse, these applications often have strict, yet heterogeneous, performance requirements across metrics like latency and reliability. Much recent work has thus focused on developing the ability to predict network performance. However, traditional methods for network modeling, like discrete event simulators and emulation, often fail to balance accuracy and scalability. Network Digital Twins (NDTs), augmented by machine learning, present a viable solution by creating virtual replicas of physical networks for real-time simulation and analysis. State-of-the-art models, however, fall short of full-fledged NDTs, as they often focus only on a single performance metric or simulated network data. We introduce M3Net, a Multi-Metric Mixture-of-experts (MoE) NDT that uses a graph neural network architecture to estimate multiple performance metrics from an expanded set of network state data in a range of scenarios. We show that M3Net significantly enhances the accuracy of flow delay predictions by reducing the MAPE (Mean Absolute Percentage Error) from 20.06% to 17.39%, while also achieving 66.47% and 78.7% accuracy on jitter and packets dropped for each flow.

Emerging 5G and 6G mobile network architectures aim to support new applications like autonomous vehicles and mixed reality [1], [2], both of which require significantly expanded network capabilities. These and other new applications envisioned as part of the 5G and 6G network ecosystem will lead to massive numbers of connected devices with heterogeneous performance expectations, which increases the complexity and cost of managing communication networks [2]. For example, interactive applications like augmented reality generally require response latencies under 200ms [3], while safety-critical applications like autonomous vehicles might require highly reliable delivery of high-priority packets [4].
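As a concrete illustration of the multi-metric mixture-of-experts idea, the sketch below combines a single hand-rolled message-passing layer with a softmax-gated set of expert heads, each emitting one value per metric (delay, jitter, drops). This is a minimal sketch under assumed shapes and layer sizes, not the M3Net architecture; all names here are illustrative.

```python
# Minimal sketch (assumed sizes/names): a mean-aggregation GNN layer feeding
# a gated mixture of experts that predicts several metrics per flow at once.
import torch
import torch.nn as nn

class SimpleGNNLayer(nn.Module):
    """One round of message passing over a dense, row-normalized adjacency."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(2 * dim, dim)

    def forward(self, x, adj):
        # x: (N, dim) node/flow features; adj: (N, N) normalized adjacency
        neigh = adj @ x  # aggregate neighbor features
        return torch.relu(self.lin(torch.cat([x, neigh], dim=-1)))

class MultiMetricMoE(nn.Module):
    """Softmax gate weights a few expert heads; each head outputs all metrics."""
    def __init__(self, dim, n_experts=4, n_metrics=3):
        super().__init__()
        self.gnn = SimpleGNNLayer(dim)
        self.gate = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            [nn.Linear(dim, n_metrics) for _ in range(n_experts)]
        )

    def forward(self, x, adj):
        h = self.gnn(x, adj)                                       # (N, dim)
        w = torch.softmax(self.gate(h), dim=-1)                    # (N, E)
        preds = torch.stack([e(h) for e in self.experts], dim=1)   # (N, E, M)
        return (w.unsqueeze(-1) * preds).sum(dim=1)                # (N, M)

# Toy usage: 5 flows with 8-dim features; outputs delay/jitter/drops per flow.
x = torch.randn(5, 8)
adj = torch.softmax(torch.randn(5, 5), dim=-1)  # stand-in normalized adjacency
print(MultiMetricMoE(8)(x, adj).shape)          # torch.Size([5, 3])
```

In a full NDT the node features would encode flow, queue, and link state, and the gate lets different experts specialize in different traffic regimes.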



Teal: Learning-Accelerated Optimization of WAN Traffic Engineering

Xu, Zhiying, Yan, Francis Y., Singh, Rachee, Chiu, Justin T., Rush, Alexander M., Yu, Minlan

arXiv.org Artificial Intelligence

The rapid expansion of global cloud wide-area networks (WANs) has posed a challenge for commercial optimization engines to efficiently solve network traffic engineering (TE) problems at scale. Existing acceleration strategies decompose TE optimization into concurrent subproblems but realize limited parallelism due to an inherent tradeoff between run time and allocation performance. We present Teal, a learning-based TE algorithm that leverages the parallel processing power of GPUs to accelerate TE control. First, Teal designs a flow-centric graph neural network (GNN) to capture WAN connectivity and network flows, learning flow features as inputs to downstream allocation. Second, to reduce the problem scale and make learning tractable, Teal employs a multi-agent reinforcement learning (RL) algorithm to independently allocate each traffic demand while optimizing a central TE objective. Finally, Teal fine-tunes allocations with ADMM (Alternating Direction Method of Multipliers), a highly parallelizable optimization algorithm for reducing constraint violations such as overutilized links. We evaluate Teal using traffic matrices from Microsoft's WAN. On a large WAN topology with >1,700 nodes, Teal generates near-optimal flow allocations while running several orders of magnitude faster than the production optimization engine. Compared with other TE acceleration schemes, Teal satisfies 6--32% more traffic demand and yields 197--625x speedups.
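The final ADMM stage is the most self-contained piece of Teal's pipeline. The snippet below is a simplified stand-in for it: instead of full ADMM, it runs projected gradient descent on a quadratic penalty for link overutilization, which captures the same goal of nudging a learned allocation back toward feasibility. Array shapes and names are assumptions for illustration.

```python
# Simplified stand-in for the fine-tuning stage (not ADMM itself): penalized
# projected gradient descent that reduces capacity violations in an allocation.
import numpy as np

def reduce_violations(alloc, path_link, capacity, steps=200, lr=0.05):
    """alloc: (P,) traffic on each path; path_link: (L, P) 0/1 incidence;
    capacity: (L,) link capacities. Returns a less-violating allocation."""
    x = alloc.copy()
    for _ in range(steps):
        load = path_link @ x                     # per-link load
        over = np.maximum(load - capacity, 0.0)  # overutilization per link
        grad = path_link.T @ over                # penalty gradient wrt paths
        x = np.maximum(x - lr * grad, 0.0)       # project onto x >= 0
    # The gradient vanishes once feasible, so x stays close to the input.
    return x

# Toy example: 2 links (capacity 1.0 each), 3 paths, both links overloaded.
path_link = np.array([[1, 1, 0], [0, 1, 1]], dtype=float)
capacity = np.array([1.0, 1.0])
alloc = np.array([0.9, 0.8, 0.9])
print(reduce_violations(alloc, path_link, capacity))
```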


Neural Quantile Optimization for Edge-Cloud Computing

Du, Bin, Zhang, He, Cheng, Xiangle, Zhang, Lei

arXiv.org Artificial Intelligence

We seek the best traffic allocation scheme for an edge-cloud computing network that satisfies constraints and minimizes the cost based on burstable billing. First, for a fixed network topology, we formulate a family of integer programming problems with random parameters describing the various traffic demands. Then, to overcome the difficulty posed by the discrete nature of the problem, we generalize the Gumbel-softmax reparameterization method to induce an unconstrained continuous optimization problem as a regularized continuation of the discrete problem. Finally, we introduce the Gumbel-softmax sampling network to solve the optimization problems via unsupervised learning. The network structure reflects the edge-cloud computing topology and is trained to minimize the expectation of the cost function for the unconstrained continuous optimization problems. The trained network works as an efficient traffic allocation scheme sampler, remarkably outperforming the random strategy in feasibility and cost function value. Besides testing the quality of the output allocation scheme, we examine the generalization properties of the network by increasing the number of time steps and the number of users. We also feed the solutions to existing integer optimization solvers as initial conditions and verify that the warm starts accelerate the short-time iteration process. The framework is general with solid performance, and the decoupled structure of the random neural networks makes it well suited to practical implementations.
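The core trick is the Gumbel-softmax reparameterization, which relaxes a hard discrete assignment into a differentiable sample so the expected cost can be minimized by gradient descent. The toy below shows only that core, assigning each demand to one of several edge/cloud targets with an assumed linear cost; the paper additionally folds the constraints into the objective as a regularized continuation, which is omitted here.

```python
# Toy Gumbel-softmax relaxation (assumed cost and sizes): learn logits so that
# sampled soft one-hot assignments minimize the expected allocation cost.
import torch

def gumbel_softmax_sample(logits, tau=0.5):
    # Relax the argmax of (logits + Gumbel noise) into a softmax.
    u = torch.rand_like(logits).clamp(1e-6, 1 - 1e-6)
    g = -torch.log(-torch.log(u))  # Gumbel(0, 1) noise
    return torch.softmax((logits + g) / tau, dim=-1)

logits = torch.zeros(4, 3, requires_grad=True)  # 4 demands, 3 targets
cost = torch.tensor([1.0, 2.0, 5.0])            # per-target unit cost (assumed)
opt = torch.optim.Adam([logits], lr=0.1)
for _ in range(200):
    y = gumbel_softmax_sample(logits)           # (4, 3) soft one-hot rows
    loss = (y * cost).sum()                     # expected cost of allocation
    opt.zero_grad(); loss.backward(); opt.step()

# At a low temperature the samples are nearly discrete: argmax recovers a
# hard allocation scheme, here concentrating on the cheapest target 0.
print(gumbel_softmax_sample(logits, tau=0.1).argmax(dim=-1))
```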


Proactive Resilient Transmission and Scheduling Mechanisms for mmWave Networks

Dogan, Mine Gokce, Cardone, Martina, Fragouli, Christina

arXiv.org Artificial Intelligence

This paper aims to develop resilient transmission mechanisms that suitably distribute traffic across multiple paths in an arbitrary millimeter-wave (mmWave) network. The main contributions include: (a) the development of proactive transmission mechanisms that build resilience against network disruptions in advance, while achieving a high end-to-end packet rate; (b) the design of a heuristic path selection algorithm that efficiently selects (in polynomial time in the network size) multiple proactively resilient paths with high packet rates; and (c) the development of a hybrid scheduling algorithm that combines the proposed path selection algorithm with a deep reinforcement learning (DRL) based online approach for decentralized adaptation to blocked links and failed paths. To achieve resilience to link failures, a state-of-the-art Soft Actor-Critic DRL algorithm, which adapts the information flow through the network, is investigated. The proposed scheduling algorithm robustly adapts to link failures over different topologies, channel and blockage realizations, while offering superior performance to alternative algorithms.

Millimeter Wave (mmWave) (and beyond) is an enabling technology that is playing an increasingly important role in our wireless infrastructure by expanding the available spectrum and enabling multi-gigabit services [3]-[5]. A number of use cases are currently built around multihop mmWave networks, such as Facebook's Terragraph network [6], which uses flexible mmWave backbones to connect clusters of base stations. Other example scenarios include private networks, such as in shopping centers, airports and enterprises; mmWave mesh networks that use mmWave links as backhaul in dense urban scenarios; military applications employing mobile hot spots; and mmWave-based vehicle-to-everything (V2X) services, such as cooperative perception [7]-[9].
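As a rough illustration of polynomial-time multipath selection, the sketch below greedily extracts edge-disjoint widest (maximum-bottleneck-rate) paths: find the widest path, remove its links, repeat. This is an illustrative stand-in, not the authors' proactively resilient algorithm, and the toy graph and names are assumptions.

```python
# Greedy edge-disjoint widest-path selection (illustrative stand-in).
import heapq

def widest_path(graph, src, dst):
    """graph: {u: {v: rate}}. Returns (bottleneck_rate, path) or (0, None)."""
    best = {src: float("inf")}
    prev, heap = {}, [(-float("inf"), src)]
    while heap:
        width, u = heapq.heappop(heap)
        width = -width
        if u == dst:  # reconstruct the path back to the source
            path = [u]
            while u != src:
                u = prev[u]; path.append(u)
            return width, path[::-1]
        for v, rate in graph.get(u, {}).items():
            w = min(width, rate)  # bottleneck along the extended path
            if w > best.get(v, 0):
                best[v], prev[v] = w, u
                heapq.heappush(heap, (-w, v))
    return 0, None

def select_paths(graph, src, dst, k=2):
    """Select up to k edge-disjoint paths, widest first."""
    g = {u: dict(nbrs) for u, nbrs in graph.items()}  # mutable copy
    paths = []
    for _ in range(k):
        width, path = widest_path(g, src, dst)
        if not path:
            break
        paths.append((width, path))
        for u, v in zip(path, path[1:]):  # remove used links for disjointness
            del g[u][v]
    return paths

g = {"s": {"a": 3, "b": 2}, "a": {"d": 2}, "b": {"d": 4}, "d": {}}
print(select_paths(g, "s", "d"))  # [(2, ['s','a','d']), (2, ['s','b','d'])]
```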


Experiments with Neural Networks for Real Time Implementation of Control

Campbell, Peter K., Dale, Michael, Ferrá, Herman L., Kowalczyk, Adam

Neural Information Processing Systems

This paper describes a neural network based controller for allocating capacity in a telecommunications network. This system was proposed in order to overcome a "real time" response constraint. Two basic architectures are evaluated: 1) a feedforward network combined with a heuristic; and 2) a feedforward network combined with a recurrent network. These architectures are compared against a linear programming (LP) optimiser as a benchmark. The LP optimiser was also used as a teacher to label the data samples for the feedforward neural network training algorithm. The two systems are found to provide a traffic throughput of 99% and 95%, respectively, of the throughput obtained by the linear programming solution. Once trained, the neural network based solutions are found in a fraction of the time required by the LP optimiser.
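The teacher-student setup the paper describes is easy to sketch: an LP optimiser labels random demand samples with optimal allocations, and a small feedforward network is trained to imitate them, after which inference is a single forward pass. The toy capacity-allocation LP below is an assumption for illustration, not the paper's formulation.

```python
# Sketch (assumed toy LP): label demand samples with an LP optimiser, then
# train a feedforward network to imitate the optimal allocations.
import numpy as np
import torch
import torch.nn as nn
from scipy.optimize import linprog

def lp_teacher(demand, budget=1.0):
    # Maximize carried traffic sum(x) with 0 <= x_i <= demand_i and
    # sum(x) <= budget (linprog minimizes, hence the negated objective).
    res = linprog(c=-np.ones(3), A_ub=np.ones((1, 3)), b_ub=[budget],
                  bounds=[(0, d) for d in demand])
    return res.x

rng = np.random.default_rng(0)
demands = rng.uniform(0, 1, size=(512, 3))
labels = np.stack([lp_teacher(d) for d in demands])  # teacher labels

net = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 3))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
X = torch.tensor(demands, dtype=torch.float32)
Y = torch.tensor(labels, dtype=torch.float32)
for _ in range(500):
    loss = nn.functional.mse_loss(net(X), Y)
    opt.zero_grad(); loss.backward(); opt.step()

# The trained net approximates the LP allocation in one cheap forward pass.
print(float(loss))
```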

