Goto

Collaborating Authors

 Telecommunications


Towards Auto-Modeling of Formal Verification for NextG Protocols: A Multimodal cross- and self-attention Large Language Model Approach

arXiv.org Artificial Intelligence

This paper introduces Auto-modeling of Formal Verification with Real-world Prompting for 5G and NextG protocols (AVRE), a novel system designed for the formal verification of Next Generation (NextG) communication protocols, addressing the increasing complexity and scalability challenges in network protocol design and verification. Utilizing Large Language Models (LLMs), AVRE transforms protocol descriptions into dependency graphs and formal models, efficiently resolving ambiguities and capturing design intent. The system integrates a transformer model with LLMs to autonomously establish quantifiable dependency relationships through cross- and self-attention mechanisms. Enhanced by iterative feedback from the HyFuzz experimental platform, AVRE significantly advances the accuracy and relevance of formal verification in complex communication protocols, offering a groundbreaking approach to validating sophisticated communication systems. We compare CAL's performance with state-of-the-art LLM-based models and traditional time sequence models, demonstrating its superiority in accuracy and robustness, achieving an accuracy of 95.94\% and an AUC of 0.98. This NLP-based approach enables, for the first time, the creation of exploits directly from design documents, making remarkable progress in scalable system verification and validation.


On the Burstiness of Distributed Machine Learning Traffic

arXiv.org Artificial Intelligence

Traffic from distributed training of machine learning (ML) models makes up a large and growing fraction of the traffic mix in enterprise data centers. While work on distributed ML abounds, the network traffic generated by distributed ML has received little attention. Using measurements on a testbed network, we investigate the traffic characteristics generated by the training of the ResNet-50 neural network with an emphasis on studying its shortterm burstiness. For the latter we propose metrics that quantify traffic burstiness at different time scales. Our analysis reveals that distributed ML traffic exhibits a very high degree of burstiness on short time scales, exceeding a 60:1 peak-to-mean ratio on time intervals as long as 5 ms. We observe that training software orchestrates transmissions in such a way that burst transmissions from different sources within the same application do not result in congestion and packet losses. An extrapolation of the measurement data to multiple applications underscores the challenges of distributed ML traffic for congestion and flow control algorithms. This paper studies and analyzes the burstiness of traffic from training deep neural network (DNN) models as a root cause for short-lived surges of traffic, known as microbursts, that cause periods of high packet delay and loss in a data center network (DCN) even at a low utilization. Since microbursts occur at a time scale of less than a millisecond [1], traditional traffic control methods are not effective with avoiding packet losses in such scenarios. Research on microbursts in DCNs has suggested a range of potential root causes, including the inherent burstiness of application traffic, confluence of traffic flows to a common destination (fan-in, incast), offloading of protocol processing at hosts, and traffic control algorithms, such as packet scheduling and flow control [1]-[10]. While training of neural networks makes up a large fraction of the workload in data centers [11], to the best of our knowledge, there does not exist a detailed analysis of distributed ML traffic and its potential impact on the creation of microbursts. The vast majority of network traffic from training DNN models is due to the exchange of gradients of model parameters. As modern DNN models involve millions, and, in the case of large language models such as GPT, billions of parameters [12], the transmission of gradients creates huge data bursts. The measurement experiments are performed in a testbed network with a single switch with 100 Gbps line rates. We evaluate a server-based and a serverless mode of training. In server-based training, the nodes involved in the training, referred to as workers, exchange gradients with a dedicated server. Here, the transmissions to the server create a bottleneck.


A Novel Reinforcement Learning Routing Algorithm for Congestion Control in Complex Networks

arXiv.org Artificial Intelligence

Despite technological advancements, the significance of interdisciplinary subjects like complex networks has grown. Exploring communication within these networks is crucial, with traffic becoming a key concern due to the expanding population and increased need for connections. Congestion tends to originate in specific network areas but quickly proliferates throughout. Consequently, understanding the transition from a flow-free state to a congested state is vital. Numerous studies have delved into comprehending the emergence and control of congestion in complex networks, falling into three general categories: soft strategies, hard strategies, and resource allocation strategies. This article introduces a routing algorithm leveraging reinforcement learning to address two primary objectives: congestion control and optimizing path length based on the shortest path algorithm, ultimately enhancing network throughput compared to previous methods. Notably, the proposed method proves effective not only in Barab\'asi-Albert scale-free networks but also in other network models such as Watts-Strogatz (small-world) and Erd\"os-R\'enyi (random network). Simulation experiment results demonstrate that, across various traffic scenarios and network topologies, the proposed method can enhance efficiency criteria by up to 30% while reducing maximum node congestion by five times.


Mobility and Cost Aware Inference Accelerating Algorithm for Edge Intelligence

arXiv.org Artificial Intelligence

The edge intelligence (EI) has been widely applied recently. Spliting the model between device, edge server, and cloud can improve the performance of EI greatly. The model segmentation without user mobility has been investigated deeply by previous works. However, in most use cases of EI, the end devices are mobile. Only a few works have been carried out on this aspect. These works still have many issues, such as ignoring the energy consumption of mobile device, inappropriate network assumption, and low effectiveness on adaptiving user mobility, etc. Therefore, for addressing the disadvantages of model segmentation and resource allocation in previous works, we propose mobility and cost aware model segmentation and resource allocation algorithm for accelerating the inference at edge (MCSA). Specfically, in the scenario without user mobility, the loop interation gradient descent (Li-GD) algorithm is provided. When the mobile user has a large model inference task needs to be calculated, it will take the energy consumption of mobile user, the communication and computing resource renting cost, and the inference delay into account to find the optimal model segmentation and resource allocation strategy. In the scenario with user mobility, the mobiity aware Li-GD (MLi-GD) algorithm is proposed to calculate the optimal strategy. Then, the properties of the proposed algorithms are investigated, including convergence, complexity, and approximation ratio. The experimental results demonstrate the effectiveness of the proposed algorithms.


A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC Orchestration

arXiv.org Artificial Intelligence

Multi-access Edge Computing (MEC) can be implemented together with Open Radio Access Network (O-RAN) over commodity platforms to offer low-cost deployment and bring the services closer to end-users. In this paper, a joint O-RAN/MEC orchestration using a Bayesian deep reinforcement learning (RL)-based framework is proposed that jointly controls the O-RAN functional splits, the allocated resources and hosting locations of the O-RAN/MEC services across geo-distributed platforms, and the routing for each O-RAN/MEC data flow. The goal is to minimize the long-term overall network operation cost and maximize the MEC performance criterion while adapting possibly time-varying O-RAN/MEC demands and resource availability. This orchestration problem is formulated as Markov decision process (MDP). However, the system consists of multiple BSs that share the same resources and serve heterogeneous demands, where their parameters have non-trivial relations. Consequently, finding the exact model of the underlying system is impractical, and the formulated MDP renders in a large state space with multi-dimensional discrete action. To address such modeling and dimensionality issues, a novel model-free RL agent is proposed for our solution framework. The agent is built from Double Deep Q-network (DDQN) that tackles the large state space and is then incorporated with action branching, an action decomposition method that effectively addresses the multi-dimensional discrete action with linear increase complexity. Further, an efficient exploration-exploitation strategy under a Bayesian framework using Thomson sampling is proposed to improve the learning performance and expedite its convergence. Trace-driven simulations are performed using an O-RAN-compliant model. The results show that our approach is data-efficient (i.e., converges faster) and increases the returned reward by 32\% than its non-Bayesian version.


High Efficiency Inference Accelerating Algorithm for NOMA-based Mobile Edge Computing

arXiv.org Artificial Intelligence

-- Splitting the inference model between device, edge server, and cloud can improve the performance of EI greatly. Additionally, the non - orthogonal multiple access (NOMA), which is the key supporting technologies of B5G/6G, ca n achieve massive connections and high spectrum efficiency. Motivated by the benefits of NOMA, integrating NOMA with model split in MEC to reduce the inference latency further becomes attractive. However, the NOMA based communication during split inference has not been properly considered in previous works. Therefore, in this paper, we integrate the NOMA into split inference in MEC, and p ropose the effective communication and computing resource allocation algorithm to accelerat e the model inference at edge . Specifically, when the mobile user has a large model inference task needed to be calculated in the NOMA - based MEC, it will take the energy consumption of both device and edge server and the inference latency into account to find the optimal model split s trategy, subchannel allocation strategy (uplink and downlink), and transmission power allocation strategy (uplink and downlink). Since the minimum inference delay and energy consumption cannot be satisfied simultaneously, and the variables of subchannel al location and model split are discrete, the gradient descent (GD) algorithm is adopted to find the optimal tradeoff between them. Moreover, the loop iteration GD approach (Li - GD) is proposed to reduce the complexity of GD algorithm that caused by the parame ter discrete. Additionally, the properties of the proposed algorithm are also investigated, which demonstrate the effectiveness of the proposed algorithms. The artificial intelligence has been widely used and changed our life greatly, such as metaverse [1 - 2], auto matic driving [2 - 4], image generation [5], etc. However, since the AI model is always large for achieving high accuracy, the computing resource that needed for these models are huge. Therefore, it is inappropriate to deploy these AI models on the mobile de vices, such as mobile phones and vehicles, in which the computing resource is quite limited.


Human-Centric Resource Allocation for the Metaverse With Multiaccess Edge Computing

arXiv.org Artificial Intelligence

Multi-access edge computing (MEC) is a promising solution to the computation-intensive, low-latency rendering tasks of the metaverse. However, how to optimally allocate limited communication and computation resources at the edge to a large number of users in the metaverse is quite challenging. In this paper, we propose an adaptive edge resource allocation method based on multi-agent soft actor-critic with graph convolutional networks (SAC-GCN). Specifically, SAC-GCN models the multi-user metaverse environment as a graph where each agent is denoted by a node. Each agent learns the interplay between agents by graph convolutional networks with self-attention mechanism to further determine the resource usage for one user in the metaverse. The effectiveness of SAC-GCN is demonstrated through the analysis of user experience, balance of resource allocation, and resource utilization rate by taking a virtual city park metaverse as an example. Experimental results indicate that SAC-GCN outperforms other resource allocation methods in improving overall user experience, balancing resource allocation, and increasing resource utilization rate by at least 27%, 11%, and 8%, respectively.


Humane AI Pin orders will start shipping in March

Engadget

The Humane AI Pin is expected to start shipping in March. On Friday, the company posted on X (Twitter) that "those who placed priority orders will receive their Ai Pins first when we begin shipping in March." The company had previously given an "early 2024" estimate for the screen-less wearable device designed to replace a smartphone. Humane, founded by former Apple employees Bethany Bongiorno and Imran Chaudhri, views the smartphone (still their ex-employer's bread and butter) as on its last legs. "The last era has plateaued," TechCrunch reported Chaudhri as saying in a November press briefing.


Hybrid-Task Meta-Learning: A Graph Neural Network Approach for Scalable and Transferable Bandwidth Allocation

arXiv.org Artificial Intelligence

In this paper, we develop a deep learning-based bandwidth allocation policy that is: 1) scalable with the number of users and 2) transferable to different communication scenarios, such as non-stationary wireless channels, different quality-of-service (QoS) requirements, and dynamically available resources. To support scalability, the bandwidth allocation policy is represented by a graph neural network (GNN), with which the number of training parameters does not change with the number of users. To enable the generalization of the GNN, we develop a hybrid-task meta-learning (HML) algorithm that trains the initial parameters of the GNN with different communication scenarios during meta-training. Next, during meta-testing, a few samples are used to fine-tune the GNN with unseen communication scenarios. Simulation results demonstrate that our HML approach can improve the initial performance by $8.79\%$, and sampling efficiency by $73\%$, compared with existing benchmarks. After fine-tuning, our near-optimal GNN-based policy can achieve close to the same reward with much lower inference complexity compared to the optimal policy obtained using iterative optimization.


Dynamic Routing for Integrated Satellite-Terrestrial Networks: A Constrained Multi-Agent Reinforcement Learning Approach

arXiv.org Artificial Intelligence

The integrated satellite-terrestrial network (ISTN) system has experienced significant growth, offering seamless communication services in remote areas with limited terrestrial infrastructure. However, designing a routing scheme for ISTN is exceedingly difficult, primarily due to the heightened complexity resulting from the inclusion of additional ground stations, along with the requirement to satisfy various constraints related to satellite service quality. To address these challenges, we study packet routing with ground stations and satellites working jointly to transmit packets, while prioritizing fast communication and meeting energy efficiency and packet loss requirements. Specifically, we formulate the problem of packet routing with constraints as a max-min problem using the Lagrange method. Then we propose a novel constrained Multi-Agent reinforcement learning (MARL) dynamic routing algorithm named CMADR, which efficiently balances objective improvement and constraint satisfaction during the updating of policy and Lagrange multipliers. Finally, we conduct extensive experiments and an ablation study using the OneWeb and Telesat mega-constellations. Results demonstrate that CMADR reduces the packet delay by a minimum of 21% and 15%, while meeting stringent energy consumption and packet loss rate constraints, outperforming several baseline algorithms.