AITopics

2505.01186

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Automobiles & Trucks (0.68)
Telecommunications (0.67)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

arXiv.org Artificial Intelligence, May-2-2025

Scaling On-Device GPU Inference for Large Generative Models

Tang, Jiuqiang, Sarokin, Raman, Ignasheva, Ekaterina, Jensen, Grant, Chen, Lin, Lee, Juhyun, Kulik, Andrei, Grundmann, Matthias

Driven by the advancements in generative AI, large machine learning models have revolutionized domains such as image processing, audio synthesis, and speech recognition. While server-based deployments remain the locus of peak performance, the imperative for on-device inference, necessitated by privacy and efficiency considerations, persists. Recognizing GPUs as the on-device ML accelerator with the widest reach, we present ML Drift, an optimized framework that extends the capabilities of state-of-the-art GPU-accelerated inference engines. ML Drift enables on-device execution of generative AI workloads that contain 10 to 100x more parameters than existing on-device generative AI models. ML Drift addresses intricate engineering challenges associated with cross-GPU API development, and ensures broad compatibility across mobile and desktop/laptop platforms, thereby facilitating the deployment of significantly more complex models on resource-constrained devices. Our GPU-accelerated ML/AI inference engine achieves an order-of-magnitude performance improvement relative to existing open-source GPU inference engines.

artificial intelligence, machine learning, natural language, (18 more...)

2505.00232

Genre: Research Report (0.82)

Industry:

Information Technology (1.00)
Telecommunications (0.71)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.64)

Nayar, Vinti, Sachdev, Kanica, Lall, Brejesh

Generative QoE Modeling: A Lightweight Approach for Telecom Networks

arXiv.org Artificial Intelligence, May-1-2025

Quality of Experience (QoE) prediction plays a crucial role in optimizing resource management and enhancing user satisfaction across both telecommunication and OTT services. While recent advances predominantly rely on deep learning models, this study introduces a lightweight generative modeling framework that balances computational efficiency, interpretability, and predictive accuracy. The study validates Vector Quantization (VQ) as a preprocessing technique that transforms continuous network features into discrete categorical symbols, enabling integration with a Hidden Markov Model (HMM) for temporal sequence modeling. This VQ-HMM pipeline enhances the model's capacity to capture dynamic QoE patterns while supporting probabilistic inference on new and unseen data. Experimental results on publicly available time-series datasets incorporating both objective indicators and subjective QoE scores demonstrate the viability of this approach in real-time and resource-constrained environments, where inference latency is also critical. The framework offers a scalable alternative to complex deep learning methods, particularly where computational resources are limited or latency constraints are strict.
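The VQ-then-HMM idea in this abstract can be illustrated with a minimal sketch. It assumes a fixed, hand-picked codebook and toy HMM parameters; none of the numbers, feature choices, or function names (`quantize`, `forward_log_likelihood`) come from the paper, and a real pipeline would learn both the codebook and the HMM from data.

```python
import math

def quantize(features, codebook):
    """Map each continuous feature vector to the index of its nearest codeword."""
    symbols = []
    for x in features:
        dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook]
        symbols.append(dists.index(min(dists)))
    return symbols

def forward_log_likelihood(symbols, start, trans, emit):
    """Scaled forward algorithm: returns log P(symbols | HMM)."""
    n_states = len(start)
    alpha = [start[s] * emit[s][symbols[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(symbols)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbols[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)          # rescale to avoid underflow on long sequences
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

# Hypothetical toy data: two features per time step, two codewords.
codebook = [(0.0, 0.0), (1.0, 1.0)]
features = [(0.1, 0.0), (0.9, 1.1), (0.8, 0.9)]
symbols = quantize(features, codebook)

# Hypothetical two-state HMM (e.g. "good" and "degraded" QoE).
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.8, 0.2], [0.2, 0.8]]
log_lik = forward_log_likelihood(symbols, start, trans, emit)
```

Both steps are cheap: quantization is a nearest-neighbour lookup, and the forward pass costs time proportional to the sequence length times the square of the number of states. This is consistent with the lightweight framing of the abstract.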

artificial intelligence, deep learning, machine learning, (17 more...)

2504.21353

Genre: Research Report (0.65)

Industry:

Telecommunications > Networks (0.48)
Information Technology > Networks (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)

arXiv.org Artificial Intelligence, May-1-2025

Multi-Agent Reinforcement Learning for Resources Allocation Optimization: A Survey

Hady, Mohamad A., Hu, Siyi, Pratama, Mahardhika, Cao, Jimmy, Kowalczyk, Ryszard

Multi-Agent Reinforcement Learning (MARL) has become a powerful framework for numerous real-world applications, modeling distributed decision-making and learning from interactions with complex environments. Resource Allocation Optimization (RAO) benefits significantly from MARL's ability to tackle dynamic and decentralized contexts. MARL-based approaches are increasingly applied to RAO challenges across sectors playing pivotal roles in Industry 4.0 developments. This survey provides a comprehensive review of recent MARL algorithms for RAO, encompassing core concepts, classifications, and a structured taxonomy. By outlining the current research landscape and identifying primary challenges and future directions, this survey aims to support researchers and practitioners in leveraging MARL's potential to advance resource allocation solutions.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2504.21048

Country:

Europe (0.45)
North America > United States (0.28)
Oceania > Australia (0.28)

Genre: Overview (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Telecommunications (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial Intelligence, Apr-30-2025

Towards Easy and Realistic Network Infrastructure Testing for Large-scale Machine Learning

Yoo, Jinsun, Lao, ChonLam, Cao, Lianjie, Lantz, Bob, Yu, Minlan, Krishna, Tushar, Sharma, Puneet

This paper lays the foundation for Genie, a testing framework that captures the impact of real hardware network behavior on ML workload performance, without requiring expensive GPUs. Genie uses CPU-initiated traffic over a hardware testbed to emulate GPU to GPU communication, and adapts the ASTRA-sim simulator to model interaction between the network and the ML workload.

artificial intelligence, machine learning, workload, (19 more...)

2504.20854

Country: North America > United States (0.48)

Genre: Research Report (0.40)

Industry:

Information Technology (0.69)
Telecommunications > Networks (0.68)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Semiari, Omid, Nikopour, Hosein, Talwar, Shilpa

Graph Reinforcement Learning for QoS-Aware Load Balancing in Open Radio Access Networks

Next-generation wireless cellular networks are expected to provide unparalleled Quality-of-Service (QoS) for emerging wireless applications, necessitating strict performance guarantees, e.g., in terms of link-level data rates. A critical challenge in meeting these QoS requirements is the prevention of cell congestion, which involves balancing the load to ensure sufficient radio resources are available for each cell to serve its designated User Equipments (UEs). In this work, a novel QoS-aware Load Balancing (LB) approach is developed to optimize the performance of Guaranteed Bit Rate (GBR) and Best Effort (BE) traffic in a multi-band Open Radio Access Network (O-RAN) under QoS and resource constraints. The proposed solution builds on Graph Reinforcement Learning (GRL), a powerful framework at the intersection of Graph Neural Network (GNN) and RL. The QoS-aware LB is modeled as a Markov Decision Process, with states represented as graphs. QoS considerations are integrated into both state representations and reward signal design. The LB agent is then trained using an off-policy dueling Deep Q Network (DQN) that leverages a GNN-based architecture. This design ensures the LB policy is invariant to the ordering of nodes (UE or cell), flexible in handling various network sizes, and capable of accounting for spatial node dependencies in LB decisions. Performance of the GRL-based solution is compared with two baseline methods. Results show substantial performance gains, including a 53% reduction in QoS violations and a fourfold increase in the 5th percentile rate for BE traffic.
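For readers unfamiliar with the dueling DQN mentioned in this abstract, the sketch below shows how a dueling head is commonly combined into Q-values: a scalar state value plus mean-centred per-action advantages. The abstract does not say which aggregation the authors use, so the mean-subtracted form is an assumption here. The scalar inputs stand in for the outputs of the GNN-based network, which is not reproduced, and `dueling_q_values` is a hypothetical name.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]

# Hypothetical outputs of the value and advantage heads for three candidate
# load-balancing actions (e.g. three possible cell reassignments).
q = dueling_q_values(2.0, [1.0, 0.0, -1.0])
greedy_action = q.index(max(q))
```

Centring the advantages makes the average Q-value equal the state value, so the two heads are identifiable. The greedy action depends only on the advantage ordering.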

machine learning, reinforcement learning, traffic, (16 more...)

2504.19499

Genre: Research Report (1.00)

Industry:

Telecommunications (1.00)
Energy > Power Industry (0.64)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Hsu, Cyril Shih-Huan, Dalgkitsis, Anestis, Papagianni, Chrysa, Grosso, Paola

Transformer-Empowered Actor-Critic Reinforcement Learning for Sequence-Aware Service Function Chain Partitioning

In the forthcoming era of 6G networks, characterized by unprecedented data rates, ultra-low latency, and extensive connectivity, effective management of Virtualized Network Functions (VNFs) is essential. VNFs are software-based counterparts of traditional hardware devices that facilitate flexible and scalable service provisioning. Service Function Chains (SFCs), structured as ordered sequences of VNFs, are pivotal in orchestrating complex network services. Nevertheless, partitioning SFCs across multi-domain network infrastructures presents substantial challenges due to stringent latency constraints and limited resource availability. Conventional optimization-based methods typically exhibit low scalability, whereas existing data-driven approaches often fail to adequately balance computational efficiency with the capability to effectively account for dependencies inherent in SFCs. To overcome these limitations, we introduce a Transformer-empowered actor-critic framework specifically designed for sequence-aware SFC partitioning. By utilizing the self-attention mechanism, our approach effectively models complex inter-dependencies among VNFs, facilitating coordinated and parallelized decision-making processes. Additionally, we enhance training stability and convergence using the ε-LoPe exploration strategy as well as Asymptotic Return Normalization. Comprehensive simulation results demonstrate that the proposed methodology outperforms existing state-of-the-art solutions in terms of long-term acceptance rates, resource utilization efficiency, and scalability, while achieving rapid inference. This study not only advances intelligent network orchestration by delivering a scalable and robust solution for SFC partitioning within emerging 6G environments, but also bridges recent advancements in Large Language Models (LLMs) with the optimization of next-generation networks.

large language model, machine learning, reinforcement learning, (24 more...)

2504.18902

Country: Europe > Netherlands (0.14)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.48)

Industry:

Leisure & Entertainment (0.68)
Telecommunications > Networks (0.46)
Information Technology > Networks (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

Simeone, Osvaldo, Park, Sangwoo, Zecchin, Matteo

Conformal Calibration: Ensuring the Reliability of Black-Box AI in Wireless Systems

AI is poised to revolutionize telecommunication networks by boosting efficiency, automation, and decision-making. However, the black-box nature of most AI models introduces substantial risk, possibly deterring adoption by network operators. These risks are not addressed by the current prevailing deployment strategy, which typically follows a best-effort train-and-deploy paradigm. This paper reviews conformal calibration, a general framework that moves beyond the state of the art by adopting computationally lightweight, advanced statistical tools that offer formal reliability guarantees without requiring further training or fine-tuning. Conformal calibration encompasses pre-deployment calibration via uncertainty quantification or hyperparameter selection; online monitoring to detect and mitigate failures in real time; and counterfactual post-deployment performance analysis to address "what if" diagnostic questions after deployment. By weaving conformal calibration into the AI model lifecycle, network operators can establish confidence in black-box AI models as a dependable enabling technology for wireless systems.

A. Motivation

Next-generation wireless networks are expected to leverage AI for tasks ranging from physical-layer processing to resource management. Initiatives like O-RAN exemplify this trend by defining open network architectures that enable data-driven control at different time scales via modular AI applications [1]. While AI promises improved efficiency and flexibility, most AI apps function as black boxes, raising significant reliability concerns. These reliability concerns may make operators hesitant to cede network functionalities to black-box systems without additional safeguards.
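Split conformal prediction is one standard, well-known way to calibrate uncertainty for a black-box model before deployment, and it is offered here only as an illustration of the general idea. The sketch does not reproduce the specific methods reviewed in the paper. The calibration scores, class labels, and function names are hypothetical, and the nonconformity score is assumed to be one minus the model's probability for a label.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    If that rank exceeds n, the threshold is infinite (the set must contain
    every label to keep the 1 - alpha coverage guarantee).
    """
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(cal_scores)[k - 1]

def prediction_set(class_probs, threshold):
    """Keep every label whose nonconformity score 1 - p is within the threshold."""
    return [label for label, p in class_probs.items() if 1.0 - p <= threshold]

# Hypothetical nonconformity scores from a held-out calibration set.
cal_scores = [0.1, 0.5, 0.2, 0.4, 0.3, 0.6, 0.7]
threshold = conformal_threshold(cal_scores, alpha=0.25)

# Hypothetical black-box outputs for two new inputs.
confident = prediction_set({"handover": 0.70, "stay": 0.25, "idle": 0.05}, threshold)
uncertain = prediction_set({"handover": 0.50, "stay": 0.45, "idle": 0.05}, threshold)
```

The model itself is never retrained: only its outputs on calibration data are used. This matches the point in the abstract that the guarantees come without further training or fine-tuning. A larger prediction set signals lower confidence on that input.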

artificial intelligence, conformal prediction, prediction, (18 more...)

2504.0931

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Air (1.00)
Telecommunications (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence (1.00)

Akmeemana, Lahiru, Attanayake, Chamodya, Faiz, Husni, Wickramanayake, Sandareka

GAL-MAD: Towards Explainable Anomaly Detection in Microservice Applications Using Graph Attention Networks

The transition to microservices has revolutionized software architectures, offering enhanced scalability and modularity. However, the distributed and dynamic nature of microservices introduces complexities in ensuring system reliability, making anomaly detection crucial for maintaining performance and functionality. Anomalies stemming from network and performance issues must be swiftly identified and addressed. Existing anomaly detection techniques often rely on statistical models or machine learning methods that struggle with the high-dimensional, interdependent data inherent in microservice applications. Current techniques and available datasets predominantly focus on system traces and logs, limiting their ability to support advanced detection models. This paper addresses these gaps by introducing the RS-Anomic dataset generated using the open-source RobotShop microservice application. The dataset captures multivariate performance metrics and response times under normal and anomalous conditions, encompassing ten types of anomalies. We propose a novel anomaly detection model called Graph Attention and LSTM-based Microservice Anomaly Detection (GAL-MAD), leveraging Graph Attention and Long Short-Term Memory architectures to capture spatial and temporal dependencies in microservices. We utilize SHAP values to localize anomalous services and identify root causes to enhance explainability. Experimental results demonstrate that GAL-MAD outperforms state-of-the-art models on the RS-Anomic dataset, achieving higher accuracy and recall across varying anomaly rates. The explanations provide actionable insights into service anomalies, which benefits system administrators.

anomaly, data mining, machine learning, (16 more...)

2504.00058

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Telecommunications (0.69)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Apr-28-2025

Boxi: Design Decisions in the Context of Algorithmic Performance for Robotics

Frey, Jonas, Tuna, Turcan, Fu, Lanke Frank Tarimo, Weibel, Cedric, Patterson, Katharine, Krummenacher, Benjamin, Müller, Matthias, Nubert, Julian, Fallon, Maurice, Cadena, Cesar, Hutter, Marco

Achieving robust autonomy in mobile robots operating in complex and unstructured environments requires a multimodal sensor suite capable of capturing diverse and complementary information. However, designing such a sensor suite involves multiple critical design decisions, such as sensor selection, component placement, thermal and power limitations, compute requirements, networking, synchronization, and calibration. While the importance of these key aspects is widely recognized, they are often overlooked in academia or retained as proprietary knowledge within large corporations. To improve this situation, we present Boxi, a tightly integrated sensor payload that enables robust autonomy of robots in the wild. This paper discusses the impact of payload design decisions made to optimize algorithmic performance for downstream tasks, specifically focusing on state estimation and mapping. Boxi is equipped with a variety of sensors: two LiDARs, 10 RGB cameras including high-dynamic range, global shutter, and rolling shutter models, an RGB-D camera, 7 inertial measurement units (IMUs) of varying precision, and a dual antenna RTK GNSS system. Our analysis shows that time synchronization, calibration, and sensor modality have a crucial impact on the state estimation performance. We frame this analysis in the context of cost considerations and environment-specific challenges. We also present a mobile sensor suite `cookbook` to serve as a comprehensive guideline, highlighting generalizable key design considerations and lessons learned during the development of Boxi. Finally, we demonstrate the versatility of Boxi being used in a variety of applications in real-world scenarios, contributing to robust autonomy. More details and code: https://github.com/leggedrobotics/grand_tour_box

artificial intelligence, calibration, dataset, (17 more...)

2504.185

Country: Europe > Switzerland (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (1.00)
Transportation > Ground (0.67)
Telecommunications (0.67)
Energy (0.67)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (0.66)