communication task
Semantic-Driven AI Agent Communications: Challenges and Solutions
Yu, Kaiwen, Sun, Mengying, Qin, Zhijin, Xu, Xiaodong, Yang, Ping, Xiao, Yue, Wu, Gang
With the rapid growth of intelligent services, communication targets are shifting from humans to artificial intelligence (AI) agents, which require new paradigms to enable real-time perception, decision-making, and collaboration. Semantic communication, which conveys task-relevant meaning rather than raw data, offers a promising solution. However, its practical deployment remains constrained by dynamic environments and limited resources. To address these issues, this article proposes a semantic-driven AI agent communication framework and develops three enabling techniques. First, semantic adaptation transmission applies fine-tuning with real or generative samples to efficiently adapt models to varying environments. Second, semantic lightweight transmission incorporates pruning, quantization, and perception-aware sampling to reduce model complexity and alleviate the computational burden on edge agents. Third, semantic self-evolution control employs distributed hierarchical decision-making to optimize multi-dimensional resources, enabling robust multi-agent collaboration in dynamic environments. Simulation results show that the proposed solutions achieve faster convergence and stronger robustness, while the proposed distributed hierarchical optimization method significantly outperforms conventional decision-making schemes, highlighting its potential for AI agent communication networks.
From Large AI Models to Agentic AI: A Tutorial on Future Intelligent Communications
Jiang, Feibo, Pan, Cunhua, Dong, Li, Wang, Kezhi, Dobre, Octavia A., Debbah, Merouane
With the advent of 6G communications, intelligent communication systems face multiple challenges, including constrained perception and response capabilities, limited scalability, and low adaptability in dynamic environments. This tutorial provides a systematic introduction to the principles, design, and applications of Large Artificial Intelligence Models (LAMs) and Agentic AI technologies in intelligent communication systems, aiming to offer researchers a comprehensive overview of cutting-edge technologies and practical guidance. First, we outline the background of 6G communications, review the technological evolution from LAMs to Agentic AI, and clarify the tutorial's motivation and main contributions. Subsequently, we present a comprehensive review of the key components required for constructing LAMs. We further categorize LAMs and analyze their applicability, covering Large Language Models (LLMs), Large Vision Models (LVMs), Large Multimodal Models (LMMs), Large Reasoning Models (LRMs), and lightweight LAMs. Next, we propose a LAM-centric design paradigm tailored for communications, encompassing dataset construction and both internal and external learning approaches. Building upon this, we develop an LAM-based Agentic AI system for intelligent communications, clarifying its core components such as planners, knowledge bases, tools, and memory modules, as well as its interaction mechanisms. We also introduce a multi-agent framework with data retrieval, collaborative planning, and reflective evaluation for 6G. Subsequently, we provide a detailed overview of the applications of LAMs and Agentic AI in communication scenarios. Finally, we summarize the research challenges and future directions in current studies, aiming to support the development of efficient, secure, and sustainable next-generation intelligent communication systems.
Automatic Operator-level Parallelism Planning for Distributed Deep Learning -- A Mixed-Integer Programming Approach
She, Ruifeng, Pang, Bowen, Li, Kai, Liu, Zehua, Zhong, Tao
As the artificial intelligence community advances into the era of large models with billions of parameters, distributed training and inference have become essential. While various parallelism strategies (data, model, sequence, and pipeline) have been successfully implemented for popular neural networks on mainstream hardware, optimizing the distributed deployment schedule still requires extensive expertise and manual effort. Furthermore, while existing frameworks handle simple chain-like structures well, they struggle with complex non-linear architectures: mixture-of-experts and multi-modal models feature intricate MIMO and branch-rich topologies that require fine-grained operator-level parallelization beyond the capabilities of existing frameworks. We propose formulating parallelism planning as a scheduling optimization problem using mixed-integer programming, and develop a bi-level solution framework that balances optimality with computational efficiency, automatically generating effective distributed plans that capture both the heterogeneous structure of modern neural networks and the underlying hardware constraints. In experiments comparing against expert-designed strategies such as DeepSeek's DualPipe, our framework achieves comparable or superior performance, reducing computational bubbles by half under the same memory constraints. The framework's versatility extends beyond throughput optimization to incorporate hardware utilization maximization, memory capacity constraints, and other considerations or potential strategies. These capabilities position our solution as both a valuable research tool for exploring optimal parallelization strategies and a practical industrial solution for large-scale AI deployment.
Interview with Raffaele Galliera: Deep reinforcement learning for communication networks
The AAAI/SIGAI Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. This year, 30 students were selected for this programme, and we've been meeting them and talking about their research. In this interview, Raffaele Galliera tells us about his work on deep reinforcement learning for communication networks. My name is Raffaele Galliera and I'm a PhD student in the Intelligent Systems and Robotics program at the University of West Florida, located in Pensacola. It's a joint program between the University of West Florida and the Institute for Human and Machine Cognition (IHMC), which is a nonprofit organization based in Pensacola.
Communication Optimization for Distributed Training: Architecture, Advances, and Opportunities
Wei, Yunze, Hu, Tianshuo, Liang, Cong, Cui, Yong
The past few years have witnessed the flourishing of large-scale deep neural network models with ever-growing numbers of parameters. Training such large-scale models typically requires memory and computing resources that exceed those of a single GPU, necessitating distributed training. As GPU performance has rapidly improved in recent years, computation time has shrunk, increasing the proportion of communication in the overall training time. Optimizing communication for distributed training has therefore become an urgent issue. In this article, we briefly introduce the general architecture of distributed deep neural network training and analyze the relationships among the Parallelization Strategy, Collective Communication Library, and Network from the perspective of communication optimization, which together form a three-layer paradigm. We then review representative research advances within this three-layer paradigm. We find that the layers in the current paradigm are relatively independent, yet there is a rich design space for cross-layer collaborative optimization in distributed training scenarios. We therefore advocate a communication-efficient five-layer paradigm that underlines opportunities for collaborative design, and we outline the perspectives of "Vertical", "Horizontal", "Intra-Inter", and "Host-Net" collaboration designs. We hope this article sheds some light on future research on communication optimization for distributed training.
DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining
Zhang, Lin, Shi, Shaohuai, Chu, Xiaowen, Wang, Wei, Li, Bo, Liu, Chengjian
Communication scheduling has been shown to be effective in accelerating distributed training, as it enables all-reduce communications to be overlapped with backpropagation computations, and it has been widely adopted in popular distributed deep learning frameworks. However, two fundamental problems remain: (1) excessive startup latency, proportional to the number of workers, for each all-reduce operation; and (2) sub-optimal training performance due to the dependency and synchronization requirements of the feed-forward computation in the next iteration. We propose a novel scheduling algorithm, DeAR, that decouples the all-reduce primitive into two continuous operations, which overlap with both backpropagation and feed-forward computations without extra communication. We further design a practical tensor fusion algorithm to improve training performance. Experimental results with five popular models show that DeAR achieves up to 83% and 15% training speedup over state-of-the-art solutions on a 64-GPU cluster with 10Gb/s Ethernet and 100Gb/s InfiniBand interconnects, respectively.
On Optimizing the Communication of Model Parallelism
Zhuang, Yonghao, Zhao, Hexu, Zheng, Lianmin, Li, Zhuohan, Xing, Eric P., Ho, Qirong, Gonzalez, Joseph E., Stoica, Ion, Zhang, Hao
We study a novel and important communication pattern in large-scale model-parallel deep learning (DL), which we call cross-mesh resharding. This pattern emerges when the two paradigms of model parallelism - intra-operator and inter-operator parallelism - are combined to support large models on large clusters. In cross-mesh resharding, a sharded tensor needs to be sent from a source device mesh to a destination device mesh, on which the tensor may be distributed with the same or different layouts. We formalize this as a many-to-many multicast communication problem, and show that existing approaches either are sub-optimal or do not generalize to different network topologies or tensor layouts, which result from different model architectures and parallelism strategies. We then propose two contributions to address cross-mesh resharding: an efficient broadcast-based communication system, and an "overlapping-friendly" pipeline schedule. On microbenchmarks, our overall system outperforms existing ones by up to 10x across various tensor and mesh layouts. On end-to-end training of two large models, GPT-3 and U-Transformer, we improve throughput by 10% and 50%, respectively.
Peters
When to send system-mediated interruptions within collaborative multi-human-machine environments has been widely debated in the development of interruption management systems. However, prior studies do not address when to send interruptions in multi-user, multitasking scenarios, nor do they identify predictors of interruptibility within communication tasks. This paper addresses the problem of predicting interruptibility in these interactions, with special attention to two predictors: which users are engaged in which tasks (task engagement) and where users are within the current task (task structure). Using natural human speech from these interactions, we model task engagement and task structure to predict candidate points for interruption. The motivation for these models and their performance in a multi-user, multitasking environment are discussed as proposals for developing communication interruption management systems. To model task structure, we propose a task breakpoint model that achieves 90% accuracy on a multi-user, multitasking dataset. Integrating this task breakpoint model into a real-time interaction yields an average accuracy of 93% when combined with a rule-based model. To determine task engagement, i.e., the current task in which users are engaged, a proposed task topic model achieves an accuracy between 76% and 88%, depending on the topic within the dataset.
Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks
Shi, Shaohuai, Zhang, Lin, Li, Bo
Distributed training with synchronous stochastic gradient descent (SGD) on GPU clusters has been widely used to accelerate the training of deep models. However, SGD utilizes only the first-order gradient in model parameter updates, so training may take days or weeks. Recent studies have successfully exploited approximate second-order information to speed up training, among which the Kronecker-Factored Approximate Curvature (KFAC) method emerges as one of the most efficient approximation algorithms for training deep models. Yet, when leveraging GPU clusters to train models with distributed KFAC (D-KFAC), each iteration incurs extensive computation and introduces extra communication. In this work, we propose smart parallelism of computing and communication tasks for D-KFAC (SPD-KFAC) to reduce the iteration time. Specifically, 1) we first characterize the performance bottlenecks of D-KFAC, 2) we design and implement a pipelining mechanism for Kronecker factor computation and communication with dynamic tensor fusion, and 3) we develop a load-balanced placement for inverting multiple matrices on GPU clusters. We conduct real-world experiments on a 64-GPU cluster with a 100Gb/s InfiniBand interconnect. Experimental results show that our proposed SPD-KFAC training scheme achieves 10%-35% improvement over state-of-the-art algorithms.
How Chatbots Are About To Change Communication
If you haven't heard of chatbots yet -- or your experience is limited to novelty programs like Cleverbot -- chances are you'll be seeing more of them in the coming years. Because companies are slowly starting to leverage chatbots as a way to manage basic communication tasks that used to belong solidly to the realm of human capabilities. In this piece, Hristo Borisov, the Director of Product Management at Progress, helps illuminate what chatbots are, how to build them, and their role in the future of business. In short, chatbots are robots programmed to respond like humans. According to Borisov's definition, "A chatbot is a computer program that is capable of having a human-like conversation with a user by receiving and sending text messages for the purpose of automating a business process."