Agentic AI-Empowered Conversational Embodied Intelligence Networks in 6G

Mingkai Chen, Zijie Feng, Lei Wang, Yaser Khamayseh

arXiv.org Artificial Intelligence 

Abstract--In the 6G era, semantic collaboration among multiple embodied intelligent devices (MEIDs) is becoming a key capability for complex task execution. However, existing systems still face challenges in multimodal information fusion, adaptive communication, and decision interpretability, which hinder efficient collaboration in dynamic environments. To address this, we propose a Collaborative Conversational Embodied Intelligence Network (CC-EIN) framework that integrates multimodal feature fusion, adaptive semantic communication, task coordination, and interpretability. An adaptive semantic communication strategy dynamically adjusts coding schemes, compression ratios, and transmission power according to task urgency and channel conditions, improving spectrum efficiency under bandwidth constraints. A semantic-driven collaboration mechanism decomposes and allocates tasks through a shared knowledge base, enabling drones, autonomous vehicles, and robot dogs to cooperate effectively while avoiding conflicts. Finally, decision visualization using Gradient-weighted Class Activation Mapping (Grad-CAM) highlights agents' focus areas during decision-making, enhancing transparency and trust. Simulations show that the proposed framework achieves a 95.4% task completion rate (TCR) and 95% transmission efficiency (TE) in post-earthquake rescue scenarios, with significant advantages in semantic consistency (SC) and energy-adaptive performance.

Index Terms--semantic collaboration, embodied intelligent devices, adaptive communication, multimodal feature fusion, interpretability.
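The adaptation rule described in the abstract (adjusting compression ratio and transmission power by task urgency and channel conditions) can be illustrated with a minimal sketch. The function, its SNR normalization, and all thresholds below are assumptions for illustration, not the paper's actual algorithm:

```python
# Hypothetical sketch of an urgency- and channel-aware adaptation rule.
# NOT the CC-EIN algorithm: a stand-in mapping from task urgency (0..1)
# and channel SNR (dB) to a compression ratio and transmit power.

def adapt_transmission(urgency, snr_db,
                       min_ratio=0.1, max_ratio=0.9,
                       p_min=0.1, p_max=1.0):
    """Return (compression_ratio, tx_power).

    Higher urgency or a worse channel -> retain more semantic detail
    (lower compression ratio) and spend more transmit power.
    """
    if not 0.0 <= urgency <= 1.0:
        raise ValueError("urgency must be in [0, 1]")
    # Normalize SNR to a channel-quality score in [0, 1];
    # 0 dB is treated as poor, 30 dB as excellent (assumed scale).
    quality = min(max(snr_db / 30.0, 0.0), 1.0)
    # A good channel and a low-urgency task allow aggressive compression.
    ratio = min_ratio + (max_ratio - min_ratio) * (1.0 - urgency) * quality
    # Urgent tasks on poor channels are granted more power.
    power = p_min + (p_max - p_min) * urgency * (1.0 - quality)
    return ratio, power
```

Under this toy rule, an urgent task on a bad channel gets minimal compression and maximal power, while a routine task on a clean channel is compressed aggressively at low power, matching the bandwidth-efficiency goal stated in the abstract.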