Synesthesia of Machines (SoM)-Based Task-Driven MIMO System for Image Transmission
Li, Sijiang, Zhang, Rongqing, Cheng, Xiang, Tang, Jian
arXiv.org Artificial Intelligence
To support cooperative perception (CP) among networked mobile agents in dynamic scenarios, efficient and robust transmission of sensory data is a critical challenge. Deep learning-based joint source-channel coding (JSCC) has demonstrated promising results for image transmission under adverse channel conditions, outperforming traditional rule-based codecs. While recent works have explored combining JSCC with the widely adopted multiple-input multiple-output (MIMO) technology, these approaches are still limited to the discrete-time analog transmission (DTAT) model and to simple tasks. Given the limited performance of existing MIMO JSCC schemes in supporting complex CP tasks for networked mobile agents with digital MIMO communication systems, this paper presents a Synesthesia of Machines (SoM)-based task-driven MIMO system for image transmission, referred to as SoM-MIMO. By leveraging the structural properties of the feature pyramid for perceptual tasks and the channel properties of the closed-loop MIMO communication system, SoM-MIMO enables efficient and robust digital MIMO transmission of images. Experimental results show that, compared with two JSCC baseline schemes, our approach achieves average mAP improvements of 6.30 and 10.48, respectively, across all SNR levels while maintaining identical communication overhead.
In the era of beyond fifth generation (B5G) and sixth generation (6G) networks, a large number of mobile agents, such as autonomous vehicles, unmanned aerial vehicles, and humanoid robots, will interact in real time and execute diverse intelligent functions, revolutionizing industries and daily life. To enable these functions, such as decision-making and task execution, accurate environmental perception, encompassing the acquisition of object position, size, and category, is essential.
Manuscript received 24 April 2025; revised 20 July 2025; accepted 26 August 2025.
This work was supported in part by the National Natural Science Foundation of China under Grant 62125101, Grant 62341101, and Grant 62271351, and in part by the New Cornerstone Science Foundation through the XPLORER PRIZE. Rongqing Zhang is with the Intelligent Transportation Thrust, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China (email: rongqingz@tongji.edu.cn).
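The abstract contrasts discrete-time analog transmission (DTAT), where real-valued features are sent directly as channel symbols, with digital transmission, where features are first turned into bits and then modulated. The following is a minimal sketch of that distinction only, not the SoM-MIMO pipeline: the function names, the 4-bit uniform quantiser, the [-1, 1] feature range, and the choice of QPSK are all assumptions made for illustration.

```python
import math

# Illustrative only: all names and parameters below are hypothetical.

def dtat_symbols(features):
    """DTAT-style mapping: pair real features into complex symbols and
    scale them to unit average power. No quantisation, no bits."""
    pairs = [complex(features[i], features[i + 1])
             for i in range(0, len(features) - 1, 2)]
    power = sum(abs(s) ** 2 for s in pairs) / len(pairs)
    scale = math.sqrt(power) if power > 0 else 1.0
    return [s / scale for s in pairs]

def quantize(features, bits=4, lo=-1.0, hi=1.0):
    """Uniformly quantise each feature to a `bits`-bit integer index."""
    levels = (1 << bits) - 1
    return [round((min(max(x, lo), hi) - lo) / (hi - lo) * levels)
            for x in features]

def dequantize(indices, bits=4, lo=-1.0, hi=1.0):
    """Map integer indices back to reconstruction values in [lo, hi]."""
    levels = (1 << bits) - 1
    return [lo + idx / levels * (hi - lo) for idx in indices]

def to_bits(indices, bits=4):
    """Serialise indices to a flat bit list, most significant bit first."""
    return [(idx >> k) & 1 for idx in indices for k in reversed(range(bits))]

def from_bits(bitstream, bits=4):
    """Inverse of to_bits."""
    out = []
    for i in range(0, len(bitstream), bits):
        value = 0
        for b in bitstream[i:i + bits]:
            value = (value << 1) | b
        out.append(value)
    return out

def qpsk_modulate(bitstream):
    """Gray-mapped QPSK: each bit pair selects one unit-power point."""
    a = 1 / math.sqrt(2)
    return [complex(a * (1 - 2 * b0), a * (1 - 2 * b1))
            for b0, b1 in zip(bitstream[0::2], bitstream[1::2])]

def qpsk_demodulate(symbols):
    """Hard-decision demapping: the sign of each axis gives one bit."""
    bits = []
    for s in symbols:
        bits += [int(s.real < 0), int(s.imag < 0)]
    return bits

def digital_round_trip(features, noise=0j, bits=4):
    """Quantise, modulate, add a fixed perturbation, and recover features."""
    tx = qpsk_modulate(to_bits(quantize(features, bits), bits))
    rx = [s + noise for s in tx]
    return dequantize(from_bits(qpsk_demodulate(rx), bits), bits)
```

In this toy setting, a small perturbation leaves the digitally transmitted features unchanged as long as no symbol crosses a decision boundary, whereas the same perturbation would shift every DTAT symbol directly. The price of the digital path is the fixed quantisation error introduced before modulation.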
arXiv.org Artificial Intelligence
Sep-3-2025
- Country:
  - Asia
    - China
      - Beijing > Beijing (0.04)
      - Guangdong Province > Guangzhou (0.44)
      - Hong Kong (0.24)
    - Middle East > UAE
      - Dubai Emirate > Dubai (0.04)
    - Singapore > Central Region
      - Singapore (0.04)
    - South Korea > Seoul
      - Seoul (0.04)
  - Europe
    - Greece (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Switzerland > Zürich
      - Zürich (0.14)
    - United Kingdom > Scotland
      - City of Glasgow > Glasgow (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.04)
    - United States
      - Hawaii > Honolulu County
        - Honolulu (0.04)
      - Rhode Island (0.04)
      - Utah > Salt Lake County
        - Salt Lake City (0.04)
- Genre:
  - Research Report (0.64)
- Industry:
  - Energy (0.88)
  - Information Technology > Robotics & Automation (0.34)
- Technology: