Mitigating Modality Quantity and Quality Imbalance in Multimodal Online Federated Learning

Wang, Heqiang, Yang, Weihong, Zhong, Xiaoxiong, Zhou, Jia, Liu, Fangming, Zhang, Weizhe

arXiv.org Artificial Intelligence 

The Internet of Things (IoT) ecosystem produces massive volumes of multimodal data from diverse sources, including sensors, cameras, and microphones. With advances in edge intelligence, IoT devices have evolved from simple data acquisition units into computationally capable nodes, enabling localized processing of heterogeneous multimodal data. This evolution necessitates distributed learning paradigms that can efficiently handle such data. Furthermore, the continuous nature of data generation and the limited storage capacity of edge devices demand an online learning framework. Multimodal Online Federated Learning (MMO-FL) has emerged as a promising approach to meet these requirements. However, MMO-FL faces new challenges due to the inherent instability of IoT devices, which often results in modality quantity and quality imbalance (QQI) during data collection. In this work, we systematically investigate the impact of QQI within the MMO-FL framework and present a comprehensive theoretical analysis quantifying how both types of imbalance degrade learning performance. To address these challenges, we propose the Modality Quantity and Quality Rebalanced (QQR) algorithm, a prototype learning based method designed to operate in parallel with the training process. Extensive experiments on two real-world multimodal datasets show that the proposed QQR algorithm consistently outperforms benchmarks under modality imbalance conditions with promising learning performance.

The rapid growth of the Internet of Things (IoT) [1] has resulted in an extraordinary increase in data generated by diverse interconnected devices, such as smart home systems [2], wearable health trackers [3], and industrial sensors [4]. To enable intelligent applications and services within this ecosystem, artificial intelligence, particularly machine learning and deep learning, has become an essential approach for building models from large-scale IoT data.
Traditionally, model training has been conducted on centralized cloud platforms or in data centers. However, as both the volume of IoT data and the number of connected devices continue to rise, this centralized paradigm encounters scalability and efficiency bottlenecks.

H. Wang, W. Yang, X. Zhong, J. Zhou, F. Liu, and W. Zhang are with Peng Cheng Laboratory, Shenzhen, 518066, China.