Backpropagation-Free Multi-modal On-Device Model Adaptation via Cloud-Device Collaboration
Wei Ji, Li Li, Zheqi Lv, Wenqiao Zhang, Mengze Li, Zhen Wan, Wenqiang Lei, Roger Zimmermann
These devices serve as data collection powerhouses, continuously amassing vast repositories of personalized multi-modal data spanning input modalities such as text, images, and videos. The potential locked within this continuously arriving trove of multi-modal data is immense, promising high-quality, tailored device-aware services for individual users. Despite this promise, personalized device service requires analyzing the dynamic nature of the multi-modal data that underscores users' intentions. Prevailing artificial intelligence (AI) systems, primarily trained and deployed in cloud-based environments, face a profound challenge in adapting to dynamic device data when a single static cloud model serves all individual users, mainly due to the distribution shift between cloud and device data, as shown in Figure 1. In other words, high-quality personalized service requires AI systems to undergo continual refinement and adaptation to accommodate the evolving landscape of personalized multi-modal data. Intuitively, a straightforward adaptation strategy is to fine-tune the cloud model on the device's multi-modal data, which can partially alleviate the cloud-device data distribution shift and better model users' intentions. Nevertheless, we contend that the fine-tuning-adaptation (FTA) paradigm may not satisfactorily resolve device model personalization, for two key reasons: (1) Undesirable Annotation.
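To make the critiqued baseline concrete, below is a minimal sketch of the fine-tuning-adaptation (FTA) paradigm the abstract argues against: a cloud-pretrained model updated on-device with labeled local data via backpropagation. This is an illustrative assumption of the setup, not the paper's actual method or code; the model shape, dataset, and hyperparameters are all hypothetical stand-ins.

```python
# Hypothetical FTA baseline sketch (PyTorch). Assumes fused multi-modal
# features arrive as a single 512-d vector per sample; none of these names
# or sizes come from the paper.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for a cloud-pretrained prediction head.
cloud_model = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 10))

# Synthetic "device" data; real FTA would need per-user labels, which is
# exactly the "Undesirable Annotation" burden the abstract points to.
device_data = TensorDataset(torch.randn(256, 512), torch.randint(0, 10, (256,)))
loader = DataLoader(device_data, batch_size=32, shuffle=True)

optimizer = torch.optim.SGD(cloud_model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):  # on-device fine-tuning loop
    for features, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(cloud_model(features), labels)
        loss.backward()   # backprop: the on-device compute cost FTA incurs,
        optimizer.step()  # which a backpropagation-free scheme avoids
```

Both drawbacks named in the title and abstract show up here: the loop needs labeled device data, and it needs gradient computation on resource-constrained hardware.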
arXiv.org Artificial Intelligence
May-21-2024
- Genre:
- Research Report > Promising Solution (0.68)
- Industry:
- Information Technology > Services (0.48)