FedMMKT: Co-Enhancing a Server Text-to-Image Model and Client Task Models in Multi-Modal Federated Learning

He, Ningxin, Liu, Yang, Sun, Wei, Ye, Xiaozhou, Ouyang, Ye, Gao, Tiegang, Zhang, Zehui

Oct-15-2025–arXiv.org Artificial Intelligence

Abstract: Text-to-Image (T2I) models have demonstrated their versatility in a wide range of applications. However, adaptation of T2I models to specialized tasks is often limited by the availability of task-specific data due to privacy concerns. On the other hand, harnessing the power of rich multimodal data from modern mobile systems and IoT infrastructures presents a great opportunity.

Text-to-Image (T2I) models such as GLIDE [1], DALL-E-2 [2], and Stable Diffusion [3] have seen rapid development across various application domains. Recent work in multimodal federated learning (FL) explores the integration of diverse modalities from decentralized clients to train a global multimodal model [25]-[27].



