Outsourcing Training without Uploading Data via Efficient Collaborative Open-Source Sampling
–Neural Information Processing Systems
As deep learning blooms with growing demand for computation and data resources, outsourcing model training to a powerful cloud server becomes an attractive alternative to training at a low-power and cost-effective end device. Traditional outsourcing requires uploading device data to the cloud server, which can be infeasible in many real-world applications due to the often sensitive nature of the collected data and the limited communication bandwidth. To tackle these challenges, we propose to leverage widely available open-source data, which is a massive dataset collected from public and heterogeneous sources (e.g., Internet images). We develop a novel strategy called Efficient Collaborative Open-source Sampling (ECOS) to construct a proximal proxy dataset from open-source data for cloud training, in lieu of client data. ECOS probes open-source data on the cloud server to sense the distribution of client data via a communication- and computation-efficient sampling process, which only communicates a few compressed public features and client scalar responses.
Neural Information Processing Systems
Jan-15-2025, 16:35:31 GMT
- Technology:
- Information Technology
- Artificial Intelligence > Machine Learning (0.43)
- Cloud Computing (1.00)
- Software (1.00)
- Information Technology