online optimization algorithm
A Federated Fine-Tuning Paradigm of Foundation Models in Heterogenous Wireless Networks
Wang, Jingyi, Zhao, Zhongyuan, Wang, Qingtian, Li, Zexu, Wang, Yue, Quek, Tony Q. S.
Edge intelligence has emerged as a promising strategy to deliver low-latency and ubiquitous services for mobile devices. Recent advances in fine-tuning mechanisms for foundation models have enabled edge intelligence by integrating low-rank adaptation (LoRA) with federated learning. However, in wireless networks, device heterogeneity and resource constraints on edge devices pose serious threats to the performance of federated fine-tuning. To tackle these issues, we propose to optimize federated fine-tuning in heterogeneous wireless networks via online learning. First, a framework for switching-based federated fine-tuning in wireless networks is provided. The edge devices switch to LoRA modules dynamically for federated fine-tuning with the base station, to jointly mitigate the impact of device heterogeneity and transmission unreliability. Second, a tractable upper bound on the inference risk gap is derived through theoretical analysis. To improve the generalization capability, we formulate a non-convex mixed-integer programming problem with long-term constraints and decouple it into model switching, transmit power control, and bandwidth allocation subproblems. An online optimization algorithm is developed to solve the problems with polynomial computational complexity. Finally, simulation results on the SST-2 and QNLI datasets demonstrate performance gains in test accuracy and energy efficiency.
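As a rough illustration of why the abstract pairs LoRA with federated learning, the sketch below compares the number of trainable parameters in a full weight matrix with those in a rank-r LoRA update, and averages per-device updates weighted by sample count. It is a generic federated-averaging sketch, not the switching-based scheme proposed in the paper. The layer size (768 x 768), the rank (8), the helper names `lora_param_count` and `fedavg`, and the toy update vectors are all assumptions chosen for illustration.

```python
def lora_param_count(d_out, d_in, rank):
    """Trainable parameters in a rank-`rank` update B @ A,
    where B is d_out x rank and A is rank x d_in."""
    return rank * (d_out + d_in)


def fedavg(updates, num_samples):
    """Sample-weighted average of flat parameter vectors, one per device.

    `updates` is a list of equal-length lists of floats (hypothetical
    flattened LoRA parameters); `num_samples` holds each device's
    local dataset size.
    """
    total = sum(num_samples)
    dim = len(updates[0])
    return [
        sum(n * u[i] for u, n in zip(updates, num_samples)) / total
        for i in range(dim)
    ]


# Assumed layer shape and rank, for illustration only.
full = 768 * 768
lora = lora_param_count(768, 768, rank=8)
print(full, lora, full // lora)

# Two hypothetical devices holding 1 and 3 local samples.
device_updates = [[1.0, 2.0], [3.0, 6.0]]
avg = fedavg(device_updates, [1, 3])
print(avg)
```

Under these assumed sizes, each device uploads far fewer parameters per layer than full fine-tuning would require, which is what makes the approach attractive on bandwidth-limited links. Note that averaging the factors A and B separately is not the same as averaging the products B @ A, so a practical aggregation rule needs more care than this sketch shows.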
Reviews: Acceleration through Optimistic No-Regret Dynamics
This paper shows that Nesterov's accelerated gradient descent algorithms can be interpreted as computing a saddle point via online optimization algorithms. A convex optimization problem is transformed into a min-max problem via the Fenchel dual, and the solution of that problem is then approximated via online optimization algorithms. This paper can be a significant contribution to the optimization community. I would say that this is one of the most natural interpretations of Nesterov's accelerated gradient methods. The use of weighted regrets and (optimistic) Follow-the-Leader (instead of Follow-the-Regularized-Leader) is a little artificial but acceptable.
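The reduction the review describes can be sketched as follows. This is the standard Fenchel-duality argument under the assumption that f is closed and convex. The weights \alpha_t, the averaged iterate \bar{x}_T, and the regret bounds \epsilon_x, \epsilon_y are notation introduced here for illustration and are not taken from the review.

```latex
% Fenchel conjugate and the resulting min-max reformulation
f^*(y) = \sup_x \{ \langle x, y \rangle - f(x) \}, \qquad
f(x) = \max_y \{ \langle x, y \rangle - f^*(y) \}

\min_x f(x) = \min_x \max_y \; g(x, y), \qquad
g(x, y) := \langle x, y \rangle - f^*(y)

% Weighted average of the x-player's iterates
\bar{x}_T = \frac{\sum_{t=1}^{T} \alpha_t x_t}{\sum_{t=1}^{T} \alpha_t}

% If the x-player and y-player have weighted average regret at most
% \epsilon_x and \epsilon_y respectively, then
f(\bar{x}_T) - \min_x f(x) \le \epsilon_x + \epsilon_y
```

The last line follows because g is linear in x and concave in y, so the weighted average payoff is bounded below by f(\bar{x}_T) - \epsilon_y and above by \min_x f(x) + \epsilon_x. Faster-decaying regret for the two online algorithms therefore translates directly into a faster convergence rate for the original problem.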