mfl
Few-Shot Test-Time Optimization Without Retraining for Semiconductor Recipe Generation and Beyond
Gu, Shangding, Ying, Donghao, Jin, Ming, Lu, Yu Joe, Wang, Jun, Lavaei, Javad, Spanos, Costas
We introduce Model Feedback Learning (MFL), a novel test-time optimization framework for optimizing inputs to pre-trained AI models or deployed hardware systems without requiring any retraining of the models or modifications to the hardware. In contrast to existing methods that rely on adjusting model parameters, MFL leverages a lightweight reverse model to iteratively search for optimal inputs, enabling efficient adaptation to new objectives under deployment constraints. This framework is particularly advantageous in real-world settings, such as semiconductor manufacturing recipe generation, where modifying deployed systems is often infeasible or cost-prohibitive. We validate MFL on semiconductor plasma etching tasks, where it achieves target recipe generation in just five iterations, significantly outperforming both Bayesian optimization and human experts. Beyond semiconductor applications, MFL also demonstrates strong performance in chemical processes (e.g., chemical vapor deposition) and electronic systems (e.g., wire bonding), highlighting its broad applicability. Additionally, MFL incorporates stability-aware optimization, enhancing robustness to process variations and surpassing conventional supervised learning and random search methods in high-dimensional control settings. By enabling few-shot adaptation, MFL provides a scalable and efficient paradigm for deploying intelligent control in real-world environments.
Encoding architecture algebra
Bersier, Stephane, Chen-Lin, Xinyi
There is growing awareness of the importance of designing model architectures that capture and respect the distinct structure of input data. Many successful deep learning architectures, such as transformers [1], convolutional neural networks (CNNs) [2], graph neural networks (GNNs) [3], and recurrent neural networks (RNNs) [4], inherently incorporate aspects of data structure. Ongoing research focuses on refining existing architectures, as well as designing new ones for other types of structured data. For instance, DeepSets [5] are tailored to process sets, group and gauge equivariant CNNs [6][7] respect both global and local symmetries in the data, and strongly-typed RNNs [8] incorporate explicit types within recurrent networks. By accounting for the structure of the input data, these model architectures exhibit improved performance, better generalization with fewer parameters, and enhanced interpretability.
Cross-Modal Prototype based Multimodal Federated Learning under Severely Missing Modality
Le, Huy Q., Thwal, Chu Myaet, Qiao, Yu, Tun, Ye Lin, Nguyen, Minh N. H., Hong, Choong Seon
Multimodal federated learning (MFL) has emerged as a decentralized machine learning paradigm, allowing multiple clients with different modalities to collaborate on training a machine learning model across diverse data sources without sharing their private data. However, challenges such as data heterogeneity and severely missing modalities pose crucial hindrances to the robustness of MFL, significantly impacting the performance of the global model. The absence of a modality introduces misalignment during the local training phase, stemming from zero-filling in the case of clients with missing modalities. Consequently, achieving robust generalization in the global model becomes imperative, especially when dealing with clients that have incomplete data. In this paper, we propose Multimodal Federated Cross Prototype Learning (MFCPL), a novel approach for MFL under severely missing modalities by conducting the complete prototypes to provide diverse modality knowledge at the modality-shared level with cross-modal regularization and at the modality-specific level with a cross-modal contrastive mechanism. Additionally, our approach introduces cross-modal alignment to provide regularization for modality-specific features, thereby enhancing overall performance, particularly in scenarios involving severely missing modalities. Through extensive experiments on three multimodal datasets, we demonstrate the effectiveness of MFCPL in mitigating these challenges and improving the overall performance.
Client-wise Modality Selection for Balanced Multi-modal Federated Learning
Fan, Yunfeng, Xu, Wenchao, Wang, Haozhao, Ruan, Penghui, Guo, Song
Selecting proper clients to participate in the iterative federated learning (FL) rounds is critical to effectively harness a broad range of distributed datasets. Existing client selection methods consider only the variability among FL clients with uni-modal data and have yet to consider clients with multiple modalities. We reveal that traditional client selection schemes in MFL may suffer from a severe modality-level bias, which impedes the collaborative exploitation of multi-modal data, leading to insufficient local data exploration and global aggregation. To tackle this challenge, we propose a Client-wise Modality Selection scheme for MFL (CMSFed) that can comprehensively utilize information from each modality by avoiding the client selection bias caused by modality imbalance. Specifically, in each MFL round, the local data from different modalities are selectively employed to participate in local training and aggregation to mitigate potential modality imbalance of the global model. To approximate the fully aggregated model update in a balanced way, we introduce a novel local training loss function that simultaneously enhances the weak modality and aligns the divergent feature spaces caused by inconsistent modality adoption strategies across clients. Then, a modality-level gradient decoupling method is designed to derive respective submodular functions to maintain the gradient diversity during the selection process and balance MFL according to local modality imbalance in each iteration. Our extensive experiments showcase the superiority of CMSFed over baselines and its effectiveness in multi-modal data exploitation.
A large-scale particle system with independent jumps and distributed synchronization
Baryshnikov, Yuliy, Stolyar, Alexander
We study a system consisting of $n$ particles, moving forward in jumps on the real line. Each particle can make either independent jumps, whose sizes have some distribution, or ``synchronization'' jumps, which allow it to join a randomly chosen other particle if the latter happens to be ahead of it. The mean-field asymptotic regime, where $n\to\infty$, is considered. As $n\to\infty$, we prove the convergence of the system dynamics to that of a deterministic mean-field limit (MFL). We obtain results on the average speed of advance of a ``benchmark'' MFL (BMFL) and the liminf of the steady-state speed of advance, in terms of MFLs that are traveling waves. For the special case of exponentially distributed independent jump sizes, we prove that a traveling wave MFL with speed $v$ exists if and only if $v\ge v_*$, with $v_*$ having a simple explicit form; this allows us to show that the average speed of the BMFL is equal to $v_*$ and the liminf of the steady-state speeds is lower bounded by $v_*$. Finally, we put forward a conjecture that both the average speed of the BMFL and the exact limit of the steady-state speeds, under a general distribution of the independent jump size, are equal to a number $v_{**}$, which is easily found from a ``minimum speed principle.'' This general conjecture is consistent with our results for the exponentially distributed jumps and is confirmed by simulations.
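The finite-$n$ dynamics described in this abstract can be illustrated with a small event-driven simulation. This is only a sketch under stated assumptions: the unit rate of independent jumps, the `sync_rate` parameter, the exponential jump sizes with mean `mean_jump`, and all function and parameter names are hypothetical choices made here, not details given in the abstract.

```python
import random


def simulate(n=50, t_end=20.0, sync_rate=1.0, mean_jump=1.0, seed=0):
    """Simulate n particles on the real line up to time t_end.

    Assumed (not from the abstract): each particle makes independent
    jumps at rate 1 and synchronization attempts at rate sync_rate.
    """
    rng = random.Random(seed)
    x = [0.0] * n                      # all particles start at the origin
    t = 0.0
    total_rate = n * (1.0 + sync_rate)  # rate of the next event of any kind
    p_independent = 1.0 / (1.0 + sync_rate)
    while True:
        t += rng.expovariate(total_rate)
        if t > t_end:
            break
        i = rng.randrange(n)           # particle that acts at this event
        if rng.random() < p_independent:
            # Independent jump: exponentially distributed forward step.
            x[i] += rng.expovariate(1.0 / mean_jump)
        elif n > 1:
            # Synchronization jump: pick a different particle uniformly
            # and join it only if it is ahead.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            if x[j] > x[i]:
                x[i] = x[j]
    return x


if __name__ == "__main__":
    positions = simulate()
    print("empirical average speed:", sum(positions) / len(positions) / 20.0)
```

Dividing the mean position by the elapsed time gives a crude estimate of the average speed of advance for one finite system; it is not a substitute for the mean-field analysis the abstract refers to.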
Metrizing Fairness
Rychener, Yves, Taskesen, Bahar, Kuhn, Daniel
We study supervised learning problems for predicting properties of individuals who belong to one of two demographic groups, and we seek predictors that are fair according to statistical parity. This means that the distributions of the predictions within the two groups should be close with respect to the Kolmogorov distance, and fairness is achieved by penalizing the dissimilarity of these two distributions in the objective function of the learning problem. In this paper, we showcase conceptual and computational benefits of measuring unfairness with integral probability metrics (IPMs) other than the Kolmogorov distance. Conceptually, we show that the generator of any IPM can be interpreted as a family of utility functions and that unfairness with respect to this IPM arises if individuals in the two demographic groups have diverging expected utilities. We also prove that the unfairness-regularized prediction loss admits unbiased gradient estimators if unfairness is measured by the squared $\mathcal L^2$-distance or by a squared maximum mean discrepancy. In this case, the fair learning problem is susceptible to efficient stochastic gradient descent (SGD) algorithms. Numerical experiments on real data show that these SGD algorithms outperform state-of-the-art methods for fair learning in that they achieve superior accuracy-unfairness trade-offs -- sometimes orders of magnitude faster. Finally, we identify conditions under which statistical parity can improve prediction accuracy.
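To make the two unfairness measures mentioned in this abstract concrete, the sketch below computes the Kolmogorov distance between the empirical prediction distributions of two groups, and the standard unbiased U-statistic estimate of a squared maximum mean discrepancy. The Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions for illustration; the abstract does not state which kernel or estimator the authors use.

```python
import math


def kolmogorov_distance(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    def cdf(sample, t):
        return sum(v <= t for v in sample) / len(sample)
    # The supremum is attained at one of the observed sample points.
    points = sorted(set(a) | set(b))
    return max(abs(cdf(a, t) - cdf(b, t)) for t in points)


def gaussian_kernel(u, v, sigma=1.0):
    return math.exp(-((u - v) ** 2) / (2.0 * sigma ** 2))


def mmd2_unbiased(a, b, sigma=1.0):
    """Unbiased estimate of squared MMD; needs at least 2 points per group.

    Diagonal terms are excluded from the within-group sums, so the
    estimate can be slightly negative when the groups are similar.
    """
    m, n = len(a), len(b)
    within_a = sum(gaussian_kernel(a[i], a[j], sigma)
                   for i in range(m) for j in range(m) if i != j)
    within_b = sum(gaussian_kernel(b[i], b[j], sigma)
                   for i in range(n) for j in range(n) if i != j)
    cross = sum(gaussian_kernel(u, v, sigma) for u in a for v in b)
    return (within_a / (m * (m - 1))
            + within_b / (n * (n - 1))
            - 2.0 * cross / (m * n))


if __name__ == "__main__":
    group_0 = [0.10, 0.20, 0.25, 0.40]   # hypothetical predictions
    group_1 = [0.60, 0.70, 0.85, 0.90]
    print(kolmogorov_distance(group_0, group_1))
    print(mmd2_unbiased(group_0, group_1, sigma=0.5))
```

The squared MMD estimate is an average over pairs of samples, which is what makes mini-batch estimates of it (and of its gradient) unbiased; the Kolmogorov distance, being a supremum, has no such decomposition.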
Accelerating Federated Learning via Momentum Gradient Descent
Liu, Wei, Chen, Li, Chen, Yunfei, Zhang, Wenyi
Federated learning (FL) provides a communication-efficient approach to solve machine learning problems concerning distributed data, without sending raw data to a central server. However, existing works on FL only utilize first-order gradient descent (GD) and do not take preceding iterations into account in the gradient update, which can potentially accelerate convergence. In this paper, we consider a momentum term that relates to the last iteration. The proposed momentum federated learning (MFL) uses momentum gradient descent (MGD) in the local update step of the FL system. We establish global convergence properties of MFL and derive an upper bound on the MFL convergence rate. Comparing the upper bounds on the MFL and FL convergence rates, we provide conditions under which MFL accelerates the convergence. For different machine learning models, the convergence performance of MFL is evaluated based on experiments with the MNIST dataset. Simulation results confirm that MFL is globally convergent and further reveal significant convergence improvement over FL.
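A minimal sketch of momentum gradient descent in the local update step of federated learning follows, using a toy scalar problem. Several things here are assumptions for illustration rather than details from the abstract: each client holds a quadratic loss 0.5 * (w - c)^2, the server averages both the weights and the momentum terms with equal client weights, and the step size `eta`, momentum factor `gamma`, and function names are hypothetical.

```python
def local_mgd(w, d, grad_fn, eta, gamma, steps):
    """Run `steps` momentum gradient descent updates on one client."""
    for _ in range(steps):
        d = gamma * d + grad_fn(w)   # momentum term keeps the last direction
        w = w - eta * d              # parameter update along the momentum
    return w, d


def momentum_fl(centers, rounds=50, local_steps=5, eta=0.1, gamma=0.5):
    """Toy momentum FL: client i minimizes 0.5 * (w - centers[i]) ** 2."""
    w, d = 0.0, 0.0
    for _ in range(rounds):
        results = [
            # Default argument c=c binds each client's own center.
            local_mgd(w, d, lambda v, c=c: v - c, eta, gamma, local_steps)
            for c in centers
        ]
        # Assumed aggregation: plain average of weights and momentum terms.
        w = sum(r[0] for r in results) / len(results)
        d = sum(r[1] for r in results) / len(results)
    return w


if __name__ == "__main__":
    # The global minimizer of the averaged loss is the mean of the centers.
    print(momentum_fl([1.0, 2.0, 6.0]))
```

Setting `gamma=0.0` reduces the local update to plain gradient descent, which gives a simple baseline for comparing convergence on this toy problem.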
'Mutant Football League' Surpasses Kickstarter Goal On Super Bowl LI Sunday
The NFL did not just succeed in uniting football fans all over the country for its Super Bowl LI event and halftime show this Sunday, it may have also inspired more players to back Digital Dreams Entertainment's Kickstarter campaign for "Mutant Football League," a sporty video game that is more of a pandemonium than a sporting competition. Apparently, in time with Sunday's "Big Game," the crowdfunding project achieved its financial goal to green light "MFL's" release on PC, Xbox One and PS4. On Sunday, while everyone in America was glued to their television screens for Super Bowl LI, "Mutant Football League" creator Michael Mendheim delivered the news about the milestone on the game's Kickstarter page. "'MFL' makes goal on Super Bowl Sunday!!! We did it!!!!" Mendheim wrote before thanking Kickstarter and the MFL community or the backers that are giving the "wildest, goriest, most outrageous football game ever" a chance.