Buffer layers for Test-Time Adaptation

Kim, Hyeongyu, Han, Geonhui, Hwang, Dosik

arXiv.org Artificial Intelligence

In recent advances in Test-Time Adaptation (TTA), most existing methodologies focus on updating normalization layers to adapt to the test domain. However, reliance on normalization-based adaptation presents key challenges. First, normalization layers such as Batch Normalization (BN) are highly sensitive to small batch sizes, leading to unstable and inaccurate statistics. Moreover, normalization-based adaptation is inherently constrained by the structure of the pre-trained model, as it relies on training-time statistics that may not generalize well to unseen domains. These issues limit the effectiveness of normalization-based TTA approaches, especially under significant domain shift. In this paper, we introduce a novel paradigm based on the concept of a Buffer layer, which addresses the fundamental limitations of normalization-layer updates. Unlike existing methods that modify the core parameters of the model, our approach preserves the integrity of the pre-trained backbone, inherently mitigating the risk of catastrophic forgetting during online adaptation. Through comprehensive experimentation, we demonstrate that our approach not only outperforms traditional methods in mitigating domain shift and enhancing model robustness, but also exhibits strong resilience to forgetting. Furthermore, our Buffer layer is modular and can be seamlessly integrated into nearly all existing TTA frameworks, resulting in consistent performance improvements across various architectures. These findings validate the effectiveness and versatility of the proposed solution in real-world domain adaptation scenarios. The code is available at https://github.com/hyeongyu-kim/Buffer_TTA.
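The general idea of a buffer layer, as opposed to updating normalization statistics, can be illustrated with a minimal NumPy sketch: a frozen backbone layer is followed by a small trainable residual module, and only that module is updated at test time. The adaptation loss below (shrinking feature magnitude) is a stand-in for illustration, not the paper's actual objective, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pre-trained "backbone" layer (its weights are never updated).
W_backbone = rng.standard_normal((8, 8))
W_backbone_before = W_backbone.copy()

# Buffer layer: a lightweight residual adapter inserted after the frozen
# layer; only these parameters are adapted at test time.
W_buffer = np.zeros((8, 8))  # zero-initialized, so initially an identity map

def forward(x):
    h = x @ W_backbone.T        # frozen feature extraction
    return h + h @ W_buffer.T   # buffer adds a learned correction

# One toy adaptation step on a test batch: gradient of
# L = 0.5 * ||h + h W_buffer^T||^2 with respect to W_buffer, at W_buffer = 0.
x = rng.standard_normal((4, 8))
h = x @ W_backbone.T
grad = h.T @ h / len(x)
W_buffer -= 0.01 * grad

# The backbone stays untouched: catastrophic forgetting of the
# pre-trained weights is structurally ruled out.
```

The design point the sketch makes concrete: because adaptation happens entirely in the added module, reverting to the source model is always possible by zeroing the buffer.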


Buffer is All You Need: Defending Federated Learning against Backdoor Attacks under Non-iids via Buffering

Lyu, Xingyu, Wang, Ning, Xiao, Yang, Li, Shixiong, Li, Tao, Chen, Danjue, Chen, Yimin

arXiv.org Artificial Intelligence

Federated Learning (FL) is a popular paradigm enabling clients to jointly train a global model without sharing raw data. However, FL is known to be vulnerable to backdoor attacks due to its distributed nature. Here we propose FLBuff for tackling backdoor attacks even under non-iids. The main challenge for such defenses is that non-iids bring benign and malicious updates closer, making them harder to separate. FLBuff is inspired by our insight that non-iids can be modeled as omni-directional expansion in representation space, while backdoor attacks are uni-directional. Comprehensive evaluations demonstrate that FLBuff consistently outperforms state-of-the-art defenses.
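The omni- versus uni-directional insight can be sketched with a simple statistic (illustrative only, not FLBuff's actual detection rule): benign non-iid updates spread in many directions, so their average pairwise cosine similarity is near zero, while backdoor updates cluster along one direction and score high. The synthetic updates below are assumptions for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical client updates in a representation space.
benign = rng.standard_normal((8, 16))                # omni-directional spread
backdoor_dir = rng.standard_normal(16)
backdoor = backdoor_dir + 0.1 * rng.standard_normal((4, 16))  # one direction

def mean_pairwise_cosine(U):
    """Average cosine similarity over all distinct pairs of update vectors."""
    V = U / np.linalg.norm(U, axis=1, keepdims=True)
    S = V @ V.T                       # all pairwise cosines (diag = 1)
    n = len(U)
    return (S.sum() - n) / (n * (n - 1))

# Backdoor updates point the same way -> cosine near 1;
# benign non-iid updates expand in all directions -> cosine near 0.
```

A defense can exploit this asymmetry by buffering incoming updates and flagging tightly aligned clusters before aggregation.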


Mitigating System Bias in Resource Constrained Asynchronous Federated Learning Systems

Gao, Jikun, Mavromatis, Ioannis, Li, Peizheng, Carnelli, Pietro, Khan, Aftab

arXiv.org Artificial Intelligence

Federated learning (FL) systems face performance challenges in dealing with heterogeneous devices and non-identically distributed data across clients. We propose a dynamic global model aggregation method within Asynchronous Federated Learning (AFL) deployments to address these issues. Our aggregation method scores and adjusts the weighting of client model updates based on their upload frequency to accommodate differences in device capabilities. Additionally, we immediately provide an updated global model to clients after they upload their local models, reducing idle time and improving training efficiency. We evaluate our approach within an AFL deployment consisting of 10 simulated clients with heterogeneous compute constraints and non-IID data. The simulation results, using the FashionMNIST dataset, demonstrate over 10% and 19% improvement in global model accuracy compared to the state-of-the-art methods PAPAYA and FedAsync, respectively. Our dynamic aggregation method allows reliable global model training despite limited client resources and statistical data heterogeneity. This improves robustness and scalability for real-world FL deployments.
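A frequency-aware asynchronous aggregation loop of this kind can be sketched as follows. The decay-with-upload-count weighting below is an assumed scoring function for illustration, not the paper's exact formula; client names and update values are made up.

```python
import numpy as np

def aggregate(global_model, client_update, upload_count, base_lr=0.5):
    # Down-weight clients that upload often, so fast devices
    # do not dominate the global model (assumed scoring rule).
    weight = base_lr / np.sqrt(upload_count)
    return (1 - weight) * global_model + weight * client_update

# Async loop: each upload is merged immediately, and the fresh global
# model would be returned to the uploading client, avoiding idle time.
global_model = np.zeros(4)
uploads = {"fast_client": 0, "slow_client": 0}

arrivals = [
    ("fast_client", np.ones(4)),    # fast device uploads twice
    ("fast_client", np.ones(4)),
    ("slow_client", -np.ones(4)),   # slow device uploads once
]
for client, update in arrivals:
    uploads[client] += 1
    global_model = aggregate(global_model, update, uploads[client])
```

Because the fast client's second upload is down-weighted, the slow client's single contribution still moves the global model substantially, which is the system-bias mitigation the abstract describes.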