critical parameter
- North America > United States (0.14)
- Asia > China (0.04)
- Information Technology > Security & Privacy (1.00)
- Government (0.67)
FedPURIN: Programmed Update and Reduced INformation for Sparse Personalized Federated Learning
Xie, Lunchen, He, Zehua, Shi, Qingjiang
Personalized Federated Learning (PFL) has emerged as a critical research frontier addressing data heterogeneity issue across distributed clients. Novel model architectures and collaboration mechanisms are engineered to accommodate statistical disparities while producing client-specific models. Parameter decoupling represents a promising paradigm for maintaining model performance in PFL frameworks. However, the communication efficiency of many existing methods remains suboptimal, sustaining substantial communication burdens that impede practical deployment. To bridge this gap, we propose Federated Learning with Programmed Update and Reduced INformation (FedPURIN), a novel framework that strategically identifies critical parameters for transmission through an integer programming formulation. This mathematically grounded strategy is seamlessly integrated into a sparse aggregation scheme, achieving a significant communication reduction while preserving the efficacy. Comprehensive evaluations on standard image classification benchmarks under varied non-IID conditions demonstrate competitive performance relative to state-of-the-art methods, coupled with quantifiable communication reduction through sparse aggregation. The framework establishes a new paradigm for communication-efficient PFL, particularly advantageous for edge intelligence systems operating with heterogeneous data sources. Introduction Federated learning (FL), as a powerful distributed machine learning scheme, has been well studied to handle the growing trend towards harnessing abundant data on ubiquitous edge devices [1]. This framework has been successfully applied in various domains, including computer vision [2, 3], healthcare [4, 5], finance [6, 7], and ubiquitous IoT applications [8, 9, 10].
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States > California > San Mateo County > San Mateo (0.04)
DataStealing: Steal Data from Diffusion Models in Federated Learning with Multiple Trojans
Parameters (AdaSCP) attack to circumvent the defenses and seamlessly incorporate malicious updates into the global model. Specifically, AdaSCP evaluates the importance of parameters with the gradients in dominant timesteps of the diffusion model. Subsequently, it adaptively seeks the optimal scale factor and magnifies critical parameter updates before uploading to the server. As a result, the malicious update becomes similar to the benign update, making it difficult for distance-based defenses to identify. Extensive experiments reveal the risk of leaking thousands of images in training diffusion models with FL.
- North America > United States (0.14)
- Asia > China (0.04)
Optimizing Force Signals from Human Demonstrations of In-Contact Motions
Hartwig, Johannes, Viessmann, Fabian, Henrich, Dominik
For non-robot-programming experts, kinesthetic guiding can be an intuitive input method, as robot programming of in-contact tasks is becoming more prominent. However, imprecise and noisy input signals from human demonstrations pose problems when reproducing motions directly or using the signal as input for machine learning methods. This paper explores optimizing force signals to correspond better to the human intention of the demonstrated signal. We compare different signal filtering methods and propose a peak detection method for dealing with first-contact deviations in the signal. The evaluation of these methods considers a specialized error criterion between the input and the human-intended signal. In addition, we analyze the critical parameters' influence on the filtering methods. The quality for an individual motion could be increased by up to \SI{20}{\percent} concerning the error criterion. The proposed contribution can improve the usability of robot programming and the interaction between humans and robots.
MaZO: Masked Zeroth-Order Optimization for Multi-Task Fine-Tuning of Large Language Models
Zhang, Zhen, Yang, Yifan, Zhen, Kai, Susanj, Nathan, Mouchtaris, Athanasios, Kunzmann, Siegfried, Zhang, Zheng
Large language models have demonstrated exceptional capabilities across diverse tasks, but their fine-tuning demands significant memory, posing challenges for resource-constrained environments. Zeroth-order (ZO) optimization provides a memory-efficient alternative by eliminating the need for backpropagation. However, ZO optimization suffers from high gradient variance, and prior research has largely focused on single-task learning, leaving its application to multi-task learning unexplored. Multi-task learning is crucial for leveraging shared knowledge across tasks to improve generalization, yet it introduces unique challenges under ZO settings, such as amplified gradient variance and collinearity. In this paper, we present MaZO, the first framework specifically designed for multi-task LLM fine-tuning under ZO optimization. MaZO tackles these challenges at the parameter level through two key innovations: a weight importance metric to identify critical parameters and a multi-task weight update mask to selectively update these parameters, reducing the dimensionality of the parameter space and mitigating task conflicts. Experiments demonstrate that MaZO achieves state-of-the-art performance, surpassing even multi-task learning methods designed for first-order optimization.
- North America > United States > New York (0.04)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- (2 more...)
No Data, No Optimization: A Lightweight Method To Disrupt Neural Networks With Sign-Flips
Galil, Ido, Kimhi, Moshe, El-Yaniv, Ran
Deep neural networks (DNNs) power a wide range of applications, including safety-critical tasks such as autonomous driving, unmanned aerial vehicle (UAV) navigation, medical diagnostics, and robotics, where real-time decision-making is essential. However, the increasing reliance on DNNs also raises concerns about their resilience to malicious attacks. Ensuring the robustness of DNNs is crucial to maintaining their reliability in such critical applications. In this paper, we expose a critical vulnerability in DNNs that allows for severe disruption by flipping as few as one to ten sign bits, a tiny fraction of the model's parameters. Our method demonstrates how a small number of bit flips, within models containing up to hundred millions of parameters, can cause catastrophic degradation in performance. We systematically analyze and identify the parameters most susceptible to sign flips, which we term "critical parameters."
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (3 more...)
AttentionBreaker: Adaptive Evolutionary Optimization for Unmasking Vulnerabilities in LLMs through Bit-Flip Attacks
Das, Sanjay, Bhattacharya, Swastik, Kundu, Souvik, Kundu, Shamik, Menon, Anand, Raha, Arnab, Basu, Kanad
Large Language Models (LLMs) have revolutionized natural language processing (NLP), excelling in tasks like text generation and summarization. However, their increasing adoption in mission-critical applications raises concerns about hardware-based threats, particularly bit-flip attacks (BFAs). BFAs, enabled by fault injection methods such as Rowhammer, target model parameters in memory, compromising both integrity and performance. Identifying critical parameters for BFAs in the vast parameter space of LLMs poses significant challenges. While prior research suggests transformer-based architectures are inherently more robust to BFAs compared to traditional deep neural networks, we challenge this assumption. For the first time, we demonstrate that as few as three bit-flips can cause catastrophic performance degradation in an LLM with billions of parameters. Current BFA techniques are inadequate for exploiting this vulnerability due to the difficulty of efficiently identifying critical parameters within the immense parameter space. To address this, we propose AttentionBreaker, a novel framework tailored for LLMs that enables efficient traversal of the parameter space to identify critical parameters. Additionally, we introduce GenBFA, an evolutionary optimization strategy designed to refine the search further, isolating the most critical bits for an efficient and effective attack. Empirical results reveal the profound vulnerability of LLMs to AttentionBreaker. For example, merely three bit-flips (4.129 x 10^-9% of total parameters) in the LLaMA3-8B-Instruct 8-bit quantized (W8) model result in a complete performance collapse: accuracy on MMLU tasks drops from 67.3% to 0%, and Wikitext perplexity skyrockets from 12.6 to 4.72 x 10^5. These findings underscore the effectiveness of AttentionBreaker in uncovering and exploiting critical vulnerabilities within LLM architectures.
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- North America > United States > Texas (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
Choosing the parameter of the Fermat distance: navigating geometry and noise
Chazal, Frédéric, Ferraris, Laure, Groisman, Pablo, Jonckheere, Matthieu, Pascal, Frédéric, Sapienza, Facundo
The Fermat distance has been recently established as a useful tool for machine learning tasks when a natural distance is not directly available to the practitioner or to improve the results given by Euclidean distances by exploding the geometrical and statistical properties of the dataset. This distance depends on a parameter $\alpha$ that greatly impacts the performance of subsequent tasks. Ideally, the value of $\alpha$ should be large enough to navigate the geometric intricacies inherent to the problem. At the same, it should remain restrained enough to sidestep any deleterious ramifications stemming from noise during the process of distance estimation. We study both theoretically and through simulations how to select this parameter.
- North America > United States > California > Alameda County > Berkeley (0.14)
- South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- (2 more...)
Safety Evaluation of Robot Systems via Uncertainty Quantification
Baek, Woo-Jeong, Kröger, Torsten
In this paper, we present an approach for quantifying the propagated uncertainty of robot systems in an online and data-driven manner. Especially in Human-Robot Collaboration, keeping track of the safety compliance during run time is essential: Misclassifying dangerous situations as safe might result in severe accidents. According to official regulations (eg, ISO standards), safety in industrial robot applications depends on critical parameters, such as the distance and relative velocity between humans and robots. However, safety can only be assured given a measure for the reliability of these parameters. While different risk detection and mitigation approaches exist in literature, a measure that can be used to evaluate safety limits online, and succinctly implies whether a situation is safe or dangerous, is missing to date. Motivated by this, we introduce a generalizable method for calculating the propagated measurement uncertainty of arbitrary parameters, that captures the accumulated uncertainty originating from sensory devices and environmental disturbances of the system. To show that our approach delivers correct results, we perform validation experiments in simulation. In addition, we employ our method in two real-world settings and demonstrate how quantifying the propagated uncertainty of critical parameters facilitates assessing safety online in Human-Robot Collaboration.
- Europe > Spain > Galicia > Madrid (0.04)
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
ECSAS: Exploring Critical Scenarios from Action Sequence in Autonomous Driving
Kang, Shuting, Guo, Heng, Zhang, Lijun, Liu, Guangzhen, Xue, Yunzhi, Wu, Yanjun
Critical scenario generation requires the ability of sampling critical combinations from the infinite parameter space in the logic scenario. Existing solutions aim to explore the correlation of action parameters in the initial scenario rather than action sequences. How to model action sequences so that one can further consider the effects of different action parameters in the scenario is the bottleneck of the problem. In this paper, we attack the problem by proposing the ECSAS framework. Specifically, we first propose a description language, BTScenario, allowing us to model action sequences of the scenarios. We then use reinforcement learning to search for combinations of critical action parameters. To increase efficiency, we further propose several optimizations, including action masking and replay buffer. We have implemented ECSAS, and experimental results show that it is more efficient than native approaches such as random and combination testing in various nontrivial scenarios.
- Transportation > Ground > Road (0.85)
- Automobiles & Trucks (0.85)
- Information Technology > Robotics & Automation (0.66)