
Bit Flip


Hammering the Diagnosis: Rowhammer-Induced Stealthy Trojan Attacks on ViT-Based Medical Imaging

Latibari, Banafsheh Saber, Nazari, Najmeh, Sayadi, Hossein, Homayoun, Houman, Mahalanobis, Abhijit

arXiv.org Artificial Intelligence

Vision Transformers (ViTs) have emerged as powerful architectures in medical image analysis, excelling in tasks such as disease detection, segmentation, and classification. However, their reliance on large, attention-driven models makes them vulnerable to hardware-level attacks. In this paper, we propose a novel threat model, referred to as Med-Hammer, that combines Rowhammer hardware fault injection with neural Trojan attacks to compromise the integrity of ViT-based medical imaging systems. Specifically, we demonstrate how malicious bit flips induced via Rowhammer can trigger implanted neural Trojans, leading to targeted misclassification or suppression of critical diagnoses (e.g., tumors or lesions) in medical scans. Through extensive experiments on benchmark medical imaging datasets such as ISIC, Brain Tumor, and MedMNIST, we show that such attacks can remain stealthy while achieving high attack success rates of about 82.51% and 92.56% on MobileViT and Swin Transformer, respectively. We further investigate how architectural properties, such as model sparsity, attention weight distribution, and the number of features per layer, impact attack effectiveness. Our findings highlight a critical and underexplored intersection between hardware-level faults and deep learning security in healthcare applications, underscoring the urgent need for robust defenses spanning both model architectures and underlying hardware platforms. In clinical practice, medical imaging plays a central role in detecting, diagnosing, and monitoring a wide range of conditions.
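
To make the threat concrete, the sketch below (our illustration, not the Med-Hammer pipeline itself) simulates a Rowhammer-style fault in software by toggling a single bit of a float32 weight through an integer view; flips in high exponent bits are the ones that typically perturb a value most. The function name and indices are illustrative assumptions.

```python
import numpy as np

def flip_bit(weights: np.ndarray, flat_index: int, bit: int) -> np.ndarray:
    """Return a copy of `weights` with one bit flipped in one float32 value."""
    flipped = weights.astype(np.float32)          # astype copies the data
    bits = flipped.reshape(-1).view(np.uint32)    # reinterpret the raw bits
    bits[flat_index] ^= np.uint32(1 << bit)       # XOR toggles exactly one bit
    return flipped

w = np.random.randn(8).astype(np.float32)
w_faulty = flip_bit(w, flat_index=3, bit=30)      # bit 30 sits in the exponent
print(w[3], "->", w_faulty[3])                    # one flip can explode the value
```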


1091660f3dff84fd648efe31391c5524-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their insightful comments; your recognition of our work is much appreciated. The longer data remain in memory, the more bit flips they will suffer, which easily results in a high fault rate. With that said, we consider extending our protection approaches to lower bit widths as future work.


SBFA: Single Sneaky Bit Flip Attack to Break Large Language Models

Guo, Jingkai, Chakrabarti, Chaitali, Fan, Deliang

arXiv.org Artificial Intelligence

Model integrity of large language models (LLMs) has become a pressing security concern with their massive online deployment. Prior Bit-Flip Attacks (BFAs)--a class of popular AI weight memory fault-injection techniques--can severely compromise Deep Neural Networks (DNNs): as few as tens of bit flips can degrade accuracy toward random guessing. Recent studies extend BFAs to LLMs and reveal that, despite the intuition of better robustness from modularity and redundancy, only a handful of adversarial bit flips can also cause catastrophic accuracy degradation in LLMs. However, existing BFA methods typically focus on either integer or floating-point models separately, limiting attack flexibility. Moreover, in floating-point models, random bit flips often push perturbed parameters to extreme values (e.g., when an exponent bit is flipped), making the attack conspicuous and leading to numerical runtime errors (e.g., invalid tensor values such as NaN/Inf). In this work, for the first time, we propose SBFA (Sneaky Bit-Flip Attack), which collapses LLM performance with only a single bit flip while keeping perturbed values within the benign layer-wise weight distribution. This is achieved by iteratively searching and ranking parameters with our sensitivity metric, ImpactScore, which combines gradient sensitivity with a perturbation range constrained by the benign layer-wise weight distribution. A novel lightweight SKIP searching algorithm is also proposed to greatly reduce search complexity, so that a successful SBFA search takes only tens of minutes on SOTA LLMs. Across Qwen, LLaMA, and Gemma models, with only a single bit flip, SBFA successfully degrades accuracy to below random levels on MMLU and SST-2 in both BF16 and INT8 data formats. Remarkably, flipping a single bit out of billions of parameters reveals a severe security concern for SOTA LLMs.
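
The following is a minimal sketch of an ImpactScore-style ranking as the abstract describes it: combine gradient sensitivity with the size of the candidate perturbation, and discard candidates that leave the benign layer-wise weight range. The exact scoring formula and all names are our assumptions, not the paper's definition.

```python
import numpy as np

def impact_scores(weights, grads, candidates):
    """Rank candidate single-weight perturbations; reject out-of-range values."""
    lo, hi = weights.min(), weights.max()            # benign layer-wise range
    scores = np.abs(grads * (candidates - weights))  # first-order loss impact
    scores[(candidates < lo) | (candidates > hi)] = -np.inf  # keep it sneaky
    return scores

w = np.random.randn(16).astype(np.float32)
g = np.random.randn(16).astype(np.float32)           # loss gradient w.r.t. w
cand = w + np.random.uniform(-0.5, 0.5, 16).astype(np.float32)
print("best stealthy flip at index", int(np.argmax(impact_scores(w, g, cand))))
```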


NAPER: Fault Protection for Real-Time Resource-Constrained Deep Neural Networks

Rajagede, Rian Adam, Santriaji, Muhammad Husni, Fikriansyah, Muhammad Arya, Nuha, Hilal Hudan, Fu, Yanjie, Solihin, Yan

arXiv.org Artificial Intelligence

Fault tolerance in Deep Neural Networks (DNNs) deployed on resource-constrained systems presents unique challenges for high-accuracy applications with strict timing requirements. Memory bit-flips can severely degrade DNN accuracy, while traditional protection approaches like Triple Modular Redundancy (TMR) often sacrifice accuracy to maintain reliability, creating a three-way dilemma between reliability, accuracy, and timeliness. We introduce NAPER, a novel protection approach that addresses this challenge through ensemble learning. Unlike conventional redundancy methods, NAPER employs heterogeneous model redundancy, where diverse models collectively achieve higher accuracy than any individual model. This is complemented by an efficient fault detection mechanism and a real-time scheduler that prioritizes meeting deadlines by intelligently scheduling recovery operations without interrupting inference. Our evaluations demonstrate NAPER's superiority: 40% faster inference in both normal and fault conditions, accuracy maintained 4.2% higher than TMR-based strategies, and guaranteed uninterrupted operation even during fault recovery. NAPER effectively balances the competing demands of accuracy, reliability, and timeliness in real-time DNN applications. Fault tolerance in real-time systems with limited computational resources, or resource-constrained systems, presents significant integration challenges. These systems have finite computational capabilities that cannot be easily expanded to accommodate redundancy and recovery without substantial trade-offs.
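
A minimal sketch of the two ingredients the abstract names, heterogeneous model redundancy plus a lightweight fault check, is shown below; the weight-sum checksum and the tiny linear models are illustrative stand-ins for NAPER's actual detector and scheduler, not its implementation.

```python
import numpy as np

def make_linear_model(in_dim, n_classes, seed):
    rng = np.random.default_rng(seed)                # diversity via different seeds
    W = rng.standard_normal((in_dim, n_classes)).astype(np.float32)
    return {"W": W, "checksum": float(W.sum())}      # reference for fault detection

def ensemble_predict(models, x):
    # Average logits only over models whose weights still match their checksum.
    healthy = [m for m in models if np.isclose(m["W"].sum(), m["checksum"])]
    logits = np.mean([x @ m["W"] for m in healthy], axis=0)
    return int(np.argmax(logits))

models = [make_linear_model(10, 3, seed) for seed in range(3)]
x = np.random.randn(10).astype(np.float32)
models[1]["W"][0, 0] += 1e6                          # simulate a corrupting bit flip
print(ensemble_predict(models, x))                   # model 1 is detected and skipped
```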


On Jailbreaking Quantized Language Models Through Fault Injection Attacks

Zahran, Noureldin, Tahmasivand, Ahmad, Alouani, Ihsen, Khasawneh, Khaled, Fouda, Mohammed E.

arXiv.org Artificial Intelligence

The safety alignment of Language Models (LMs) is a critical concern, yet their integrity can be challenged by direct parameter manipulation attacks, such as those potentially induced by fault injection. As LMs are increasingly deployed using low-precision quantization for efficiency, this paper investigates the efficacy of such attacks for jailbreaking aligned LMs across different quantization schemes. We propose gradient-guided attacks, including a tailored progressive bit-level search algorithm introduced herein and a comparative word-level (single weight update) attack. Our evaluation on Llama-3.2-3B, Phi-4-mini, and Llama-3-8B across FP16 (baseline) and weight-only quantization (FP8, INT8, INT4) reveals that quantization significantly influences attack success. While attacks readily achieve high success (>80% Attack Success Rate, ASR) on FP16 models within an attack budget of 25 perturbations, FP8 and INT8 models exhibit ASRs below 20% and 50%, respectively. Even when the perturbation budget is increased to 150 bit flips, FP8 models maintain an ASR below 65%, demonstrating some resilience compared to INT8 and INT4 models, which reach high ASRs. In addition, analysis of perturbation locations reveals differing architectural targets across quantization schemes, with (FP16, INT4) and (INT8, FP8) showing similar characteristics. Moreover, jailbreaks induced in FP16 models were highly transferable to subsequent FP8/INT8 quantization (<5% ASR difference), though INT4 significantly reduced the transferred ASR (an average 35% drop). These findings highlight that while common quantization schemes, particularly FP8, increase the difficulty of direct parameter-manipulation jailbreaks, vulnerabilities can still persist, especially through post-attack quantization.
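
The sketch below illustrates what a gradient-guided, greedy bit-level search over INT8 weights could look like; the candidate generation and the first-order acceptance rule are simplifying assumptions of ours, not the paper's exact algorithm.

```python
import numpy as np

def greedy_bit_search(w_int8, grads, budget):
    """Greedily pick the (weight, bit) flips with the largest first-order loss gain."""
    w = w_int8.copy()
    flips = []
    for _ in range(budget):
        best = None
        u = w.view(np.uint8)                          # work on the raw bit pattern
        for bit in range(8):
            cand = (u ^ np.uint8(1 << bit)).view(np.int8)
            # First-order estimate of how much the loss rises under each flip.
            gain = grads * (cand.astype(np.float32) - w.astype(np.float32))
            i = int(np.argmax(gain))
            if best is None or gain[i] > best[0]:
                best = (float(gain[i]), i, cand[i])
        _, i, val = best
        w[i] = val                                    # commit the best flip
        flips.append(i)
    return w, flips

w = np.random.randint(-128, 128, 256, dtype=np.int8)
g = np.random.randn(256).astype(np.float32)           # loss gradient w.r.t. weights
_, chosen = greedy_bit_search(w, g, budget=5)
print("flipped weight indices:", chosen)
```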


On the Relationship Between Robustness and Expressivity of Graph Neural Networks

Kummer, Lorenz, Gansterer, Wilfried N., Kriege, Nils M.

arXiv.org Artificial Intelligence

We investigate the vulnerability of Graph Neural Networks (GNNs) to bit-flip attacks (BFAs) by introducing an analytical framework to study the influence of architectural features, graph properties, and their interaction. The expressivity of GNNs refers to their ability to distinguish non-isomorphic graphs and depends on the encoding of node neighborhoods. We examine the vulnerability of neural multiset functions commonly used for this purpose and establish formal criteria to characterize a GNN's susceptibility to losing expressivity due to BFAs. This enables an analysis of the impact of homophily, graph structural variety, feature encoding, and activation functions on GNN robustness. We derive theoretical bounds for the number of bit flips required to degrade GNN expressivity on a dataset, identifying ReLU-activated GNNs operating on highly homophilous graphs with low-dimensional or one-hot encoded features as particularly susceptible. Empirical results using ten real-world datasets confirm the statistical significance of our key theoretical insights and provide actionable guidance for mitigating BFA risks in expressivity-critical applications.
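
A toy example of the underlying fragility (our construction, not the paper's formal framework): with one-hot features, a ReLU activation, and sum aggregation, zeroing a couple of encoder weights makes two non-isomorphic neighborhoods encode identically.

```python
import numpy as np

def encode(neighborhood, W):
    return np.maximum(neighborhood @ W, 0).sum(axis=0)   # ReLU, then sum-aggregate

W = np.eye(3, dtype=np.float32)                 # encoder for one-hot labels {a, b, c}
hood_ab = np.array([[1, 0, 0], [0, 1, 0]], dtype=np.float32)  # multiset {a, b}
hood_ac = np.array([[1, 0, 0], [0, 0, 1]], dtype=np.float32)  # multiset {a, c}
print(encode(hood_ab, W), encode(hood_ac, W))   # distinct: [1 1 0] vs [1 0 1]

W_faulty = W.copy()
W_faulty[1, 1] = W_faulty[2, 2] = 0.0           # two simulated weight-zeroing flips
print(encode(hood_ab, W_faulty), encode(hood_ac, W_faulty))  # both collapse to [1 0 0]
```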


No Data, No Optimization: A Lightweight Method To Disrupt Neural Networks With Sign-Flips

Galil, Ido, Kimhi, Moshe, El-Yaniv, Ran

arXiv.org Artificial Intelligence

Deep neural networks (DNNs) power a wide range of applications, including safety-critical tasks such as autonomous driving, unmanned aerial vehicle (UAV) navigation, medical diagnostics, and robotics, where real-time decision-making is essential. However, the increasing reliance on DNNs also raises concerns about their resilience to malicious attacks. Ensuring the robustness of DNNs is crucial to maintaining their reliability in such critical applications. In this paper, we expose a critical vulnerability in DNNs that allows for severe disruption by flipping as few as one to ten sign bits, a tiny fraction of the model's parameters. Our method demonstrates how a small number of bit flips, within models containing up to hundreds of millions of parameters, can cause catastrophic degradation in performance. We systematically analyze and identify the parameters most susceptible to sign flips, which we term "critical parameters."
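
As a hedged illustration of a sign-flip fault, the snippet below negates the k largest-magnitude float32 weights by toggling only their IEEE-754 sign bits; selecting weights by magnitude is a plausible data-free heuristic of our own, not the paper's criterion for critical parameters.

```python
import numpy as np

def flip_sign_bits(weights: np.ndarray, k: int) -> np.ndarray:
    """Negate the k largest-magnitude weights by toggling their sign bits."""
    flipped = weights.astype(np.float32)             # astype copies the data
    idx = np.argsort(-np.abs(flipped.ravel()))[:k]   # k largest-magnitude weights
    bits = flipped.reshape(-1).view(np.uint32)
    bits[idx] ^= np.uint32(0x80000000)               # sign bit only; magnitude unchanged
    return flipped

w = np.random.randn(1000).astype(np.float32)
w_attacked = flip_sign_bits(w, k=5)
print((np.sign(w) != np.sign(w_attacked)).sum())     # exactly 5 weights negated
```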


Crossfire: An Elastic Defense Framework for Graph Neural Networks Under Bit Flip Attacks

Kummer, Lorenz, Moustafa, Samir, Gansterer, Wilfried, Kriege, Nils

arXiv.org Artificial Intelligence

Bit Flip Attacks (BFAs) are a well-established class of adversarial attacks, originally developed for Convolutional Neural Networks within the computer vision domain. Most recently, these attacks have been extended to target Graph Neural Networks (GNNs), revealing significant vulnerabilities. This development naturally raises the question of how best to defend GNNs against BFAs, a challenge for which no solutions currently exist. Given the applications of GNNs in critical fields, any defense mechanism must not only maintain network performance but also verifiably restore the network to its pre-attack state, which in turn eliminates the need for costly evaluations on test data to ensure network quality. We offer the first insights into the effectiveness of existing honeypot- and hashing-based defenses against BFAs adapted from the computer vision domain to GNNs, and characterize the shortcomings of these approaches. To overcome their limitations, we propose Crossfire, a hybrid approach that exploits weight sparsity and combines hashing and honeypots with bit-level correction of out-of-distribution weight elements to restore network integrity. Crossfire is retraining-free and does not require labeled data. Averaged over 2,160 experiments on six benchmark datasets, Crossfire offers a 21.8% higher probability than its competitors of reconstructing a GNN attacked by a BFA to its pre-attack state. These experiments cover up to 55 bit flips from various attacks. Moreover, it improves post-repair prediction quality by 10.85%. Computational and storage overheads are negligible compared to the inherent complexity of even the simplest GNNs.
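
The sketch below illustrates two of Crossfire's named ingredients, hash-based tamper detection and correction of out-of-distribution weight elements; the percentile clamp is a crude stand-in for the paper's bit-level repair, and the block size and function names are our assumptions.

```python
import hashlib
import numpy as np

def block_hashes(weights, block=64):
    """Reference digests over fixed-size weight blocks, computed at deploy time."""
    flat = weights.reshape(-1)
    return [hashlib.sha256(flat[i:i + block].tobytes()).hexdigest()
            for i in range(0, flat.size, block)]

def detect_and_repair(weights, reference, block=64):
    flat = weights.reshape(-1)
    lo, hi = np.percentile(flat, [0.5, 99.5])        # estimate of the benign range
    for ref, start in zip(reference, range(0, flat.size, block)):
        chunk = flat[start:start + block]
        if hashlib.sha256(chunk.tobytes()).hexdigest() != ref:  # tampered block
            np.clip(chunk, lo, hi, out=chunk)        # pull OOD values back in range
    return weights

w = np.random.randn(512).astype(np.float32)
ref = block_hashes(w)
w[100] = 1e9                                         # simulate an exponent-bit flip
detect_and_repair(w, ref)
print(w[100])                                        # clamped back into the benign range
```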


Impactful Bit-Flip Search on Full-precision Models

Benedek, Nadav, Levy, Matan, Sharif, Mahmood

arXiv.org Artificial Intelligence

Neural networks have shown remarkable performance in various tasks, yet they remain susceptible to subtle changes in their input or model parameters. One particularly impactful vulnerability arises through the Bit-Flip Attack (BFA), where flipping a small number of critical bits in a model's parameters can severely degrade its performance. A common technique for inducing bit flips in DRAM is the Row-Hammer attack, which exploits frequent uncached memory accesses to alter data. Identifying susceptible bits can be achieved through exhaustive search or progressive layer-by-layer analysis, especially in quantized networks. In this work, we introduce Impactful Bit-Flip Search (IBS), a novel method for efficiently pinpointing and flipping critical bits in full-precision networks. Additionally, we propose a Weight-Stealth technique that strategically modifies the model's parameters in a way that maintains the float values within the original distribution, thereby bypassing simple range checks often used in tamper detection.
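
A minimal sketch of the Weight-Stealth idea as the abstract states it: accept a candidate bit flip only if the resulting float still passes a simple min/max range check over the layer. Exponent-bit flips typically fail the check, while low mantissa bits usually pass; all names here are illustrative.

```python
import numpy as np

def stealthy_flip(weights, flat_index, bit):
    """Flip one bit, but only keep it if the new value passes a range check."""
    lo, hi = float(weights.min()), float(weights.max())
    flipped = weights.astype(np.float32)              # astype copies the data
    bits = flipped.reshape(-1).view(np.uint32)
    bits[flat_index] ^= np.uint32(1 << bit)
    new_val = float(flipped.reshape(-1)[flat_index])
    return flipped if lo <= new_val <= hi else None   # None: would trip the check

w = np.random.randn(64).astype(np.float32)
for bit in (30, 20):                                  # exponent bit vs. mantissa bit
    result = stealthy_flip(w, flat_index=10, bit=bit)
    print(f"bit {bit}:", "accepted" if result is not None else "rejected")
```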


Deep-Learning-Based Channel Estimation for Distributed MIMO with 1-bit Radio-Over-Fiber Fronthaul

Bordbar, Alireza, Aabel, Lise, Häger, Christian, Fager, Christian, Durisi, Giuseppe

arXiv.org Artificial Intelligence

We consider the problem of pilot-aided, uplink channel estimation in a distributed massive multiple-input multiple-output (MIMO) architecture, in which the access points are connected to a central processing unit via fiber-optical fronthaul links carrying a two-level-quantized version of the received analog radio-frequency signal. We adapt to this architecture the deep-learning-based channel-estimation algorithm recently proposed by Nguyen et al. (2023), and explore its robustness to the additional signal distortions (beyond 1-bit quantization) introduced in the considered architecture by the automatic gain controllers (AGCs) and by the comparators. These components are used at the access points to generate the two-level analog waveform from the received signal. Via simulation results, we illustrate that the proposed channel-estimation method significantly outperforms the Bussgang linear minimum mean-square-error channel estimator and is robust against the additional impairments introduced by the AGCs and the comparators.
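
The sketch below (our simplified model, not the authors' simulator) generates the two-level waveform described in the abstract, an AGC scaling followed by per-component comparators, and shows that a naive correlation estimate from the 1-bit stream distorts both the amplitude and the phase of the channel, which is precisely the kind of impairment a smarter estimator must absorb.

```python
import numpy as np

rng = np.random.default_rng(0)
pilot = rng.choice([-1.0, 1.0], size=1024)           # known BPSK pilot sequence
h = 0.8 + 0.3j                                       # unknown channel coefficient
noise = 0.1 * (rng.standard_normal(1024) + 1j * rng.standard_normal(1024))
received = h * pilot + noise

agc_gain = 1.0 / np.sqrt(np.mean(np.abs(received) ** 2))   # idealized AGC
scaled = agc_gain * received
two_level = np.sign(scaled.real) + 1j * np.sign(scaled.imag)  # comparator output

# Crude correlation-based estimate from the 1-bit stream (a naive baseline,
# not the Bussgang LMMSE or the deep-learning estimator from the paper).
h_hat = np.mean(two_level * pilot)
print(h, h_hat)   # amplitude and phase are visibly distorted by 1-bit quantization
```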