LM-Fix: Lightweight Bit-Flip Detection and Rapid Recovery Framework for Language Models

Tahmasivand, Ahmad; Zahran, Noureldin; Al-Sayouri, Saba; Fouda, Mohammed; Khasawneh, Khaled N.

Nov-6-2025–arXiv.org Artificial Intelligence

Abstract: Bit-flip attacks threaten the reliability and security of Language Models (LMs) by altering internal parameters and compromising output integrity. Recent studies show that flipping only a few bits in model parameters can bypass safety mechanisms and jailbreak the model. Existing detection approaches for DNNs and CNNs are not suitable for LMs, because the massive number of parameters significantly increases timing and memory overhead for software-based methods and chip-area overhead for hardware-based methods. In this work, we present LM-Fix, a lightweight LM-driven detection and recovery framework that leverages the model's own capabilities to detect faults and recover from them. Our method detects bit-flips by generating a single output token from a predefined test vector and auditing the output tensor of a target layer against stored reference data. The same mechanism enables rapid recovery without reloading the entire model. Experiments across various models show that LM-Fix detects more than 94% of single-bit flips and nearly 100% of multi-bit flips, with very low computational overhead (roughly 1% to 7.7% at TVL = 200 across models). Recovery achieves more than 100× speedup compared to a full-model reload, which is critical on edge devices. LM-Fix can handle bit-flips affecting any part of the model's computation, including memory, cache, and arithmetic operations. Evaluation against recent LM-specific bit-flip attacks confirms its robustness and practical value for real-world deployment.
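The detection idea in the abstract (run a fixed test vector, then compare a target layer's output with stored reference data) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy two-layer network, the exact-equality check, and the names `flip_bit`, `layer_output`, and `is_intact` are all assumptions made for illustration, and a real LM would audit a transformer layer's output tensor rather than a small list of floats.

```python
import copy
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant) of value's float32 encoding."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


def layer_output(weights, test_vector, target_layer):
    """Toy stand-in for a model: linear layers with ReLU, evaluated
    up to and including `target_layer` (hypothetical structure)."""
    activations = list(test_vector)
    for layer in weights[: target_layer + 1]:
        activations = [
            max(0.0, sum(w * a for w, a in zip(row, activations)))
            for row in layer
        ]
    return activations


def is_intact(weights, test_vector, target_layer, reference):
    """Audit the target layer's output against the stored reference."""
    return layer_output(weights, test_vector, target_layer) == reference


# Hypothetical weights, chosen to be exactly representable in float32.
WEIGHTS = [
    [[0.5, -0.25], [0.125, 0.75]],
    [[1.0, 0.5], [-0.5, 0.25]],
]
TEST_VECTOR = [1.0, 2.0]
TARGET_LAYER = 1

# Reference data is recorded once, while the model is known to be clean.
REFERENCE = layer_output(WEIGHTS, TEST_VECTOR, TARGET_LAYER)

# Simulate a fault: flip the top exponent bit of one first-layer weight.
faulty = copy.deepcopy(WEIGHTS)
faulty[0][1][1] = flip_bit(faulty[0][1][1], 30)

print("clean model intact: ", is_intact(WEIGHTS, TEST_VECTOR, TARGET_LAYER, REFERENCE))
print("faulty model intact:", is_intact(faulty, TEST_VECTOR, TARGET_LAYER, REFERENCE))
```

In this sketch a flip is caught only if it changes the audited output for the chosen test vector; a flip whose effect is masked (for example, zeroed by the ReLU) would pass the check, which is one plausible reason a scheme of this kind would not report 100% single-bit coverage.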

large language model, machine learning, natural language

Country:
- Africa > Middle East
  - Egypt > Cairo Governorate > Cairo (0.04)
- Europe (0.04)
- North America > United States
  - Maryland (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning > Neural Networks
      - Deep Learning (0.94)
    - Natural Language > Large Language Model (0.69)
  - Security & Privacy (1.00)