Enhancing LLM Watermark Resilience Against Both Scrubbing and Spoofing Attacks
Shen, Huanming, Huang, Baizhou, Wan, Xiaojun
arXiv.org Artificial Intelligence
Watermarking is a promising defense against the misuse of large language models (LLMs), yet it remains vulnerable to scrubbing and spoofing attacks. This vulnerability stems from an inherent trade-off governed by the watermark window size: smaller windows resist scrubbing better but are easier to reverse-engineer, which enables low-cost, statistics-based spoofing attacks. This work breaks the trade-off by introducing a novel mechanism, equivalent texture keys, in which multiple tokens within a watermark window can each independently support detection. Building on this redundancy, we propose a novel watermark scheme with Sub-vocabulary decomposed Equivalent tExture Key (SEEK). It achieves a Pareto improvement, increasing resilience against scrubbing attacks without compromising robustness to spoofing. Experiments demonstrate SEEK's superiority over prior methods, yielding spoofing robustness gains of +88.2%/+92.3%/+82.0% and scrubbing robustness gains of +10.2%/+6.4%/+24.6% across diverse dataset settings.
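The scrubbing half of the window-size trade-off can be illustrated with a minimal sketch. It assumes a generic hash-window watermark in which the score of each token depends on that token and on the `window` tokens immediately before it; this is an assumption made for illustration, not the SEEK scheme itself, and the function name `affected_positions` is hypothetical. Under that assumption, replacing a single token can disturb up to `window + 1` scored positions, which suggests why smaller windows lose less evidence per edit.

```python
def affected_positions(num_tokens: int, edited_index: int, window: int) -> list[int]:
    """Positions whose watermark score may change when one token is replaced.

    Assumption (illustrative only): the score at position t depends on token t
    and on the `window` tokens immediately preceding it.
    """
    # The edited token itself is rescored, and so is every later token whose
    # preceding-context window still contains the edited position.
    last = min(num_tokens - 1, edited_index + window)
    return list(range(edited_index, last + 1))


# Compare how much evidence a single edit can disturb for several window sizes.
for window in (1, 2, 4):
    hit = affected_positions(num_tokens=20, edited_index=5, window=window)
    print(f"window={window}: {len(hit)} positions possibly disturbed")
```

The sketch only counts positions that could change; it says nothing about the spoofing side of the trade-off, where smaller windows are easier to reverse-engineer.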
Dec-9-2025
- Country:
  - Africa > Republic of the Congo
    - Brazzaville > Brazzaville (0.04)
  - Asia
  - Europe
  - North America
    - Canada (0.04)
    - United States
      - California > San Francisco County
        - San Francisco (0.14)
      - Gulf of Mexico > Central GOM (0.04)
      - Illinois > Cook County
        - Chicago (0.04)
      - Iowa (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Oklahoma > Tulsa County
        - Tulsa (0.04)
  - Oceania > New Zealand
    - North Island > Wellington Region > Wellington (0.04)
- Genre:
  - Research Report
    - Experimental Study (1.00)
    - New Finding (1.00)
- Industry:
  - Information Technology > Security & Privacy (1.00)
  - Leisure & Entertainment > Sports
    - Football (0.45)
- Technology: