Enhancing LLM Watermark Resilience Against Both Scrubbing and Spoofing Attacks

Jun-15-2026, 19:28:39 GMT – Neural Information Processing Systems

Watermarking is widely regarded as a promising defense against the misuse of large language models (LLMs); however, existing methods are fundamentally constrained by their vulnerability to scrubbing and spoofing attacks. This vulnerability stems from an inherent trade-off governed by watermark window size: smaller windows resist scrubbing better but are easier to reverse-engineer, enabling low-cost statistics-based spoofing attacks. This work expands the trade-off boundary by introducing a novel mechanism, equivalent texture keys, where multiple tokens within a watermark window can independently support detection. Building on this redundancy, we propose a watermark scheme with Sub-vocabulary decomposed Equivalent tExture Key (SEEK). SEEK achieves a Pareto improvement, enhancing robustness to scrubbing attacks without sacrificing resistance to spoofing.
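
The window-size trade-off described above can be illustrated with a minimal sketch. It assumes a generic hash-based green-list watermark keyed on the previous `window` tokens; it is not SEEK, whose construction the abstract does not specify. The names `is_green`, `green_flags`, `z_score`, `affected_positions`, the key `demo-key`, and the green fraction `gamma` are all hypothetical.

```python
import hashlib
import math


def is_green(context, token, gamma=0.5, key=b"demo-key"):
    """Pseudo-randomly mark `token` as green, seeded by the context window.

    Assumption: a keyed SHA-256 hash stands in for the watermark's PRF.
    """
    payload = key + b"|" + ",".join(map(str, context)).encode() + b"|" + str(token).encode()
    digest = hashlib.sha256(payload).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < gamma


def green_flags(tokens, window, gamma=0.5):
    """Green/red decision for every scored position (those with a full context)."""
    return {
        i: is_green(tuple(tokens[i - window:i]), tokens[i], gamma)
        for i in range(window, len(tokens))
    }


def z_score(flags, gamma=0.5):
    """One-proportion z-statistic: how far the green count exceeds chance."""
    n = len(flags)
    hits = sum(flags.values())
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))


def affected_positions(length, edit_pos, window):
    """Scored positions whose decision can change if tokens[edit_pos] is replaced:
    the edited token itself and the next `window` tokens whose context contains it."""
    start = max(edit_pos, window)
    return list(range(start, min(length, edit_pos + window + 1)))


if __name__ == "__main__":
    tokens = list(range(100, 130))
    for w in (1, 4):
        print(w, affected_positions(len(tokens), 10, w), z_score(green_flags(tokens, w)))
```

In this toy setting, one substituted token can disturb up to `window + 1` scored positions, which is why larger windows are easier to scrub. Conversely, a window of `w` tokens over a vocabulary of size `V` yields `V**w` possible contexts, so a small window leaves far fewer contexts for an attacker to tabulate when estimating the green lists for spoofing.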

large language model, machine learning, natural language

Country:
- Asia (0.93)
- Europe (0.67)
- North America > United States
  - California (0.27)

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence
    - Natural Language
      - Large Language Model (1.00)
      - Chatbot (1.00)
    - Machine Learning
      - Performance Analysis > Accuracy (0.94)
      - Neural Networks > Deep Learning (0.68)
