Distortion-free Watermarks are not Truly Distortion-free under Watermark Key Collisions

Wu, Yihan, Chen, Ruibo, Hu, Zhengmian, Chen, Yanshuo, Guo, Junfeng, Zhang, Hongyang, Huang, Heng

Jun-2-2024–arXiv.org Artificial Intelligence

Language model (LM) watermarking techniques inject a statistical signal into LM-generated content by replacing the random sampling process with pseudo-random sampling that uses watermark keys as the random seed. Among these statistical watermarking approaches, distortion-free watermarks are particularly important because they embed watermarks into LM-generated content without compromising generation quality. However, one notable limitation of pseudo-random sampling compared with true random sampling is that, when the same watermark key is reused (i.e., a key collision), the results of pseudo-random sampling are correlated. This limitation could undermine the distortion-free property. Our studies reveal that key collisions are inevitable because the number of available watermark keys is limited, and that existing distortion-free watermarks exhibit a significant distribution bias relative to the original LM distribution in the presence of key collisions. Moreover, achieving a perfectly distortion-free watermark is impossible, since no statistical signal can be embedded while remaining distortion-free under key collisions. To reduce the distribution bias caused by key collisions, we introduce a new family of distortion-free watermarks, the beta-watermark. Experimental results support that the beta-watermark can effectively reduce the distribution bias under key collisions.
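To make the key-collision effect concrete, here is a minimal sketch of keyed pseudo-random sampling. It is an illustration only, not the paper's method: it assumes a Gumbel-max sampler as the stand-in distortion-free scheme, and the helper names (`keyed_rng`, `gumbel_max_sample`), the SHA-256 seeding, and the toy distribution `probs` are all hypothetical choices made for this example.

```python
import hashlib

import numpy as np


def keyed_rng(key: int) -> np.random.Generator:
    # Hypothetical seeding scheme: hash the watermark key into a 64-bit seed.
    digest = hashlib.sha256(str(key).encode()).digest()
    return np.random.default_rng(int.from_bytes(digest[:8], "big"))


def gumbel_max_sample(probs: np.ndarray, key: int) -> int:
    # Gumbel-max trick: argmax(log p + G) follows p when G is i.i.d. Gumbel(0, 1).
    # The noise G depends only on the key, so the draw is pseudo-random.
    noise = keyed_rng(key).gumbel(size=probs.shape)
    return int(np.argmax(np.log(probs) + noise))


# Toy next-token distribution (assumed for illustration).
probs = np.array([0.5, 0.3, 0.2])

# Distinct keys: the empirical token frequencies stay close to probs,
# which is the distortion-free behaviour for a single use of each key.
draws = [gumbel_max_sample(probs, key) for key in range(20000)]
freq = np.bincount(draws, minlength=len(probs)) / len(draws)

# Key collision: reusing one key reproduces the same token every time,
# so repeated outputs are perfectly correlated instead of independent.
collided = {gumbel_max_sample(probs, 42) for _ in range(100)}

print("frequencies with distinct keys:", freq)
print("distinct tokens under one reused key:", len(collided))
```

With distinct keys the sampler is indistinguishable from sampling `probs` directly, whereas under a reused key the outputs collapse onto a single token. This is the kind of correlation the abstract refers to; how large the resulting distribution bias is for real watermarking schemes is what the paper measures.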



