Is Multilingual LLM Watermarking Truly Multilingual? A Simple Back-Translation Solution
arXiv.org Artificial Intelligence
Multilingual watermarking aims to make large language model (LLM) outputs traceable across languages, yet current methods still fall short. Despite claims of cross-lingual robustness, they are evaluated only on high-resource languages. We show that existing multilingual watermarking methods are not truly multilingual: they fail to remain robust under translation attacks in medium- and low-resource languages. We trace this failure to semantic clustering, which breaks down when the tokenizer vocabulary contains too few full-word tokens for a given language. To address this, we introduce STEAM, a back-translation-based detection method that restores watermark strength lost through translation. STEAM is compatible with any watermarking method, robust across different tokenizers and languages, non-invasive, and easily extendable to new languages. With average gains of +0.19 AUC and +40%p TPR@1% across 17 languages, STEAM provides a simple and robust path toward fairer watermarking across diverse languages.
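The core idea of back-translation-based detection can be sketched in a few lines: translate the suspect text back into the watermark's source language before scoring, and keep the stronger of the two detection scores. The sketch below is illustrative only, assuming a simple hash-based green-list watermark scorer and a caller-supplied `back_translate` function; none of these names come from the paper's actual implementation.

```python
# Hedged sketch of back-translation-aided watermark detection.
# Assumptions (not from the paper): a green-list watermark scored as the
# fraction of tokens hashing into the "green" half of the vocabulary, and a
# hypothetical back_translate() callable supplied by the user.
import hashlib


def green_fraction(tokens, key="wm-key"):
    """Fraction of tokens whose keyed hash lands in the 'green' partition."""
    if not tokens:
        return 0.0
    green = sum(
        1 for t in tokens
        if int(hashlib.sha256((key + t).encode()).hexdigest(), 16) % 2 == 0
    )
    return green / len(tokens)


def detect_with_backtranslation(text, back_translate=None, key="wm-key"):
    """Score the text as-is and, if a translator is given, score its
    back-translation too; report the stronger watermark signal."""
    scores = [green_fraction(text.split(), key)]
    if back_translate is not None:
        scores.append(green_fraction(back_translate(text).split(), key))
    return max(scores)
```

In practice `back_translate` would wrap a machine-translation model targeting the language the watermark was embedded in; taking the maximum over both views means the method never scores worse than plain detection on the translated text.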
Oct-22-2025