Steering Personalized Multilingual with Sparse

Jun-23-2026, 00:57:24 GMT–Neural Information Processing Systems

Watermarking LLM-generated text is critical for content attribution and misinformation prevention, yet existing methods compromise text quality and require white-box model access for logit manipulation or training, which excludes API-based models and multilingual scenarios. We propose SAEMARK, an inference-time framework for multi-bit watermarking that embeds personalized information through feature-based rejection sampling. Unlike logit-based or rewriting-based approaches, it does not modify model outputs directly and requires only black-box access, while naturally supporting multi-bit message embedding and generalizing across diverse languages and domains. We instantiate the framework using Sparse Autoencoders as deterministic feature extractors and provide a theoretical worst-case analysis relating watermark accuracy to computational budget. Experiments across 4 datasets demonstrate strong watermarking performance on English, Chinese, and code while preserving text quality. SAEMARK establishes a new paradigm for scalable, quality-preserving watermarks that work seamlessly with closed-source LLMs across languages and domains.
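
The feature-based rejection sampling idea can be illustrated with a minimal Python sketch. This is not the SAEMARK implementation: the abstract does not give the algorithm's details, so everything below is an assumption made for illustration. `feature_score` is a hash-based stand-in for a statistic derived from Sparse Autoencoder features, `toy_generate` stands in for a black-box LLM call, and `key_targets`, `embed`, and `detect` are hypothetical names. The sketch only shows the general pattern: sample several candidates, keep the one whose deterministic feature best matches a key-derived target, and later identify the key by recomputing the same feature.

```python
import hashlib
import random


def feature_score(text: str) -> float:
    """Hypothetical deterministic feature in the unit interval.

    Assumption: a real system would compute this from SAE activations;
    a hash is used here only so the sketch is self-contained.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def key_targets(key: int, n_segments: int) -> list[float]:
    """Derive one target feature value per segment from the key (assumed scheme)."""
    rng = random.Random(key)
    return [rng.random() for _ in range(n_segments)]


def toy_generate(prompt: str, rng: random.Random) -> str:
    """Stand-in for one black-box LLM sample; ignores the prompt on purpose."""
    words = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta"]
    return " ".join(rng.choice(words) for _ in range(6))


def embed(prompt: str, key: int, n_segments: int = 12, budget: int = 64,
          seed: int = 0) -> list[str]:
    """For each segment, sample `budget` candidates and keep the best match."""
    rng = random.Random(seed)
    segments = []
    for target in key_targets(key, n_segments):
        candidates = [toy_generate(prompt, rng) for _ in range(budget)]
        best = min(candidates, key=lambda c: abs(feature_score(c) - target))
        segments.append(best)
    return segments


def detect(segments: list[str], candidate_keys) -> int:
    """Return the candidate key whose targets are closest to the observed features."""
    def distance(key: int) -> float:
        targets = key_targets(key, len(segments))
        return sum(abs(feature_score(s) - t) for s, t in zip(segments, targets))
    return min(candidate_keys, key=distance)
```

In this toy setting, `detect(embed("Write a short note.", key=5), range(16))` is expected to recover key 5, because the selected segments sit much closer to that key's targets than to those of any other key. The candidates are never edited, only selected, and only sampled outputs are needed. Raising `budget` tightens the match at the cost of more generation calls, which loosely mirrors the accuracy versus computational budget trade-off the abstract mentions.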

arxiv preprint arxiv, large language model, machine learning, (20 more...)

Country:
- Asia (0.46)
- Europe > Austria (0.28)

Genre:
- Research Report > Experimental Study (1.00)
- Overview (0.67)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Natural Language
      - Large Language Model (1.00)
      - Chatbot (1.00)
    - Machine Learning
      - Performance Analysis > Accuracy (1.00)
      - Neural Networks > Deep Learning (1.00)
