SAEMark: Steering Personalized Multilingual LLM Watermarks with Sparse Autoencoders
Neural Information Processing Systems
Watermarking LLM-generated text is critical for content attribution and misinformation prevention, yet existing methods compromise text quality and require white-box model access for logit manipulation or training, which excludes API-based models and multilingual scenarios.
Jun-14-2026, 06:41:22 GMT