Optimizing Adaptive Attacks against Content Watermarks for Language Models
