Watermarking Degrades Alignment in Language Models: Analysis and Mitigation
