Towards Understanding Sycophancy in Language Models
