Eliciting Secret Knowledge from Language Models
