Extracting Unlearned Information from LLMs with Activation Steering
