The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning
