Large Language Model Unlearning

Neural Information Processing Systems 

We study how to perform unlearning, i.e., forgetting undesirable (mis)behaviors, on large language models (LLMs).