LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning
