Can Safety Fine-Tuning Be More Principled? Lessons Learned from Cybersecurity
