Anchoring Refusal Direction: Mitigating Safety Risks in Tuning via Projection Constraint
