CARE: Decoding-Time Safety Alignment via Rollback and Introspection Intervention
