A Granular Study of Safety Pretraining under Model Abliteration
